Language Model
- probabilistic
- word-based
- learned
P(word1, word2…)
L = {s1, s2, …}
(vs. hand-coded logical trees)
P(w1, w2, … wn) = P(w1:n) = ∏i P(wi | w1:i-1)  (chain rule)
Markov assumption
P(wi|w1:i-1) = P(wi|wi-k:i-1)
stationarity assumption: P(wi|wi-1) = P(wj|wj-1) for all positions i, j
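A minimal Python sketch of these assumptions with a made-up toy corpus: a first-order Markov (bigram) model estimates P(wi | wi-1) from counts, and stationarity means the same count table is reused at every position.

```python
from collections import Counter

# Toy corpus (illustrative only)
corpus = "the cat sat on the mat the cat ate".split()

# Count bigrams and unigrams once; stationarity means the same
# table for P(w_i | w_{i-1}) is reused at every position i.
bigrams = Counter(zip(corpus, corpus[1:]))
unigrams = Counter(corpus)

def p_next(prev, word):
    """Maximum-likelihood estimate of P(word | prev)."""
    return bigrams[(prev, word)] / unigrams[prev]

def p_sentence(words):
    """Chain rule with the Markov assumption (k=1):
    P(w_1:n) ~= P(w_1) * prod_i P(w_i | w_{i-1})."""
    p = unigrams[words[0]] / len(corpus)
    for prev, word in zip(words, words[1:]):
        p *= p_next(prev, word)
    return p
```

Note the maximum-likelihood estimate assigns probability zero to any unseen bigram, which is exactly the problem smoothing addresses.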
smoothing (reserve probability mass for unseen n-grams)
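A sketch of add-one (Laplace) smoothing over a toy bigram table (corpus and counts are illustrative only): unseen bigrams get a small non-zero probability instead of zero.

```python
from collections import Counter

corpus = "the cat sat on the mat".split()
vocab = set(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))
unigrams = Counter(corpus)

def p_smoothed(prev, word):
    """Add-one (Laplace) smoothed P(word | prev): add 1 to every
    bigram count and |vocab| to the denominator, so unseen
    bigrams score 1 / (count(prev) + |vocab|) instead of 0."""
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + len(vocab))
```

The probabilities still sum to 1 for each history, so the smoothed table remains a valid conditional distribution.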
classification, clustering, input correction, sentiment analysis, information retrieval, question answering, machine translation, speech recognition, driving a car autonomously
P(the), P(der), P(rba) — characteristic word / letter-sequence probabilities per language
Naive Bayes
k-nearest neighbors
support vector machines
logistic regression
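Of these, Naive Bayes fits the word-based probabilistic framing most directly. A hypothetical two-class sketch (the tiny EN/DE training texts are made up for illustration): class-conditional word probabilities, add-one smoothed, combined under an independence assumption.

```python
from collections import Counter
import math

# Hypothetical tiny training sets (illustrative only)
docs = {
    "EN": "the cat sat on the mat".split(),
    "DE": "der Hund lag auf der Matte".split(),
}
counts = {c: Counter(ws) for c, ws in docs.items()}
vocab = set().union(*docs.values())

def log_score(cls, words):
    """Naive Bayes: sum of log P(w | class), add-one smoothed.
    A uniform class prior is assumed, so it is omitted."""
    total = sum(counts[cls].values())
    return sum(math.log((counts[cls][w] + 1) / (total + len(vocab)))
               for w in words)

def classify(words):
    return max(docs, key=lambda c: log_score(c, words))
```

Working in log space avoids underflow when multiplying many small probabilities.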
sort command
gzip command
(echo $(cat new EN | gzip | wc -c) EN; \
 echo $(cat new DE | gzip | wc -c) DE; \
 echo $(cat new AZ | gzip | wc -c) AZ) \
| sort -n | head -1
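The pipeline concatenates the new document with each language's corpus (files EN, DE, AZ), compresses, and picks the smallest byte count: the corpus that shares the most redundancy with the document compresses it best. The same idea in Python with zlib, using made-up in-memory stand-ins for the corpus files:

```python
import zlib

# Illustrative stand-ins for the EN and DE corpus files
corpora = {
    "EN": b"the cat sat on the mat the dog ate the bone " * 20,
    "DE": b"der Hund lag auf der Matte die Katze schlief " * 20,
}

def compressed_size(data):
    return len(zlib.compress(data, 9))

def identify(new):
    """Appending `new` to the corpus it resembles adds the fewest
    extra bytes after compression: shared patterns compress away."""
    return min(corpora,
               key=lambda lang: compressed_size(corpora[lang] + new))
```

This is a crude but surprisingly effective stand-in for a character n-gram language model, since gzip's dictionary matching exploits the same short-sequence statistics.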
S* = argmax P(w1:n) = argmax ∏i P(wi | w1:i-1)
S* = argmax ∏i P(wi)  (unigram approximation)
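A sketch of this argmax over candidate sentences under the unigram approximation (toy corpus and candidates, illustrative only): each candidate is scored by the product of its word probabilities, in log space, and the highest-scoring one wins.

```python
from collections import Counter
import math

corpus = "the cat sat on the mat the cat ate".split()
unigrams = Counter(corpus)
n = len(corpus)

def log_p_unigram(words):
    """log P(w_1:n) under the unigram model: sum of log P(w_i)."""
    return sum(math.log(unigrams[w] / n) for w in words)

# S* = argmax over candidate sentences of P(w_1:n)
candidates = [
    "the cat sat".split(),
    "mat ate on".split(),
]
best = max(candidates, key=log_p_unigram)
```

With a bigram model, log_p_unigram would simply be replaced by a sum of log P(wi | wi-1) terms; the argmax machinery stays the same.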