Natural Language Processing

Language Model
-probabilistic
-word-based
-learned

P(word1, word2, …)
L = {s1, s2, …}

In contrast: hand-coded logical trees

P(w1, w2, w3, …, wn) = P(w1:n) = ∏i P(wi | w1:i-1)   (chain rule)
Markov assumption:
P(wi | w1:i-1) ≈ P(wi | wi-k:i-1)

stationarity assumption: P(wi | wi-1) = P(wj | wj-1) (same distribution at every position)
smoothing: reserve probability mass for unseen n-grams (e.g. add-one/Laplace)
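A minimal Python sketch of these ideas: the chain rule with a bigram (k = 1) Markov assumption and add-one (Laplace) smoothing. The tiny corpus and function names are illustrative assumptions, not from the notes:

```python
from collections import Counter

def train_bigram(tokens, vocab):
    # Count unigrams and bigrams over the training tokens.
    uni = Counter(tokens)
    bi = Counter(zip(tokens, tokens[1:]))
    V = len(vocab)

    def prob(w, prev):
        # Markov assumption: P(wi | w1:i-1) ~ P(wi | wi-1).
        # Add-one smoothing: unseen bigrams still get mass > 0.
        return (bi[(prev, w)] + 1) / (uni[prev] + V)

    return prob

# Toy corpus (assumption for illustration).
tokens = "the cat sat on the mat".split()
vocab = set(tokens)
p = train_bigram(tokens, vocab)
```

With this corpus, p("cat", "the") = (1+1)/(2+5) = 2/7, and the smoothed distribution over the vocabulary given "the" sums to 1.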

Applications:
-classification
-clustering
-input correction
-sentiment analysis
-information retrieval
-question answering
-machine translation
-speech recognition
-driving a car autonomously

P(the), P(der), P(rba)

Naive Bayes
k-nearest neighbors
support vector machines
logistic regression
sort command
gzip command

# concatenate the new text with each corpus; smallest compressed size wins
(echo $(cat new EN | gzip | wc -c) EN; \
 echo $(cat new DE | gzip | wc -c) DE; \
 echo $(cat new AZ | gzip | wc -c) AZ) \
| sort -n | head -1
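The same compression trick can be sketched in Python with the standard gzip module: append the new text to each language's corpus and pick the language that adds the fewest compressed bytes (shared patterns with the right language compress better). The toy corpora below are assumptions for illustration:

```python
import gzip

def gzip_language_id(new_text, corpora):
    # Extra compressed bytes the new text costs when appended
    # to each corpus; the best-matching language costs least.
    def cost(corpus):
        base = len(gzip.compress(corpus.encode()))
        joint = len(gzip.compress((corpus + new_text).encode()))
        return joint - base
    return min(corpora, key=lambda lang: cost(corpora[lang]))

# Toy corpora (assumption for illustration).
corpora = {
    "EN": "the cat sat on the mat and the dog " * 40,
    "DE": "der Hund und die Katze sitzen auf der Matte " * 40,
}
print(gzip_language_id("the dog sat on the mat", corpora))
```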

S* = argmax P(w1:n) = argmax ∏i P(wi | w1:i-1)
S* = argmax ∏i P(wi)   (unigram approximation)
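Choosing the most probable sentence (as in input correction) can be sketched as an argmax over candidate scores under a bigram model; summing log-probabilities avoids underflow from multiplying many small terms. The corpus, the `<s>` start symbol, and the candidates below are illustrative assumptions:

```python
from collections import Counter
import math

# Toy training data (assumption for illustration).
tokens = "the cat sat on the mat".split()
vocab = set(tokens) | {"<s>"}
uni = Counter(tokens + ["<s>"])
bi = Counter(zip(["<s>"] + tokens, tokens))

def score(sentence):
    # log P(S) = sum_i log P(wi | wi-1), add-one smoothed.
    prev, logp = "<s>", 0.0
    for w in sentence.split():
        logp += math.log((bi[(prev, w)] + 1) / (uni[prev] + len(vocab)))
        prev = w
    return logp

candidates = ["the cat sat", "cat the sat"]
best = max(candidates, key=score)
```

The word-order-preserving candidate wins because its bigrams were all seen in training.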