Before: learn a function f(x) from the training data
Now: just store the data; f(x) = lookup(x)
+ remembers the data exactly, fast, simple
– no generalization, overfits noise (see the sketch below)
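A minimal sketch of the lookup idea in Python (the table values are made up for illustration):

```python
# Instance-based "learning": memorize the training pairs verbatim.
D = {1: 4, 2: 5, 3: 8}   # toy training data x -> y (made-up values)

def f(x):
    return D[x]          # fast and exact on points we have seen

f(2)    # -> 5: perfect recall of the training data
f(2.5)  # KeyError: pure lookup cannot generalize to unseen x
```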
Distance stands for similarity.
K-NN
Given: training data D = {(xi, yi)}
       distance metric d(q, x)
       number of neighbors k
       query point q
– NN = {i : d(q, xi) among the k smallest distances}
– Return (see the sketch below):
  – classification: plurality vote of the yi in NN
  – regression: mean of the yi in NN
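A minimal Python sketch of the procedure above, assuming NumPy and Euclidean distance for d (the metric choice is an assumption, not fixed by the notes):

```python
import numpy as np
from collections import Counter

def knn_predict(X, y, q, k, classification=True):
    """k-NN as above: find the k nearest training points to q, then
    return the plurality label (classification) or the mean (regression)."""
    dists = np.linalg.norm(X - q, axis=1)   # d(q, xi), Euclidean here
    nn = np.argsort(dists)[:k]              # indices of the k smallest
    if classification:
        return Counter(y[nn]).most_common(1)[0][0]  # plurality vote
    return y[nn].mean()                             # regression: mean

# Toy usage (made-up data): two of the three nearest points are labeled 1.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])
knn_predict(X, y, q=np.array([1.6]), k=3)   # -> 1
```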
Preference Bias
+ locality -> nearby points are similar
+ smoothness -> averaging over neighbors is sensible
+ all features matter equally (but see the caveat below)
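The last assumption is fragile when features live on different scales: with a plain Euclidean distance, the largest-scale feature dominates. A small illustration (values made up):

```python
import numpy as np

# Two points: feature 0 differs a lot in relative terms,
# feature 1 barely does but lives on a much larger scale.
a = np.array([0.1, 5000.0])
b = np.array([0.9, 5003.0])

# The distance is dominated by the large-scale feature, so feature 0
# barely "matters" even though the bias assumes it should.
print(np.linalg.norm(a - b))   # ~3.1, almost entirely from feature 1
```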
Curse of dimensionality
As the number of features or dimensions grows, the amount of data we need to generalize accurately grows exponentially.
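One standard way to see the effect (a simulation I'm adding, not from the notes): for points drawn uniformly in [0, 1]^d, the nearest and farthest neighbors of a query become almost equally far away as d grows, so "nearest" carries less and less information.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
for d in [1, 10, 100, 1000]:
    X = rng.random((n, d))                  # n points uniform in [0, 1]^d
    q = rng.random(d)                       # a random query point
    dists = np.linalg.norm(X - q, axis=1)
    # Ratio of nearest to farthest distance: -> 1 as d grows,
    # i.e. all points look about equally far away.
    print(d, round(dists.min() / dists.max(), 3))
```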