Instance-Based Learning

Before (eager learning):
(x1, y1), (x2, y2), … (xn, yn) => learn f(x)

Now (instance-based / lazy learning):
store (x1, y1), … (xn, yn)
f(x) = lookup(x)

+ remembers the training data exactly, fast (no training step), simple
– no generalization beyond stored points, prone to overfitting
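
A minimal sketch of the lookup idea above (data values and names here are illustrative, not from the source):

```python
# A minimal instance-based "learner": training is just storing the data,
# and all work happens at query time. Exact-match lookup only; k-NN
# below relaxes this to "closest match" via a distance metric.

training_data = {(1, 2): "A", (3, 4): "B", (5, 6): "A"}  # x -> y

def f(x):
    """Predict by looking the query up in the stored examples."""
    return training_data.get(x)  # None if this exact x was never seen

print(f((3, 4)))  # "B"
print(f((2, 2)))  # None -- no generalization to unseen points
```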

distance stands for similarity: smaller d(q, x) => more similar (e.g., Euclidean distance)

K-NN
Given: training data D = {(xi, yi)}
       distance metric d(q, x)
       number of neighbors k
       query point q
– NN = {i : d(q, xi) is among the k smallest}
– Return
  – classification: plurality vote over {yi : i in NN}
  – regression: mean of {yi : i in NN}
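
A runnable sketch of the procedure above, assuming Euclidean distance and arbitrary tie-breaking (function and variable names are illustrative):

```python
import math
from collections import Counter

def knn_predict(D, q, k, classify=True):
    """k-NN: find the k training points closest to query q, then return
    the plurality label (classification) or the mean target (regression)."""
    # Euclidean distance as the similarity stand-in
    def d(a, b):
        return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

    # NN = the k training pairs with the smallest distance to q
    neighbors = sorted(D, key=lambda xy: d(q, xy[0]))[:k]
    ys = [y for _, y in neighbors]

    if classify:
        return Counter(ys).most_common(1)[0][0]  # plurality vote
    return sum(ys) / len(ys)                     # mean

D = [((1.0, 1.0), "A"), ((1.2, 0.9), "A"), ((5.0, 5.0), "B"), ((4.8, 5.2), "B")]
print(knn_predict(D, q=(1.1, 1.0), k=3))  # "A"
```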

Preference Bias
+ locality -> nearby points are similar
+ smoothness -> averaging over neighbors
+ all features matter equally (distance weights every dimension the same; see the sketch below)
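
The "all features equal" assumption breaks when features sit on different scales; a small illustration (values made up) of one large-scale feature swamping a Euclidean distance:

```python
import math

def euclid(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

# Two features on very different scales: feature 0 in [0, 1], feature 1 in dollars.
q  = (0.10, 50_000)
x1 = (0.95, 50_100)   # very different on feature 0, close on feature 1
x2 = (0.11, 52_000)   # nearly identical on feature 0, farther on feature 1

print(euclid(q, x1))  # ~100   -> ranked "near" despite feature 0 mismatch
print(euclid(q, x2))  # ~2000  -> ranked "far" despite feature 0 match
# The dollar feature dominates; rescaling features (e.g., to [0, 1])
# restores the equal-weight assumption.
```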

Curse of Dimensionality
As the number of features (dimensions) grows, the amount of data we need to generalize accurately grows exponentially.
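
A quick numeric illustration (a sketch, not a proof), assuming points drawn uniformly in the unit hypercube: as dimension grows, nearest and farthest neighbors become almost equally far away, so "nearest" carries less information.

```python
import math
import random

def min_max_ratio(dim, n_points=500, seed=0):
    """Ratio of nearest to farthest distance from the origin for uniform
    points in [0, 1]^dim; approaches 1 as dim grows (distances concentrate)."""
    rng = random.Random(seed)
    pts = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    dists = [math.sqrt(sum(c * c for c in p)) for p in pts]
    return min(dists) / max(dists)

for dim in (1, 10, 100, 1000):
    print(dim, round(min_max_ratio(dim), 3))
# Output climbs toward 1.0 with dimension (exact values vary with the seed):
# near 0 at dim=1, roughly 0.9 by dim=1000.
```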