Feature scaling
– try to determine Chris’s t-shirt size: 140 lbs, 6.1ft
– training set: Cameron (175 lbs, 5.9 ft), Sarah (115 lbs, 5.2 ft)
measure height + weight
-> who is Chris closer to in height + weight?
Cameron (large shirt), Sarah (small shirt)
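A quick sketch of that naive comparison, assuming "closer" means the smaller absolute gap between height + weight totals. The variable names are mine, for illustration only; the numbers are the ones above.

```python
# Naive comparison: add weight and height, then see whose total Chris is closer to.
# Values come from the notes; the names (people, gaps, closest) are illustrative.
people = {
    "Cameron": (175.0, 5.9),  # (weight in lbs, height in ft), large shirt
    "Sarah": (115.0, 5.2),    # small shirt
}
chris = (140.0, 6.1)

chris_total = sum(chris)
# Absolute gap between each person's total and Chris's total.
gaps = {name: abs(sum(features) - chris_total) for name, features in people.items()}
closest = min(gaps, key=gaps.get)

print(gaps)
print(closest)
```

The totals are 146.1 for Chris, 180.9 for Cameron and 120.2 for Sarah, so Chris lands closer to Sarah (small shirt) even though he is the tallest of the three. Weight in lbs swamps height in ft, which is the problem feature scaling addresses.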
Feature Scaling
X' = (X - Xmin) / (Xmax - Xmin)
weights: [115, 140, 175] -> Xmin = 115, Xmax = 175
X' for 140: (140 - 115) / (175 - 115) = 25 / 60 = 0.417
0 <= X' <= 1
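The formula can also be written out by hand. This is a minimal sketch with numpy; the function name min_max_rescale and the choice to raise an error when all values are equal are my assumptions, not part of the notes.

```python
import numpy

def min_max_rescale(values):
    """Rescale a 1-D sequence to [0, 1] using X' = (X - Xmin) / (Xmax - Xmin)."""
    arr = numpy.asarray(values, dtype=float)
    x_min, x_max = arr.min(), arr.max()
    if x_max == x_min:
        # The formula divides by zero when every value is the same.
        # Raising here is an illustrative choice.
        raise ValueError("all values are identical, cannot rescale")
    return (arr - x_min) / (x_max - x_min)

rescaled = min_max_rescale([115, 140, 175])
print(rescaled)
```

For the weights above this gives 0 for 115, about 0.417 for 140 and 1 for 175.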
[python]
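from sklearn.preprocessing import MinMaxScaler
import numpy

# MinMaxScaler expects a 2-D array: one row per sample, one column per feature.
# Use floats so the scaler does not have to convert integer input.
weights = numpy.array([[115.], [140.], [175.]])

scaler = MinMaxScaler()
# fit_transform finds the min and max of each column, then rescales to [0, 1].
rescaled_weight = scaler.fit_transform(weights)
print(rescaled_weight)  # 115 -> 0, 140 -> 0.417, 175 -> 1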
[/python]
Which algorithms would be affected by feature rescaling?
- SVM with RBF kernel
- K-means clustering
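Both of these work with distances between points across all features, so changing the scale of one feature changes the answer. Here is a rough sketch of that effect on the t-shirt data, using plain Euclidean distance rather than an actual SVM or K-means run. The helper nearest_to_chris is hypothetical, and I am assuming the min/max is taken over all three people, as in the [115, 140, 175] example.

```python
import numpy

# Rows: Cameron, Sarah, Chris. Columns: weight (lbs), height (ft). Values from the notes.
data = numpy.array([[175.0, 5.9],
                    [115.0, 5.2],
                    [140.0, 6.1]])
names = ["Cameron", "Sarah"]

def nearest_to_chris(features):
    """Return whoever is closest to Chris (last row) by Euclidean distance."""
    chris = features[-1]
    distances = numpy.linalg.norm(features[:-1] - chris, axis=1)
    return names[int(numpy.argmin(distances))]

# Min/max rescale each column to [0, 1] (assumption: fitted on all three rows).
col_min = data.min(axis=0)
col_max = data.max(axis=0)
scaled = (data - col_min) / (col_max - col_min)

raw_match = nearest_to_chris(data)
scaled_match = nearest_to_chris(scaled)
print(raw_match, scaled_match)
```

On the raw numbers Chris is nearest to Sarah, because the weight column dominates the distance. After rescaling, height counts as much as weight and Chris is nearest to Cameron.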