Information Gain

information gain = entropy(parent) - [weighted average] entropy(children)
decision tree algorithm: maximize information gain
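
A minimal Python sketch of the formula above, assuming each node is represented as a plain list of class labels. The function names `entropy` and `information_gain` and the "slow"/"fast" labels are illustrative choices, not taken from any library.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    # p * log2(1/p) is the same as -p * log2(p), summed over each class
    return sum((c / n) * math.log2(n / c) for c in Counter(labels).values())

def information_gain(parent, children):
    """entropy(parent) minus the size-weighted average entropy of the children."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted

# Hypothetical toy data: a split that separates the two classes perfectly
parent = ["slow", "slow", "fast", "fast"]
perfect_split = [["slow", "slow"], ["fast", "fast"]]
print(information_gain(parent, perfect_split))
```

A tree-building algorithm would evaluate this quantity for every candidate split and keep the one with the largest value.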

>>> import math
>>> -2/3*math.log(2/3, 2) - 1/3*math.log(1/3, 2)

entropy(children) = 3/4(0.9183) + 1/4(0) = 0.6887
information gain = entropy(parent) - entropy(children) = 1.0 - 0.6887 = 0.3113
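
The arithmetic above can be checked in a few lines of Python. The class proportions here are assumptions chosen to match the fractions in the example: a parent split evenly between two classes (entropy 1.0), a child holding 3/4 of the examples with a 2:1 class mix, and a pure child holding the remaining 1/4. The helper name `H` is hypothetical.

```python
import math

def H(*probs):
    """Base-2 entropy of a discrete distribution given as probabilities."""
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

parent_entropy = H(1/2, 1/2)   # assumed: parent is an even two-class split
mixed_child = H(2/3, 1/3)      # the child with a 2:1 class mix
pure_child = H(1.0)            # a single-class child has zero entropy

# Weight each child's entropy by its share of the parent's examples
children_entropy = 3/4 * mixed_child + 1/4 * pure_child
gain = parent_entropy - children_entropy

print(round(mixed_child, 4), round(children_entropy, 4), round(gain, 4))
```

The printed values should agree with the figures above to about three decimal places.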