technology

MDPs
POMDPs, Belief Space
Reinforcement Learning
A*; h function; Monte Carlo

chess, go, robot soccer, poker, hide-and-go-seek, card solitaire, minesweeper

s, p, actions(s, p), result(s, a), terminal(s), u(s, p)
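
A minimal sketch of how these pieces might fit together, using a toy Nim game as a stand-in. The choice of game, the NimState type, and every function body are assumptions for illustration; only the names s, p, actions, result, terminal, and u come from the notes.

```python
from typing import List, NamedTuple


class NimState(NamedTuple):
    """Hypothetical state s for a one-pile Nim game."""
    stones: int  # stones left in the pile
    player: int  # player to move: 0 or 1


def actions(s: NimState, p: int) -> List[int]:
    """Legal moves for player p in state s: take 1, 2, or 3 stones."""
    if p != s.player:
        return []  # not p's turn, so p has no moves
    return [n for n in (1, 2, 3) if n <= s.stones]


def result(s: NimState, a: int) -> NimState:
    """State reached after the player to move takes a stones."""
    return NimState(s.stones - a, 1 - s.player)


def terminal(s: NimState) -> bool:
    """The game is over once the pile is empty."""
    return s.stones == 0


def u(s: NimState, p: int) -> int:
    """Utility of terminal state s for player p (+1 win, -1 loss)."""
    # Assumed rule: whoever took the last stone wins,
    # so the player who is now to move has lost.
    return -1 if p == s.player else 1
```

In this sketch the utilities of the two players always sum to zero at a terminal state, which is what makes the toy game zero-sum.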

deterministic, two-player, zero-sum

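def maxValue(s):
    m = float("-inf")              # best value found so far
    for (a, s2) in successors(s):  # action a leads to successor state s2
        v = value(s2)              # value() is assumed to handle terminal, max, and min states
        m = max(m, v)
    return m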

complexity
O(b^m) time, where b is the branching factor and m is the maximum depth of the game tree
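
A small sketch of where the exponential cost comes from, assuming the bound meant here is the usual minimax time bound O(b^m) with branching factor b and depth m. The count_nodes helper and the uniform-tree assumption (every state has exactly b successors, every game lasts exactly m moves) are hypothetical simplifications.

```python
def count_nodes(b: int, m: int) -> int:
    """States visited by a full search of a uniform tree with
    branching factor b and depth m (root counted, leaves at depth m)."""
    if m == 0:
        return 1  # a single leaf
    # the current state plus b subtrees, each one level shallower
    return 1 + b * count_nodes(b, m - 1)


# Compare the visited count with b^m for a few depths.
for m in range(1, 5):
    print(m, count_nodes(3, m), 3 ** m)
```

The visited count is 1 + b + b^2 + ... + b^m, which is dominated by the b^m leaves, so each extra level of depth multiplies the work by roughly b.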