Planning under uncertainty

Planning under uncertainty and learning
MDPs, POMDPs

deterministic, stochastic

fully observable, deterministic: A*, breadth-first, depth-first search
fully observable, stochastic: MDP
partially observable: POMDP

Markov decision process (MDP):
states s, actions a
state transition probabilities T(s,a,s')
reward function R(s)
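The pieces above can be sketched as plain data structures. A minimal example, assuming a toy two-state MDP (state names, action names, and the helper `transition_prob` are illustrative, not from any library):

```python
# Tiny two-state MDP: states, actions, transitions T(s,a,s'), rewards R(s).
states = ["s0", "s1"]
actions = ["stay", "go"]

# Stochastic transition model T(s, a, s'): probability of landing in s'
# after taking action a in state s.
T = {
    ("s0", "stay", "s0"): 0.9, ("s0", "stay", "s1"): 0.1,
    ("s0", "go",   "s1"): 0.8, ("s0", "go",   "s0"): 0.2,
    ("s1", "stay", "s1"): 1.0,
    ("s1", "go",   "s0"): 1.0,
}

# Reward function R(s): a payoff attached to each state.
R = {"s0": 0.0, "s1": 1.0}

def transition_prob(s, a, s_next):
    """Look up T(s, a, s'), defaulting to 0 for impossible transitions."""
    return T.get((s, a, s_next), 0.0)

# Sanity check: probabilities out of each (s, a) pair sum to 1.
for s in states:
    for a in actions:
        total = sum(transition_prob(s, a, s2) for s2 in states)
        assert abs(total - 1.0) < 1e-9
```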

MDP gridworld
policy π(s) → a
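A policy maps every state to one action. A sketch on an assumed 3x3 gridworld (the layout, goal cell, and action names are made up for illustration): go up to the top row, then right to the goal.

```python
# 3x3 gridworld; states are (row, col) cells, goal in the top-right corner.
goal = (0, 2)
states = [(r, c) for r in range(3) for c in range(3)]

def pi(state):
    """pi(s) -> a: a deterministic policy defined for every state."""
    r, c = state
    if state == goal:
        return "stay"
    if r > 0:          # not yet in the top row: move up
        return "up"
    return "right"     # in the top row: move right toward the goal

# A policy can equivalently be stored as a table: one action per state.
policy = {s: pi(s) for s in states}
```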

problems with conventional search trees in stochastic domains:
tree too deep (stochastic outcomes)
branching factor large
many states visited more than once
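A quick count makes the last two problems concrete. Assuming a 3x3 gridworld with four moves per state (toy numbers, not from the source): the search tree has b^d nodes at depth d, while the number of distinct states stays tiny, so almost every tree node revisits a state already seen.

```python
b = 4                 # branching factor: up, down, left, right
d = 10                # search depth
grid_states = 3 * 3   # a 3x3 gridworld has only 9 distinct states

tree_leaves = b ** d  # leaves alone at depth 10: 4**10 = 1,048,576
print(tree_leaves, "tree leaves vs", grid_states, "distinct states")
```

Working over the 9 states directly (as MDP methods do) sidesteps the exponential blowup.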