https://inst.eecs.berkeley.edu/~cs188/sp23/assets/notes/cs188-sp23-note11.pdf

https://inst.eecs.berkeley.edu/~cs188/sp23/assets/notes/cs188-sp23-note12.pdf

In deterministic search, we wanted an optimal plan from a start state s to a goal state g.

For an MDP, we instead want an optimal policy π*: S → A, mapping each state to the action that maximizes expected discounted utility.

MDPs (Markov Decision Processes): states S, actions A, a transition model T(s, a, s′) = P(s′ | s, a), a reward function R(s, a, s′), a start state, and a discount factor γ.
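As a concrete anchor for the algorithms below, a sketch of the race-car example from the slides (states cool/warm/overheated, actions slow/fast) as plain Python dicts; the encoding and γ = 0.9 are my own assumptions, not CS 188 starter code:

```python
# Race-car MDP from the slides, encoded as plain dicts (a sketch, not CS 188 code).
# T[(state, action)] is a list of (next_state, probability, reward) triples.
GAMMA = 0.9  # assumed discount; the slides treat gamma as a parameter

STATES = ["cool", "warm", "overheated"]
T = {
    ("cool", "slow"): [("cool", 1.0, 1.0)],
    ("cool", "fast"): [("cool", 0.5, 2.0), ("warm", 0.5, 2.0)],
    ("warm", "slow"): [("cool", 0.5, 1.0), ("warm", 0.5, 1.0)],
    ("warm", "fast"): [("overheated", 1.0, -10.0)],
    # "overheated" is terminal: no entries, hence no legal actions.
}

def actions_for(s):
    """Legal actions in state s (empty list for terminal states)."""
    return [a for (s2, a) in T if s2 == s]
```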

Bellman Equation
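The optimality equations these notes refer to, in the CS 188 notation (V* for optimal state values, Q* for optimal Q-values):

V^*(s) = \max_a \sum_{s'} T(s, a, s') \left[ R(s, a, s') + \gamma V^*(s') \right]

Q^*(s, a) = \sum_{s'} T(s, a, s') \left[ R(s, a, s') + \gamma \max_{a'} Q^*(s', a') \right]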

Value Iteration, Q-Value Iteration
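A minimal sketch of both iterations, assuming the toy MDP dicts above; a fixed iteration count stands in for a real convergence test:

```python
def value_iteration(iters=100, gamma=GAMMA):
    """Bellman update: V_{k+1}(s) = max_a sum_{s'} T(s,a,s') [R(s,a,s') + gamma V_k(s')]."""
    V = {s: 0.0 for s in STATES}
    for _ in range(iters):
        V = {s: max((sum(p * (r + gamma * V[s2]) for s2, p, r in T[(s, a)])
                     for a in actions_for(s)),
                    default=0.0)  # terminal states keep value 0
             for s in STATES}
    return V

def q_value_iteration(iters=100, gamma=GAMMA):
    """Q_{k+1}(s,a) = sum_{s'} T(s,a,s') [R(s,a,s') + gamma max_{a'} Q_k(s',a')]."""
    Q = {sa: 0.0 for sa in T}
    for _ in range(iters):
        Q = {(s, a): sum(p * (r + gamma * max((Q[(s2, a2)] for a2 in actions_for(s2)),
                                              default=0.0))  # terminal s2 contributes 0
                         for s2, p, r in T[(s, a)])
             for (s, a) in T}
    return Q
```

Both are batch updates: each dict comprehension reads only the previous table on its right-hand side, so every state is updated from the same V_k (or Q_k).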

Policy Evaluation, Extraction, Iteration
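Sketches of the three policy routines, again against the toy MDP above; the function names and signatures are my own assumptions, not the notes' API:

```python
def policy_evaluation(pi, iters=100, gamma=GAMMA):
    """Fixed-policy Bellman update: no max, since pi picks the action in each state."""
    V = {s: 0.0 for s in STATES}
    for _ in range(iters):
        V = {s: sum(p * (r + gamma * V[s2])
                    for s2, p, r in T.get((s, pi.get(s)), []))  # terminal s sums to 0
             for s in STATES}
    return V

def extract_policy(V, gamma=GAMMA):
    """Greedy one-step lookahead: pi(s) = argmax_a sum_{s'} T [R + gamma V(s')]."""
    return {s: max(actions_for(s),
                   key=lambda a: sum(p * (r + gamma * V[s2])
                                     for s2, p, r in T[(s, a)]))
            for s in STATES if actions_for(s)}

def policy_iteration():
    """Alternate evaluation and greedy improvement until the policy is stable."""
    pi = {s: actions_for(s)[0] for s in STATES if actions_for(s)}  # arbitrary start
    while True:
        V = policy_evaluation(pi)
        new_pi = extract_policy(V)
        if new_pi == pi:
            return pi, V
        pi = new_pi
```

On the race-car MDP this stabilizes after a couple of improvement steps (slow everywhere → fast when cool, slow when warm), which illustrates why policy iteration typically needs far fewer outer rounds than value iteration needs updates.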