7/18 - MDPs, Bellman Equations, Q-values, Policies

7/19 - Reinforcement Learning (RL), Bandit Problems, Regret

7/20 - RL: TD Learning, Q-learning, Policy Search