Video
Description
Disclaimer: This is the most mathematical lecture in the StarAi series. Whilst we endeavoured to make the StarAi content as accessible as possible, this particular lecture covers the fundamentals and therefore contains the most formulas. If formulas are not for you, please proceed to week 3. If, however, you would like to dive deeper into the mathematical formulation of the RL framework, stick around for lecture 2! We also highly recommend David Silver’s excellent course on YouTube.
In this lecture you will learn the fundamentals of Reinforcement Learning. We start by discussing the Markov environment and its properties, gradually building intuition for the Markov Decision Process and its elements, such as the state-value function, the action-value function and policies. We then move on to the Bellman equations and the intuition behind them. Finally, we explore one way of solving the Bellman equations, the Dynamic Programming approach, and finish with an exercise in which you will implement algorithms for the state-value and action-value functions and find an optimal policy to solve the Gridworld problem.
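To give a flavour of the Dynamic Programming approach before the exercises, here is a minimal sketch of iterative policy evaluation, which repeatedly applies the Bellman expectation equation until the state values stop changing. The setup is an assumption for illustration and may differ from the exercise notebooks: a 4x4 Gridworld with terminal states in two opposite corners, a reward of -1 per step, a uniform random policy, and no discounting. The names `step` and `policy_evaluation` are hypothetical helpers, not part of the StarAi code.

```python
import numpy as np

N = 4                                         # assumed 4x4 grid
TERMINALS = {(0, 0), (N - 1, N - 1)}          # assumed terminal corners
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Deterministic move; stepping off the grid leaves the state unchanged."""
    r, c = state
    nr, nc = r + action[0], c + action[1]
    if not (0 <= nr < N and 0 <= nc < N):
        nr, nc = r, c
    return (nr, nc), -1.0                     # assumed reward of -1 per step


def policy_evaluation(gamma=1.0, theta=1e-8):
    """Evaluate the uniform random policy with synchronous Bellman backups."""
    v = np.zeros((N, N))
    while True:
        new_v = np.zeros_like(v)              # terminal states stay at 0
        delta = 0.0
        for r in range(N):
            for c in range(N):
                if (r, c) in TERMINALS:
                    continue
                total = 0.0
                for a in ACTIONS:
                    (nr, nc), reward = step((r, c), a)
                    # each action has probability 1/4 under the random policy
                    total += 0.25 * (reward + gamma * v[nr, nc])
                new_v[r, c] = total
                delta = max(delta, abs(total - v[r, c]))
        v = new_v
        if delta < theta:                     # stop once a sweep changes little
            return v


if __name__ == "__main__":
    print(np.round(policy_evaluation(), 1))
```

Under these assumptions the values are negative everywhere except the terminal states, and they decrease with distance from the nearest terminal corner. Acting greedily with respect to such a value function is the improvement step that Policy Iteration alternates with evaluation.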
Lecture Slides
StarAi Lecture 2 Markov Decision Processes slides
Exercise
Follow the links below to access the exercises for lecture 2:
Lecture 2 Exercise 1: Policy Evaluation Exercise
Lecture 2 Exercise 2: Policy Iteration Exercise
Lecture 2 Exercise 3: Value Iteration Exercise
Exercise Solutions
Follow the links below to access the exercise solutions for lecture 2:
Exercise Solutions 1: Policy Evaluation
Exercise Solutions: Value Iteration
Additional Learning Material
- Sutton & Barto’s Reinforcement Learning: An Introduction, all of Chapters 3 and 4.