Basic Mathematical Background: Please review this cribsheet to make sure you understand the concepts therein. You may also find these resources useful as occasional reference material.
On Using MATLAB: See the handout Introduction to MATLAB (you may ignore the section about NETLAB). A further MATLAB tutorial is available at MTU Introduction to Matlab.
Note that the coursework will also require other tools and programming environments, which will be introduced and explained in lectures.
Date: |
Lecture content: |
Assignments and Deadlines: |
January 17, 2017 |
Introduction Slides (pdf) Reading: Ch 1 of Sutton & Barto book (1st ed.) |
January 20, 2017 |
Multi-armed Bandits; Review of Markov Chains; Introduction to Markov Decision Processes Slides (pdf) Reading: Ch 2, 3 of Sutton & Barto book (1st ed.) |
January 24, 2017 |
Intro to MDPs Contd. Reading: Ch 2, 3 of Sutton & Barto book (1st ed.) |
January 27, 2017 |
Dynamic Programming: Policy and Value Iteration; Monte Carlo methods Slides (pdf) Reading: Ch 4, 5 of Sutton & Barto book (1st ed.) |
January 31, 2017 |
Temporal Difference Methods Slides (pdf) Reading: Ch 6 of Sutton & Barto book (1st ed.) |
February 3, 2017 |
Discussion of On-policy/Off-policy Learning; TD Methods Contd. Slides (pdf) Reading: Ch 5, 6 of Sutton & Barto book (1st ed.) |
February 7, 2017 |
[Tutorial] Worked examples Outline questions (pdf) |
Course Assignment 1 |
February 10, 2017 |
[Tutorial] Worked examples, continued |
February 14, 2017 |
[Tutorial] Introduction to the Arcade Learning Environment Reference: ALE Website Slides (pdf) |
February 17, 2017 |
[Tutorial] Q+A regarding tools for HW1 |
February 28, 2017 |
Generalization and Function Approximation Slides (pdf) Reading: Ch 8 of Sutton & Barto book (1st ed.) |
March 2, 2017 |
Assignment 1 Due (4 pm, submit electronically and hand in hardcopy to ITO) |
March 3, 2017 |
Abstraction: Options and Hierarchy Slides (pdf) Reading: Case study, Sec 11.4 (Elevator Dispatching) in print version of S+B book Optional Readings: 1. R.S. Sutton, D. Precup, S. Singh, Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning, Artificial Intelligence, Vol. 112, pp. 181-211, 1999. (ElsevierLink) 2. A.G. Barto, S. Mahadevan, Recent Advances in Hierarchical Reinforcement Learning, Discrete Event Dynamic Systems 13(4):341-379, 2003. You can get the article via SpringerLink or get the preprint version here. |
Course Assignment 2 |
March 7, 2017 |
Partial Observability and the Partially Observed Markov Decision Process (POMDP) Slides (pdf) (based on material associated with Thrun et al. book) Optional Reading: Chapter 15 of S. Thrun, W. Burgard, D. Fox, Probabilistic Robotics, MIT Press. |
March 10, 2017 |
POMDPs Contd. |
March 14, 2017 |
Inverse Reinforcement Learning Slides (pdf) Optional Reading: A.Y. Ng, S.J. Russell, Algorithms for inverse reinforcement learning. In Proc. ICML, pp. 663-670, 2000. Preprint here. |
March 17, 2017 |
[Tutorial] Discussion and tools for Assignment 2 Slides (pdf) |
March 21, 2017 |
Exploration and Controlled Sensing Slides (pdf) |
March 24, 2017 |
[Office Hour with TA] |
March 28, 2017 |
Multi-agent Reinforcement Learning Slides (pdf) Optional Reading: M. Bowling, M. Veloso, An analysis of stochastic game theory for multiagent reinforcement learning, CMU Technical Report CMU-CS-00-165, 2000. |
Assignment 2 Due (4 pm, submit electronically and hand in hardcopy to ITO) |
March 31, 2017 |
Policy Optimization [Not examinable] Slides (pdf) |
April 4, 2017 |
Deep Reinforcement Learning [Not examinable] Slides (pdf) Optional Reading: V. Mnih et al., Human-level control through deep reinforcement learning, Nature 518:529-533, 2015. Optional Reading: A. Tamar et al., Value iteration networks, In Proc. NIPS, 2016. Slides |
April 7, 2017 |
[Tutorial] Q+A and Review for Exam Slides (pdf) |
Informatics Forum, 10 Crichton Street, Edinburgh, EH8 9AB, Scotland, UK
Unless explicitly stated otherwise, all material is copyright © The University of Edinburgh