E1 277 Reinforcement Learning 3:1 (August 2022)

Course Instructor: Shalabh Bhatnagar, CSA

Course description: Reinforcement learning refers to a class of techniques that combine aspects of optimal control, simulation/data-driven optimization, and approximation methods for problems of dynamic decision making under uncertainty when the model of the underlying system and its processes is unknown. A large portion of the algorithms and techniques used here are model-free in nature and, as a result, need no knowledge of the system dynamics and protocols used. Reinforcement Learning thus finds applications in several diverse areas such as Adaptive Control, Signal Processing, Manufacturing, Communication and Wireless Networks, Autonomous Systems and Data Mining.

The objective of the course is to provide a rigorous foundation in Reinforcement Learning through the various tools, techniques and algorithms used, and to cover state-of-the-art algorithms in Deep Reinforcement Learning involving simulation-based neural network methods.

Syllabus

Introduction to Reinforcement Learning, Multi-armed bandits, Markov decision processes, Dynamic Programming - Value and Policy Iteration Methods, Model-Free Learning Approaches, Monte-Carlo Methods, Temporal Difference Learning, Q-learning, SARSA, Double Q-learning, Value Function Approximation Methods - TD Learning with Linear Function Approximation, Neural Network Architectures, Deep Q-Network Algorithm, Policy Gradient Methods, Actor-Critic Algorithms.

Textbooks / References

  1. R. Sutton and A. Barto, Reinforcement Learning, MIT Press, 2nd Ed., 2018
  2. D. Bertsekas, Reinforcement Learning and Optimal Control, Athena Scientific, 2019
  3. Selected Recent Papers

Prerequisites: None

Grading:

  • Homework 20%
  • Midterm 25%
  • Course project 25%
  • Final exam 30%