About this Course

Reinforcement learning (RL) is an area of machine learning concerned with how agents should take actions in an environment so as to maximize some notion of cumulative reward. In real-world applications, RL typically amounts to the optimal control of an autonomous system. Reinforcement learning is considered one of the three basic machine learning paradigms, alongside supervised learning and unsupervised learning.
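To make the idea of an agent acting in an environment to accumulate reward concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than course material: the toy environment `CoinFlipEnv`, the helper `run_episode`, the fixed policy, and the discount factor `gamma` are all hypothetical names chosen for this example.

```python
import random


class CoinFlipEnv:
    """Hypothetical toy environment: guess the outcome of a biased coin."""

    def __init__(self, p_heads=0.8, horizon=10, seed=0):
        self.rng = random.Random(seed)
        self.p_heads = p_heads  # probability that the coin shows heads
        self.horizon = horizon  # number of steps in one episode
        self.t = 0

    def reset(self):
        # The state is simply the current time step.
        self.t = 0
        return self.t

    def step(self, action):
        # action: 0 = guess tails, 1 = guess heads
        outcome = 1 if self.rng.random() < self.p_heads else 0
        reward = 1.0 if action == outcome else 0.0
        self.t += 1
        done = self.t >= self.horizon
        return self.t, reward, done


def run_episode(env, policy, gamma=1.0):
    """Run one episode and return the (discounted) cumulative reward."""
    state = env.reset()
    total, discount, done = 0.0, 1.0, False
    while not done:
        action = policy(state)                  # agent picks an action
        state, reward, done = env.step(action)  # environment responds
        total += discount * reward              # accumulate reward
        discount *= gamma
    return total


def always_heads(state):
    # A fixed, non-learning policy used only for illustration.
    return 1


if __name__ == "__main__":
    print(run_episode(CoinFlipEnv(), always_heads))
```

The policy here is fixed and never learns; the methods listed in the outline below are different ways of improving such a policy so that the cumulative reward grows.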

Main Contents

The main contents of the course, by level, are as follows:

Level 1

Introduction to Reinforcement Learning

  • Model Free and Model Based Learning
  • Probability Distribution
  • Stationary and Non-stationary
  • Policy
  • Value Function
  • Q-Function

Value and Policy Iteration

  • Value Iteration
  • Policy Iteration

Prediction Problems

  • Monte Carlo Learning
  • TD Learning
  • TD (Lambda)

Policy Control Problems

  • Q-Learning
  • Bellman Equation
  • Deep Q-Learning (DQN)
  • Case Studies

Improvements to DQN

  • Double DQN
  • Dueling Network Architecture
  • Soft Q-Learning
  • Recent Papers on Improvements of DQN


  • Sarsa Algorithm
  • Q-Learning vs Sarsa

Level 2

Policy Gradient Methods

  • Introduction to Policy Gradient Methods
  • Vanilla Policy Gradient
  • Actor-Critic
  • A3C
  • A2C
  • Natural Policy Gradient and Trust Region Policy Optimization (TRPO)
  • Proximal Policy Optimization (PPO)
  • Deterministic Policy Gradient (DPG)
  • Deep Deterministic Policy Gradient (DDPG)