Understanding Algorithms for Reinforcement Learning
Reinforcement learning (RL) is a type of machine learning in which a decision maker, called an agent, learns to operate in an unknown environment. With applications such as self-driving cars and exploration robots, RL is an important field of study for any student of machine learning.
What you'll learn
Traditional machine learning algorithms are used for prediction and classification. Reinforcement learning is about training agents to make decisions that maximize cumulative rewards. In this course, Understanding Algorithms for Reinforcement Learning, you'll learn the basic principles of reinforcement learning algorithms, the RL taxonomy, and specific policy search techniques such as Q-learning and SARSA. First, you'll discover the objective of reinforcement learning: to find an optimal policy that allows agents to make the right decisions to maximize long-term rewards. You'll study how to model the environment so that RL algorithms are computationally tractable. Next, you'll explore dynamic programming, an important technique that caches intermediate results to simplify the computation of complex problems. You'll understand and implement policy search techniques such as Q-learning, which uses the temporal difference method, and SARSA, both of which help converge on an optimal policy. Finally, you'll explore reinforcement learning platforms that allow study, prototyping, and development of policies, and you'll work with both Q-learning and SARSA on OpenAI Gym. By the end of this course, you should have a solid understanding of reinforcement learning techniques, including Q-learning and SARSA, and be able to implement basic RL algorithms.
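To give a feel for the two techniques named above, here is a minimal sketch of one tabular update step for each. It is not taken from the course demos: the function names, the dictionary-based Q-table, the state and action labels, and the values of alpha (learning rate) and gamma (discount factor) are all assumptions chosen for illustration. The only difference between the two updates is which next-state value they bootstrap from.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.5, gamma=0.9):
    """Q-learning (off-policy): bootstrap from the best action in next_state."""
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])


def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.5, gamma=0.9):
    """SARSA (on-policy): bootstrap from the action the agent actually takes next."""
    td_target = reward + gamma * q[(next_state, next_action)]
    q[(state, action)] += alpha * (td_target - q[(state, action)])


# Hypothetical two-action problem; the Q-values for state "s1" are made up.
actions = ["left", "right"]
q_start = defaultdict(float)
q_start[("s1", "left")] = 1.0
q_start[("s1", "right")] = 2.0

# Same transition (s0, right, reward 0, s1) fed to both update rules.
q_a = defaultdict(float, q_start)
q_learning_update(q_a, "s0", "right", 0.0, "s1", actions)

q_b = defaultdict(float, q_start)
sarsa_update(q_b, "s0", "right", 0.0, "s1", next_action="left")

print(q_a[("s0", "right")])  # uses max over s1: 0.5 * (0.9 * 2.0)
print(q_b[("s0", "right")])  # uses the chosen action "left": 0.5 * (0.9 * 1.0)
```

With these made-up numbers, Q-learning assumes the best action will be taken in s1, while SARSA uses the value of the action the agent actually chose there, so the two methods assign different values to the same experience.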
Table of contents
- Version Check 0m
- Module Overview 2m
- Prerequisites and Course Overview 2m
- Supervised and Unsupervised Machine Learning Techniques 5m
- Introducing Reinforcement Learning 6m
- Reinforcement Learning vs. Supervised and Unsupervised Learning 2m
- Modeling the Environment as a Markov Decision Process 7m
- Reinforcement Learning Applications 3m
- Understanding Policy Search 7m
- Policy Search Algorithms 4m
- Module Overview 1m
- Dynamic Programming 3m
- Demo: 8-Queens Algorithm Using Dynamic Programming, Helper Functions 7m
- Demo: 8-Queens Algorithm Using Dynamic Programming, Place Queens 5m
- Policy Search Techniques: Q-learning and SARSA 2m
- Intuition Behind Q-learning 9m
- Q-learning Using the Temporal Difference Method and SARSA 6m
- Exploring State Space 7m
- Demo: Q-learning for Shortest Path: Initialization 5m
- Demo: Q-learning for Shortest Path: Implementation 5m
- Intuitive Differences Between the Temporal Difference Method and SARSA 4m
- Q-values as a Memoization Technique 3m
- Module Overview 1m
- Exploring Reinforcement Learning Platforms 4m
- Exploring Environments in the OpenAI Gym 3m
- Demo: Q-learning Using SARSA in the Frozen Lake Environment 7m
- Demo: Q-learning to Balance a Pole on a Cart 6m
- Demo: Q-learning to Balance a Pole on a Cart Simulation 7m
- Summary and Further Study 2m