This series provides an overview of reinforcement learning, a type of machine learning that has the potential to solve some control system problems that are too difficult to solve with traditional techniques.
We’ll cover the basics of the reinforcement learning problem and how it differs from traditional control techniques. We’ll show why neural networks are used to represent unknown functions and how the agent uses rewards from the environment to train them.
By the end of this series, you’ll be better prepared to answer questions like:
- What is reinforcement learning and why should I consider it when solving my control problem?
- How do I set up and solve the reinforcement learning problem?
- What are some of the benefits and drawbacks of reinforcement learning compared to a traditional controls approach?
Part 1: What Is Reinforcement Learning? Get an overview of reinforcement learning from the perspective of an engineer. Reinforcement learning is a type of machine learning that has the potential to solve some really hard control problems.
Part 2: Understanding the Environment and Rewards In this video, we build on our basic understanding of reinforcement learning by exploring the workflow. What is the environment? How do reward functions incentivize an agent? How are policies structured?
Part 3: Policies and Learning Algorithms This video provides an introduction to the algorithms that reside within the agent. We’ll cover why we use neural networks to represent functions and why you may have to set up two neural networks in a powerful family of methods called actor-critic.
Part 4: The Walking Robot Problem This video shows how to use the reinforcement learning workflow to get a bipedal robot to walk, and how we can set up the RL problem to look more like a traditional control problem by adding a reference signal to the design.
Part 5: Overcoming the Practical Challenges of Reinforcement Learning Using reinforcement learning in production systems raises a few challenges, and there are ways to mitigate them. This video covers the difficulties of verifying the learned solution and what you can do about it.