A 10,000-star GitHub resource: reinforcement learning algorithm implementations, tutorial code, and a detailed learning plan

Last updated: 2021-08-31 18:51

Yuyang, reporting from Aofei Temple
QbitAI Report | Official Account: QbitAI

Since the advent of reinforcement learning (RL), AI has learned to play StarCraft and to dominate Atari games, fascinating experts and amazing laypeople alike.

Here is a reinforcement learning resource with over 10,000 stars on GitHub. It offers not only tutorial recommendations but also supporting exercises. Readers who have worked through it speak well of it, and it is still being updated.

The barrier to entry is low: all you need is some basic knowledge of mathematics and machine learning.

Clear learning path

If you want to get started with reinforcement learning, a high-quality course is essential.

There are thousands of reinforcement learning resources, and project author Denny Britz strongly recommends these two:

David Silver's Reinforcement Learning course:
http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching.html

and Richard Sutton and Andrew Barto’s Reinforcement Learning: An Introduction (Second Edition):
http://incompleteideas.net/book/RLbook2018.pdf

P.S. We tested both links, and they open without a VPN.

Denny Britz says that these two resources cover almost all the research papers you need to know to get started with reinforcement learning. A solid foundation determines how far you can go, so the theory is worth learning properly.

That takes care of the theory, but the book contains no algorithm implementations.

Don’t worry: Denny Britz has personally implemented most of the standard reinforcement learning algorithms using Python, OpenAI Gym, and TensorFlow, and has shared them for everyone to use alongside the teaching materials.

That’s so thoughtful.

In this 10,000-star resource, each folder corresponds to one or more chapters of the textbook. In addition to exercises and solutions, each folder also contains a list of learning objectives, a summary of basic concepts, and related links.

Take the chapter Model-Based Reinforcement Learning: Policy and Value Iteration Using Dynamic Programming as an example.

This chapter accompanies the third lecture of David Silver's RL course, Planning by Dynamic Programming.

First, the learning objectives:

Understand the difference between policy evaluation and policy improvement, and how these processes interact
Understand the Policy Iteration algorithm
Understand the Value Iteration algorithm
Understand the limitations of dynamic programming methods
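
To make these objectives more concrete, here is a minimal sketch in plain Python. It is not taken from Denny Britz's repository: the four-state corridor MDP, the function names, and the discount factor of 0.9 are all assumptions made for illustration. The transition table P[s][a], a list of (probability, next_state, reward, done) tuples, is likewise an assumed layout, chosen to resemble the one used by tabular Gym environments. Policy iteration alternates a full policy evaluation with a greedy policy improvement step, while value iteration folds the two into a single max-based update.

```python
# Hypothetical toy MDP (not from the repository): a 4-state corridor.
# State 3 is the terminal goal; every move before reaching it costs -1.
N_STATES, N_ACTIONS = 4, 2  # actions: 0 = left, 1 = right
GOAL = N_STATES - 1


def build_corridor():
    """Return P, where P[s][a] = [(probability, next_state, reward, done)]."""
    P = {}
    for s in range(N_STATES):
        P[s] = {}
        for a in range(N_ACTIONS):
            if s == GOAL:
                P[s][a] = [(1.0, s, 0.0, True)]  # absorbing, no reward
            else:
                nxt = max(s - 1, 0) if a == 0 else s + 1
                P[s][a] = [(1.0, nxt, -1.0, nxt == GOAL)]
    return P


def q_values(P, V, s, gamma):
    """One-step lookahead: expected return of each action in state s."""
    return [sum(p * (r + gamma * V[s2]) for p, s2, r, _ in P[s][a])
            for a in range(N_ACTIONS)]


def policy_evaluation(P, policy, gamma=0.9, theta=1e-8):
    """Sweep the states until V for the fixed policy stops changing."""
    V = [0.0] * N_STATES
    while True:
        delta = 0.0
        for s in range(N_STATES):
            v = q_values(P, V, s, gamma)[policy[s]]
            delta = max(delta, abs(v - V[s]))
            V[s] = v
        if delta < theta:
            return V


def policy_iteration(P, gamma=0.9):
    """Alternate evaluation and greedy improvement until the policy is stable."""
    policy = [0] * N_STATES  # deliberately poor start: always go left
    while True:
        V = policy_evaluation(P, policy, gamma)
        stable = True
        for s in range(N_STATES):
            q = q_values(P, V, s, gamma)
            best = q.index(max(q))
            # Switch only on a clear improvement, so ties cannot cause cycling.
            if q[best] > q[policy[s]] + 1e-9:
                policy[s] = best
                stable = False
        if stable:
            return policy, V


def value_iteration(P, gamma=0.9, theta=1e-8):
    """Apply the Bellman optimality backup, then read off a greedy policy."""
    V = [0.0] * N_STATES
    while True:
        delta = 0.0
        for s in range(N_STATES):
            v = max(q_values(P, V, s, gamma))
            delta = max(delta, abs(v - V[s]))
            V[s] = v
        if delta < theta:
            break
    policy = []
    for s in range(N_STATES):
        q = q_values(P, V, s, gamma)
        policy.append(q.index(max(q)))
    return policy, V


if __name__ == "__main__":
    P = build_corridor()
    print("policy iteration:", policy_iteration(P))
    print("value iteration: ", value_iteration(P))
```

On this toy problem both methods settle on "go right" in every non-terminal state. The sketch also hints at the last objective: every sweep visits all states and needs the full transition table P, which is what makes dynamic programming impractical for large or unknown environments.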

Once the learning objectives are set, the tutorial also highlights the key concepts for you.

Finally, here is a practical exercise.

The overall framework is already in place, so you only need to focus on filling in the blanks:

The reference solution is attached at the end of the article: