A GitHub Resource with 10,000+ Stars: Reinforcement Learning Algorithm Implementations, Tutorial Code, and a Detailed Learning Plan

Latest update time:2021-08-31 18:51
Yuyang from Aofei Temple
Quantum Bit Report | Public Account QbitAI

Since the advent of reinforcement learning (RL), AI has learned to play StarCraft and to dominate Atari games, fascinating experts and amazing laymen alike.

Here is a reinforcement learning resource with over 10,000 stars on GitHub. It offers not only tutorial recommendations but also accompanying exercises. Readers who have worked through it speak well of it, and it is still being updated.

The prerequisites are modest: only some basic knowledge of mathematics and machine learning is required.

Clear learning path

If you want to get started with reinforcement learning, a high-quality course is essential.

There are thousands of reinforcement learning resources, and project author Denny Britz strongly recommends these two:

David Silver's Reinforcement Learning course:
http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching.html

and Richard Sutton and Andrew Barto's Reinforcement Learning: An Introduction (Second Edition):
http://incompleteideas.net/book/RLbook2018.pdf

P.S. In our own testing, both links could be opened without a VPN.

Denny Britz says that these two resources cover almost all the research papers you need to know to get started with reinforcement learning. A solid foundation determines how far you can go, so the theory still needs to be learned thoroughly.

The theory is there, but the textbook does not include algorithm implementations.

Don't worry: Denny Britz has personally implemented most of the standard reinforcement learning algorithms using Python, OpenAI Gym and TensorFlow, and shared them for everyone to use alongside the teaching materials.

That’s so thoughtful.

In this 10,000-star resource, each folder corresponds to one or more chapters of the textbook. In addition to exercises and solutions, each folder also contains a list of learning objectives, a summary of basic concepts, and related links.

Take the chapter "Model-Based Reinforcement Learning: Policy and Value Iteration Using Dynamic Programming" as an example.

This chapter accompanies the third lecture of David Silver's RL course, "Planning by Dynamic Programming".

First, the learning objectives:

  • Understand the difference between policy evaluation and policy improvement, and how these processes interact

  • Understand the Policy Iteration algorithm

  • Understand the Value Iteration algorithm

  • Understand the limitations of dynamic programming methods
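To make the first two objectives concrete, here is a minimal sketch of how policy evaluation and policy improvement alternate inside policy iteration. It is not taken from the repository: the four-state chain MDP, the function names, and the discount factor of 0.9 are all assumptions chosen for illustration. The transition table `P[s][a]`, a list of `(probability, next_state, reward, done)` tuples, is modeled on the layout used by OpenAI Gym's tabular environments.

```python
# Hypothetical 4-state chain MDP: state 3 is terminal, every move costs -1.
N_STATES, N_ACTIONS = 4, 2  # actions: 0 = left, 1 = right


def build_chain():
    """Return P[s][a] = [(probability, next_state, reward, done), ...]."""
    P = {}
    for s in range(N_STATES):
        P[s] = {}
        for a in range(N_ACTIONS):
            if s == N_STATES - 1:
                # Terminal state is absorbing and yields no further reward.
                P[s][a] = [(1.0, s, 0.0, True)]
            else:
                nxt = max(s - 1, 0) if a == 0 else s + 1
                P[s][a] = [(1.0, nxt, -1.0, nxt == N_STATES - 1)]
    return P


def q_values(P, V, s, gamma):
    """One-step lookahead: expected return of each action taken in state s."""
    return [sum(p * (r + gamma * V[s2]) for p, s2, r, _ in P[s][a])
            for a in range(N_ACTIONS)]


def policy_evaluation(P, policy, gamma=0.9, theta=1e-8):
    """Sweep the Bellman expectation backup until V changes by less than theta."""
    V = [0.0] * N_STATES
    while True:
        delta = 0.0
        for s in range(N_STATES):
            q = q_values(P, V, s, gamma)
            v = sum(policy[s][a] * q[a] for a in range(N_ACTIONS))
            delta = max(delta, abs(v - V[s]))
            V[s] = v
        if delta < theta:
            return V


def policy_improvement(P, V, gamma=0.9):
    """Act greedily with respect to V (ties go to the lower action index)."""
    policy = []
    for s in range(N_STATES):
        q = q_values(P, V, s, gamma)
        best = q.index(max(q))
        policy.append([1.0 if a == best else 0.0 for a in range(N_ACTIONS)])
    return policy


def policy_iteration(P, gamma=0.9):
    """Alternate evaluation and improvement until the policy stops changing."""
    policy = [[1.0 / N_ACTIONS] * N_ACTIONS for _ in range(N_STATES)]
    while True:
        V = policy_evaluation(P, policy, gamma)
        new_policy = policy_improvement(P, V, gamma)
        if new_policy == policy:
            return policy, V
        policy = new_policy


policy, V = policy_iteration(build_chain())
print([p.index(1.0) for p in policy], [round(v, 2) for v in V])
```

On this toy problem the loop settles on "move right" in every non-terminal state. The sketch also hints at the limitation named in the last objective: every sweep touches every state and needs the full transition model `P`, which is rarely available or affordable for large problems.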

Once the learning objectives are set, the tutorial also highlights the key concepts for you.
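As an illustration of one such concept, value iteration folds evaluation and improvement into a single backup that takes the maximum over actions. The sketch below is a hedged example rather than the repository's code: the chain MDP, `build_chain`, `value_iteration`, and the discount factor of 0.9 are hypothetical names and values introduced only for this example.

```python
def build_chain(n_states=4):
    """Hypothetical chain MDP: last state is terminal, every move costs -1.

    P[s][a] is a list of (probability, next_state, reward, done) tuples;
    action 0 moves left, action 1 moves right.
    """
    P = {}
    for s in range(n_states):
        P[s] = {}
        for a in (0, 1):
            if s == n_states - 1:
                P[s][a] = [(1.0, s, 0.0, True)]  # absorbing terminal state
            else:
                nxt = max(s - 1, 0) if a == 0 else s + 1
                P[s][a] = [(1.0, nxt, -1.0, nxt == n_states - 1)]
    return P


def value_iteration(P, gamma=0.9, theta=1e-8):
    """Apply the Bellman optimality backup until V changes by less than theta."""
    V = {s: 0.0 for s in P}

    def backup(s):
        # Expected return of each action in s under the current estimate V.
        return {a: sum(p * (r + gamma * V[s2]) for p, s2, r, _ in P[s][a])
                for a in P[s]}

    while True:
        delta = 0.0
        for s in P:
            best = max(backup(s).values())
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < theta:
            break

    # Read off a greedy policy from the converged values.
    policy = {}
    for s in P:
        q = backup(s)
        policy[s] = max(q, key=q.get)
    return V, policy


V, policy = value_iteration(build_chain())
print({s: round(v, 2) for s, v in V.items()}, policy)
```

Unlike policy iteration, no explicit policy is kept during the sweeps; the greedy policy is extracted only once at the end.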

Finally, there is a hands-on exercise.

The scaffolding is already in place, so you only need to focus on filling in the blanks.

A reference solution is attached at the end.

List of implemented algorithms

This tutorial now covers the following algorithm implementations.

  • Dynamic Programming Policy Evaluation

  • Dynamic Programming Policy Iteration

  • Dynamic Programming Value Iteration

  • Monte Carlo Prediction

  • Monte Carlo Control with Epsilon-Greedy Policies

  • Monte Carlo Off-Policy Control with Importance Sampling

  • SARSA (on-policy TD learning)

  • Q-learning (off-policy TD learning)

  • Q-Learning with Linear Function Approximation

  • Deep Q-Learning for Atari Games

  • Double Deep Q-Learning for Atari Games

  • Deep Q-Learning with Prioritized Experience Replay (under construction)

  • Policy Gradient: REINFORCE with Baseline

  • Policy Gradient: Actor-Critic with Baseline

  • Policy Gradient: Actor-Critic with Baseline for Continuous Action Spaces

  • Deterministic Policy Gradients for Continuous Action Spaces (under construction)

  • Deep Deterministic Policy Gradients (DDPG) (under construction)

  • Asynchronous Advantage Actor-Critic Algorithm (A3C)
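For readers who want a feel for the model-free entries in this list, here is a minimal sketch of tabular Q-learning with an epsilon-greedy behavior policy. It is written for this article and is not the repository's implementation: the four-state chain environment, the `step` function, and the hyperparameters (learning rate 0.5, discount 0.9, epsilon 0.1, 500 episodes) are all assumptions, and the sketch uses only the Python standard library rather than OpenAI Gym.

```python
import random

N_STATES = 4          # hypothetical chain: states 0..3, state 3 is terminal
LEFT, RIGHT = 0, 1


def step(state, action):
    """Hypothetical deterministic environment: every move costs -1."""
    nxt = max(state - 1, 0) if action == LEFT else state + 1
    return nxt, -1.0, nxt == N_STATES - 1


def epsilon_greedy(Q, state, epsilon, rng):
    """Explore with probability epsilon, otherwise pick the greedy action."""
    if rng.random() < epsilon:
        return rng.choice((LEFT, RIGHT))
    return max((LEFT, RIGHT), key=lambda a: Q[state][a])


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # safety cap on episode length
            action = epsilon_greedy(Q, state, epsilon, rng)
            nxt, reward, done = step(state, action)
            # Off-policy target: bootstrap from the best next action,
            # regardless of which action the behavior policy takes next.
            target = reward if done else reward + gamma * max(Q[nxt])
            Q[state][action] += alpha * (target - Q[state][action])
            state = nxt
            if done:
                break
    return Q


Q = q_learning()
print([max((LEFT, RIGHT), key=lambda a: Q[s][a]) for s in range(N_STATES - 1)])
```

The `max(Q[nxt])` in the target is what makes this off-policy. Replacing it with the value of the action the epsilon-greedy policy actually selects in `nxt` would give SARSA, the on-policy counterpart in the list.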

With such a clear learning path and such high-quality material, why not bookmark it?

Link:
https://github.com/dennybritz/reinforcement-learning
