Q-Learning is the most basic form of Reinforcement Learning, which doesn’t take advantage of any neural network but instead uses Q-table to find the best possible action to take at a given state.

Background information

  1. Environment

A CartPole-v0 is a simple playground provided by OpenAI to train and test Reinforcement Learning algorithms. The agent is the cart, which is controlled by two possible actions +1, -1 pointing on moving left or right. The reward +1 is given at every timestep if the pole remains upright. The goal is to prevent the pole from falling over(maximize total reward) as in GIF above. After…


In the first tutorial, I introduced the most basic Reinforcement learning method called Q-learning to solve the CartPole problem. Because of its computational limitations, it is working in simple environments, where the number of states and possible actions is relatively small. Calculating, storing, and updating Q-values for each action in the more complex environment is either impossible or highly inefficient. This is where the Deep Q-Network comes into play.

Background Information

The Deep Q-Learning has been introduced in 2013 in Playing Atari with Deep Reinforcement Learning paper by the DeepMind team. The first similar approach was made in 1992 using TD-gammon. The…

Maciej Balawejder

Free spirited artificial intelligence enthusiast https://github.com/maciejbalawejder

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store