Pong AI
Updated 2 years ago
A Practical Example of Unity ML Agents Usage in Pong Game

## Motivation

I was short on time, and training agents on a computer with weak processing power takes a lot of it, so I wanted to work on a small but practical case of Unity ML Agents usage in a game. I decided to create a Pong game that uses the outcome of the training as the rival player, providing a computer-versus-player experience.
The code repository and the playable game are linked at the end of this article.

## Training and Results

First, for simplicity and as a proof of concept, the agent (the paddle) collected only the normalized y-axis values of the paddle and the ball. In each training session, the ball is thrown randomly towards the right side of the scene. If it hits the paddle, the session ends with reward = 1; if it goes out of the play area, the session ends with reward = -1. It is that simple. The paddle had 3 actions to choose from: go up, go down, or stay still. This approach worked quickly, after a training run of about 15e4 frames.
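The first setup described above can be sketched in a few lines of Python. This is only an illustration, not the project's Unity code: the names `normalize`, `observe`, `apply_action`, and `episode_reward`, as well as the `half_height` and `speed` parameters, are assumptions made for the example.

```python
# Discrete actions available to the paddle, as described in the text.
STAY, UP, DOWN = 0, 1, 2


def normalize(value, half_extent):
    """Map a coordinate in [-half_extent, half_extent] to [-1, 1], clamped."""
    return max(-1.0, min(1.0, value / half_extent))


def observe(paddle_y, ball_y, half_height):
    """The two observations of the first experiment: normalized y values only."""
    return [normalize(paddle_y, half_height), normalize(ball_y, half_height)]


def apply_action(paddle_y, action, speed, half_height):
    """Move the paddle one step (hypothetical speed) and keep it in the play area."""
    if action == UP:
        paddle_y += speed
    elif action == DOWN:
        paddle_y -= speed
    return max(-half_height, min(half_height, paddle_y))


def episode_reward(ball_hit_paddle):
    """A session ends with +1 on a hit and -1 when the ball leaves the area."""
    return 1.0 if ball_hit_paddle else -1.0
```

With only two observations the agent has no way to tell where the ball is heading, which is consistent with the "follow the ball everywhere" behaviour described next.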
One downside of this simple approach was that the paddle followed the ball wherever it went, which made the opponent look and feel scripted and unbeatable. Additionally, the paddle was really shaky, even after a training run of 1e6 frames.
Before beginning the new training runs, I started using curriculum learning to reach better results in a shorter time. The only variable I changed was the size of the paddle; the curriculum had 3 lessons, which scaled the paddle down from 10 units to 4 units.
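A curriculum like this can be thought of as a schedule of paddle sizes plus a rule for moving to the next lesson. The sketch below is a minimal Python illustration under stated assumptions: the article only gives the first and last sizes (10 and 4 units), so the linear spacing of the lessons, the reward threshold, and the function names are all hypothetical.

```python
def paddle_size_for_lesson(lesson, num_lessons=3, start=10.0, end=4.0):
    """Linearly interpolate the paddle size from start to end across lessons.

    The linear spacing is an assumption; only the end points come from the text.
    """
    if not 0 <= lesson < num_lessons:
        raise ValueError("lesson index out of range")
    if num_lessons == 1:
        return end
    fraction = lesson / (num_lessons - 1)
    return start + (end - start) * fraction


def next_lesson(lesson, mean_reward, threshold, num_lessons=3):
    """Advance by one lesson once the mean reward reaches a (hypothetical) threshold."""
    if mean_reward >= threshold and lesson < num_lessons - 1:
        return lesson + 1
    return lesson
```

Under these assumptions the three lessons use paddle sizes of 10, 7, and 4 units, and the agent only meets the smaller paddle after doing well with the larger one.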
Several steps had to be taken to reduce these problems. First, so that the agent could figure out whether the ball is moving towards it or towards the other side of the scene, the direction of the ball was added to the collected states. Second, to let the agent know how close the ball is and change its speed accordingly, the x-axis value of the ball was also added to the collected states. Lastly, to solve the shakiness problem and to complement the first two steps, staying still was encouraged with a small reward of 0.005.
These assumptions were purely hypothetical, yet they proved to work to some extent. The first thing I had to change was the reward for staying still, which I decreased from 0.005 to 0.001, because the larger value taught the agent that not moving at all was more profitable than moving. Then I tried different combinations of the 3 newly added states (ball direction x, ball direction y, and ball position x); the best result came from using all 3 of them. The paddle started to reduce its speed as the ball moved away, and the shakiness was greatly reduced.
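A rough back-of-the-envelope calculation shows why the size of the stay-still reward matters. The sketch below assumes a hypothetical episode length of 300 steps, which is not a figure from the project, and compares the total an idle paddle could collect with the +1 it gets for a hit.

```python
HIT_REWARD = 1.0
STEPS_PER_EPISODE = 300  # hypothetical episode length, not taken from the project


def idle_return(still_reward, steps):
    """Total reward collected by a paddle that stays still on every step."""
    return still_reward * steps


for still_reward in (0.005, 0.001):
    total = idle_return(still_reward, STEPS_PER_EPISODE)
    verdict = "exceeds" if total > HIT_REWARD else "stays below"
    print(f"still reward {still_reward}: idle total {total:.2f} {verdict} the hit reward")
```

Under this assumption, 0.005 per step adds up to about 1.5 over the episode, more than the reward for actually hitting the ball, while 0.001 adds up to about 0.3, so hitting the ball remains the better deal.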
I expected that longer training might solve the problem of decreasing accuracy, so I went for a training run of 1e7 frames. It showed that if the model is flawed, longer training does not make the agent do better. So I decided to examine Unity's own ML examples some more to improve the outcome.
One thing I noticed was that, in the examples, the variables changed by the curriculum were also saved as states. I added one more variable to the curriculum, ball speed, and added both ball speed and paddle scale to the states. After this, there were 7 states in total. I also increased the number of lessons and reduced the change in values per lesson, which helped me avoid sudden drops in the cumulative reward and let the agent adapt gradually. This new model was trained for 2e6 frames and produced the best outcome of all the training runs so far.
Finally, this is how I finished the project:
• 1 brain
• 2e6 frames long training with original hyperparameter values on PPO
• 7 states: paddle size [0, 1], ball speed [0, 1], ball direction x [-1, 1], ball direction y [-1, 1], paddle position y [-1, 1], ball position x [-1, 1], ball position y [-1, 1]
• 1 reward if the ball hits the paddle, -1 reward if the ball goes out of the area, 0.001 reward if the paddle chooses to keep still
• 6 lessons in the curriculum, 2 changing variables: paddle scale and ball speed
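The final state vector and reward scheme listed above can be written down as a small Python sketch. It is an illustration rather than the project's Unity code: the `PongState` class, the function names, and the choice to clamp values that are assumed to be already scaled are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PongState:
    """Values assumed to be already scaled to the ranges given in the list."""
    paddle_size: float  # [0, 1]
    ball_speed: float   # [0, 1]
    ball_dir_x: float   # [-1, 1]
    ball_dir_y: float   # [-1, 1]
    paddle_y: float     # [-1, 1]
    ball_x: float       # [-1, 1]
    ball_y: float       # [-1, 1]


def clamp(value, low, high):
    """Keep a value inside [low, high]."""
    return max(low, min(high, value))


def collect_observations(state):
    """Return the 7 states in the order used in the list above."""
    return [
        clamp(state.paddle_size, 0.0, 1.0),
        clamp(state.ball_speed, 0.0, 1.0),
        clamp(state.ball_dir_x, -1.0, 1.0),
        clamp(state.ball_dir_y, -1.0, 1.0),
        clamp(state.paddle_y, -1.0, 1.0),
        clamp(state.ball_x, -1.0, 1.0),
        clamp(state.ball_y, -1.0, 1.0),
    ]


def step_reward(hit_paddle, out_of_area, stayed_still, still_reward=0.001):
    """+1 for a hit, -1 when the ball leaves the area, a small bonus for staying still."""
    if hit_paddle:
        return 1.0
    if out_of_area:
        return -1.0
    return still_reward if stayed_still else 0.0
```

Keeping every observation in a fixed, small range is a common choice for neural network inputs, and it matches the ranges given for each state in the list.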

## Gameplay

To turn on the play mode of the game, simply set the isTraining flag to true and press play. You can use W and S to move up and down, or alternatively the up and down arrow keys. The AI stays on the left side of the scene.
You can find further instructions on how to play the game in the project repository.

• itch.io: https://sarge.itch.io/pong-ai
• GitHub: https://github.com/srcnalt/Pong-AI
Sercan Altundas
MSc Software Engineer - Student
Tufan Aydin
9 months ago
Sercan Altundas
2 years ago
MSc Software Engineer
I am glad it will be useful for you. I am hoping to create a small series of instruction videos in both English and Turkish languages, to help beginners have a quick start with Unity ML Agents by examples. Machine Learning itself is a big concept, yet this tool can help many students easily understand the fundamentals without needing to program their own Neural Networks.
Müjdehan
2 years ago
This example will be useful for my students to understand machine learning and artificial intelligence. So thank you for your work. I will share it. Good luck.
Sercan Altundas
2 years ago
MSc Software Engineer
Thank you very much :0)
Müjdehan
2 years ago
We came to show our support.