Another Happy Landing
Updated 10 months ago

Set-up:

A physics-based movement task in which the agent uses its engines to reach the destination (without falling over :))
Goal: Reach the destination using the engines (based on Unity Physics)
Agents: One agent (RocketAgent) connected to a single brain

Reward Function:

  • +0.2 when the agent gets closer to the destination
  • -0.2 when the agent gets further from the destination
  • +1.0*(1-DistanceFactor()) when the agent is stable on the ground
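
The three rules above can be sketched in a few lines of Python. This is only an illustration, not the project's actual Unity code. The article does not define DistanceFactor(), so the sketch assumes it is the distance to the destination normalised to the range 0 to 1. The names distance_factor, step_reward and max_distance are hypothetical, and letting the landing bonus add on top of the approach reward in the same step is also an assumption.

```python
APPROACH_REWARD = 0.2        # agent got closer to the destination
RETREAT_PENALTY = -0.2       # agent got further from the destination
LANDING_REWARD_SCALE = 1.0   # multiplier for the landing bonus


def distance_factor(distance: float, max_distance: float) -> float:
    """Assumed stand-in for DistanceFactor(): distance normalised to [0, 1]."""
    if max_distance <= 0:
        raise ValueError("max_distance must be positive")
    return min(max(distance / max_distance, 0.0), 1.0)


def step_reward(prev_distance: float, distance: float,
                max_distance: float, stable_on_ground: bool) -> float:
    """Reward for one step, following the three rules listed in the article."""
    reward = 0.0
    if distance < prev_distance:
        reward += APPROACH_REWARD
    elif distance > prev_distance:
        reward += RETREAT_PENALTY
    if stable_on_ground:
        # The closer to the destination the rocket rests, the larger the bonus.
        reward += LANDING_REWARD_SCALE * (1.0 - distance_factor(distance, max_distance))
    return reward
```

With this shaping, a rocket that rests exactly on the target earns the full landing bonus, while one that settles far away earns almost nothing for being stable.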

Training:

After a few training sessions I decided to increase the hidden_units parameter to 256 and max_steps to 1e6, both up from their defaults (I guess physics-based scenarios are generally complex ones).

Here are some training logs:

  • First 50000 steps:
    Step: 10000. Mean Reward: -65.06871912450517. Std of Reward: 27.93401587399242.
    Step: 20000. Mean Reward: -56.05377736508384. Std of Reward: 27.7518681752101.
    Step: 30000. Mean Reward: -53.98639004683636. Std of Reward: 29.390447105359623.
    Step: 40000. Mean Reward: -40.138117082683685. Std of Reward: 27.213910876560146.
    Step: 50000. Mean Reward: -33.191751570866465. Std of Reward: 29.1101959372431.
  • Last 50000 steps:
    Step: 960000. Mean Reward: 19.008488875254894. Std of Reward: 39.820911471603004.
    Step: 970000. Mean Reward: 26.63405592301969. Std of Reward: 39.30830012058275.
    Step: 980000. Mean Reward: 27.002674998157566. Std of Reward: 46.23449917840254.
    Step: 990000. Mean Reward: 35.71054615465551. Std of Reward: 44.22499510824181.
    Step: 1000000. Mean Reward: 26.03905608469843. Std of Reward: 45.53669765642713.
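
The log entries above follow a fixed "Step / Mean Reward / Std of Reward" pattern, so progress can be compared with a small script. The sketch below assumes exactly the text format shown above. LogEntry, parse_training_log and mean_reward_change are hypothetical helper names, not part of any training tool.

```python
import re
from typing import List, NamedTuple


class LogEntry(NamedTuple):
    step: int
    mean_reward: float
    std_reward: float


# Matches one entry such as:
# "Step: 10000. Mean Reward: -65.06. Std of Reward: 27.93."
LOG_PATTERN = re.compile(
    r"Step: (\d+)\. Mean Reward: (-?\d+(?:\.\d+)?)\. Std of Reward: (\d+(?:\.\d+)?)\."
)


def parse_training_log(text: str) -> List[LogEntry]:
    """Extract every log entry from the text, whether entries share a line or not."""
    return [
        LogEntry(int(step), float(mean), float(std))
        for step, mean, std in LOG_PATTERN.findall(text)
    ]


def mean_reward_change(entries: List[LogEntry]) -> float:
    """Difference in mean reward between the last and the first entry."""
    if not entries:
        raise ValueError("no log entries found")
    return entries[-1].mean_reward - entries[0].mean_reward


if __name__ == "__main__":
    sample = (
        "Step: 10000. Mean Reward: -65.06871912450517. Std of Reward: 27.93401587399242. "
        "Step: 1000000. Mean Reward: 26.03905608469843. Std of Reward: 45.53669765642713."
    )
    entries = parse_training_log(sample)
    print(f"Mean reward change: {mean_reward_change(entries):.2f}")
```

Comparing the first and last entries shows the mean reward rising from about -65 to about 26 over the run, while the standard deviation also grows, which fits the "make the model more stable" item in the future improvements.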

Future improvements:

  • Make the model more stable
  • Increase complexity (e.g. landing on a floating barge ;))
  • Add curriculum learning
Thanks to Tomasz Słoma for graphics :)

Mateusz Kaleciński
Unity3d Developer - Programmer
Comments
Abhimanyu Aryan
a year ago
The Virtual Guy - Owner
sell this solution to Tesla :p
Unity3D Developer
perfect landing :D