I created a grid of Perlin noise values and rendered it to the screen as a texture, with several parameters for finer control over the noise. Next, I generated a 12x12 map mesh.
My aim was to train the agent to roll the ball into the red target by changing the map mesh parameters.
At first I let the agent change two parameters, noise scale and lacunarity, but during training it always set them to the same values and didn't change them at all.
Then I edited the reward and added two more parameters for controlling the map mesh: movement of the map mesh (noise) along two axes. With these, the agent learned how to shake the duster ;)
In the end I gave the agent back the ability to change the two original parameters, noise scale and lacunarity. I also changed a lot of the PPO hyperparameters. That was the 19th build, and it was the most successful one, even with max_steps = 200000. Interesting fact: when I changed a few hyperparameters and set max_steps = 400000, the result was worse again.
Then I also tried to improve the agent's code, but it went back to shaking the duster, and unfortunately my map generation was written in such a way that I couldn't create several agents... :(