Is it possible to change some parameters during RL training?

64 views (last 30 days)
Hello everyone,
My idea is to change, during training, some parameters such as the number of allowed interventions of the external controller as well as their duration. Another wish is to change these parameters according to the agent's progress, e.g. as its episode reward increases during training. Is it possible to do this while training is running? That is, is it possible to implement some for loops that exploit the trainStats object while the training is running? Thank you

Accepted Answer

Subhajyoti on 2024-8-5, 12:11
Hi Leonardo,
Yes, parameters can be adjusted dynamically during training by using a custom training loop with the Reinforcement Learning Toolbox, or by structuring training so that you can intervene between training phases.
You can implement a custom training loop by defining your environment and agent, then loop through a predefined number of episodes. Within each episode, interact with the environment by selecting actions, applying them, and updating the agent based on the received rewards and observations. Log the total reward for each episode to monitor performance.
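As a rough sketch of the interaction part of such a loop, the code below uses a placeholder predefined environment and a default DQN agent that you would replace with your own. The agent update step is intentionally omitted, since updating a built-in agent inside a fully manual loop is more involved; a simpler chunked-training alternative based on the built-in train function is sketched further below.
% Minimal sketch of the episode-interaction part of a custom loop.
% Environment and agent are placeholders; the learning update itself is
% omitted (see the chunked-training sketch further below).
env = rlPredefinedEnv("CartPole-Discrete");   % replace with your environment
obsInfo = getObservationInfo(env);
actInfo = getActionInfo(env);
agent = rlDQNAgent(obsInfo, actInfo);         % default agent as a stand-in

numEpisodes   = 200;
episodeReward = zeros(numEpisodes,1);

for ep = 1:numEpisodes
    obs    = reset(env);
    isDone = false;
    while ~isDone
        action = getAction(agent, {obs});              % returns a cell array of actions
        [obs, reward, isDone] = step(env, action{1});  % apply the action to the environment
        episodeReward(ep) = episodeReward(ep) + reward;
    end
    % episodeReward(ep) can now be inspected to decide whether to change
    % environment parameters before the next episode starts.
end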
Monitor metrics like total reward per episode and adjust parameters such as the number of allowed interventions and their duration if specific criteria are met (e.g., every 50 episodes or when the average reward exceeds a threshold).
Use the ‘trainStats’ output to log and monitor performance metrics during training. Store metrics such as the total reward for each episode, which can then be used to calculate statistics for dynamic parameter adjustment.
By combining a custom training loop, dynamic parameter adjustment, and the ‘trainStats’ object, an adaptive training process can be created that enhances the learning efficiency of the RL agent.
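If a fully manual loop is more than you need, one alternative is to call train repeatedly with a small episode budget and adjust parameters between calls based on the returned statistics. The sketch below assumes your environment object exposes tunable properties named MaxInterventions and InterventionDuration (hypothetical names, for illustration only) and uses an example reward threshold of 100 and chunks of 50 episodes.
% Sketch: train in short chunks and use the returned statistics to adjust
% (assumed) environment parameters between chunks. Property names and the
% reward threshold are placeholders for your own setup.
trainOpts = rlTrainingOptions( ...
    'MaxEpisodes', 50, ...                     % episodes per chunk
    'StopTrainingCriteria', 'EpisodeCount', ...
    'StopTrainingValue', 50, ...
    'Verbose', false, 'Plots', 'none');

for chunk = 1:10
    trainStats = train(agent, env, trainOpts);     % the agent keeps learning across calls

    recentReward = trainStats.EpisodeReward(max(1,end-9):end);
    if mean(recentReward) > 100                    % example improvement criterion
        env.MaxInterventions     = max(env.MaxInterventions - 1, 0);  % hypothetical property
        env.InterventionDuration = 0.9*env.InterventionDuration;      % hypothetical property
    end
end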
You may go through the MathWorks documentation on training options for RL agents and the ‘trainStats’ output to know more.
I hope this helps.

More Answers (1)

Yash Sharma on 2024-8-5, 9:15
Hi Leonardo,
If you want to change some parameters while training is running, you will need to implement a custom training loop. The MathWorks documentation on custom training loops can help you achieve this.

