Live Monitoring of Critic Predictions in the RL Toolbox
Is it possible to monitor the Q-value predictions of any critic-based RL approach in the RL Toolbox? For example, with a multi-output DQN agent, the internal deep neural network has to be evaluated at every step to score all possible discrete actions given the current state sample. Hence, somewhere internally there must be a Q-value prediction for every available discrete action, which are then compared in order to find the optimal action.
However, after spending some time with the 2020a documentation, I was not able to find a way to access these internal Q-value predictions at each time step. In particular, it would be nice if the Simulink agent block could expose these predictions for further processing and monitoring purposes during both the training and deployment phases.
Does anybody have a hint on how to retrieve the Q-value estimates during learning?
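For reference, one way to query the Q-values outside the agent block is to extract the critic representation from the agent and evaluate it directly. The sketch below assumes a trained `rlDQNAgent` object named `agent` and a made-up observation vector; `getCritic` and `getValue` are part of the R2020a Reinforcement Learning Toolbox API, but the exact observation shape depends on your environment.

```matlab
% Sketch (untested): query a DQN critic's Q-value predictions for one state.
% 'agent' is assumed to be an existing rlDQNAgent; the observation values
% below are placeholders for illustration only.
critic  = getCritic(agent);            % extract the rlQValueRepresentation
obs     = {[0.1; -0.2; 0.05; 0.3]};    % observation as a cell array, matching obsInfo
qValues = getValue(critic, obs);       % one Q-value per discrete action
[~, bestIdx] = max(qValues);           % index of the greedy action
```

Calling this inside a custom logging loop (or from a MATLAB Function block that has access to the critic) would let you record the per-action Q-values at each step, though it duplicates the forward pass the agent already performs internally.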
Answers (0)