Reinforcement Learning Zero Reward
I'm training multiple reinforcement learning agents using a Simulink model with a custom function (to simulate a card game).
I can compile and run the model in Simulink with no problems, and attaching a scope to the reward and isdone signals shows that they are set correctly (the reward is non-zero, and the isdone signal terminates the simulation at the correct time).
However, when I try to train the model, each agent shows zero reward.
Similar questions suggest that the cause may be the isdone flag being set incorrectly. However, I am confident that this is not the case: each step prints text to the command window (as desired), which suggests that the model is simulating correctly during training.
To recreate, run the 'CreateAgents' and 'CreateEnvironment' programs, open the 'WhistLearningVaribles.mat' file (containing the necessary variables for the simulation and the training options), run 'myResetFunction', and train using the command:
stats = train([Player1, Player2, Player3, Player4],env,trainOpts);
The other functions must be present in the same file structure, since the model references them during simulation.
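For comparison, a single episode can also be run outside of train to see what reward each agent actually logs. This is only a sketch: it assumes Player1 to Player4 and env already exist in the workspace from the scripts above, and the MaxSteps value is an arbitrary placeholder rather than a setting from my model.

```matlab
% Sketch: run one episode outside of train and inspect the logged rewards.
% Assumes Player1..Player4 and env already exist in the workspace.
agents  = [Player1, Player2, Player3, Player4];
simOpts = rlSimulationOptions('MaxSteps', 100);   % 100 is an arbitrary placeholder

experience = sim(env, agents, simOpts);           % one experience struct per agent

for k = 1:numel(experience)
    r = experience(k).Reward.Data(:);             % reward logged at each step
    fprintf('Agent %d: %d steps, total reward %g\n', k, numel(r), sum(r));
end
```

If the totals printed here are non-zero while train still reports zero, that would suggest the discrepancy lies in the training setup rather than in the model itself.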
Any advice would be much appreciated. Thanks!