How should I fix the error 'Too many output arguments'?

I made the code below using the Reinforcement Learning Toolbox.
train_agent
Error: reset
There are too many output arguments.
Error: rl.env.MATLABEnvironment/simLoop (line 235)
    observation = reset(env);
Error: rl.env.MATLABEnvironment/simWithPolicyImpl (line 106)
    [expcell{simCount}, epinfo, siminfos{simCount}] = simLoop(env, policy, opts, simCount, usePCT);
Error: rl.env.AbstractEnv/simWithPolicy (line 83)
    [experiences, varargout{1:(nargout-1)}] = simWithPolicyImpl(this, policy, opts, varargin{:});
Error: rl.task.SeriesTrainTask/runImpl (line 33)
    [varargout{1}, varargout{2}] = simWithPolicy(this.Env, this.Agent, simOpts);
Error: rl.task.Task/run (line 21)
    [varargout{1:nargout}] = runImpl(this);
Error: rl.task.TaskSpec/internal_run (line 166)
    [varargout{1:nargout}] = run(task);
Error: rl.task.TaskSpec/runDirect (line 170)
    [this.Outputs{1:getNumOutputs(this)}] = internal_run(this);
Error: rl.task.TaskSpec/runScalarTask (line 194)
    runDirect(this);
Error: rl.task.TaskSpec/run (line 69)
    runScalarTask(task);
Error: rl.train.SeriesTrainer/run (line 24)
    run(seriestaskspec);
Error: rl.train.TrainingManager/train (line 424)
    run(trainer);
Error: rl.train.TrainingManager/run (line 215)
    train(this);
Error: rl.agent.AbstractAgent/train (line 77)
    TrainingStatistics = run(trainMgr);
Error: train_agent (line 90)
    trainingStats = train(agent, env, trainingOptions);
But the error above occurred.
How should I fix it? Also, please tell me how to check a function's output arguments and their number.
% Train a DDPG agent
% Set up the environment
env = Environment;
obsInfo = env.getObservationInfo;
actInfo = env.getActionInfo;
numObs = obsInfo.Dimension(1); % 2
numAct = numel(actInfo);       % 1
% CRITIC
statePath = [
    featureInputLayer(numObs,'Normalization','none','Name','observation')
    fullyConnectedLayer(128,'Name','CriticStateFC1')
    reluLayer('Name','CriticRelu1')
    fullyConnectedLayer(200,'Name','CriticStateFC2')];
actionPath = [
    featureInputLayer(numAct,'Normalization','none','Name','action')
    fullyConnectedLayer(200,'Name','CriticActionFC1','BiasLearnRateFactor',0)];
commonPath = [
    additionLayer(2,'Name','add')
    reluLayer('Name','CriticCommonRelu')
    fullyConnectedLayer(1,'Name','CriticOutput')];
criticNetwork = layerGraph(statePath);
criticNetwork = addLayers(criticNetwork,actionPath);
criticNetwork = addLayers(criticNetwork,commonPath);
criticNetwork = connectLayers(criticNetwork,'CriticStateFC2','add/in1');
criticNetwork = connectLayers(criticNetwork,'CriticActionFC1','add/in2');
criticOptions = rlRepresentationOptions('LearnRate',1e-03,'GradientThreshold',1);
critic = rlQValueRepresentation(criticNetwork,obsInfo,actInfo, ...
    'Observation',{'observation'},'Action',{'action'},criticOptions);
% ACTOR
actorNetwork = [
    featureInputLayer(numObs,'Normalization','none','Name','observation')
    fullyConnectedLayer(128,'Name','ActorFC1')
    reluLayer('Name','ActorRelu1')
    fullyConnectedLayer(200,'Name','ActorFC2')
    reluLayer('Name','ActorRelu2')
    fullyConnectedLayer(1,'Name','ActorFC3')
    tanhLayer('Name','ActorTanh1')
    scalingLayer('Name','ActorScaling','Scale',max(actInfo.UpperLimit))];
actorOptions = rlRepresentationOptions('LearnRate',5e-04,'GradientThreshold',1);
actor = rlDeterministicActorRepresentation(actorNetwork,obsInfo,actInfo, ...
    'Observation',{'observation'},'Action',{'ActorScaling'},actorOptions);
% Agent options
agentOptions = rlDDPGAgentOptions(...
    'SampleTime',env.Ts,...
    'TargetSmoothFactor',1e-3,...
    'ExperienceBufferLength',1e6,...
    'MiniBatchSize',128);
% Exploration noise
agentOptions.NoiseOptions.Variance = 0.4;
agentOptions.NoiseOptions.VarianceDecayRate = 1e-5;
agent = rlDDPGAgent(actor,critic,agentOptions);
% Training options
maxepisodes = 20000;
maxsteps = 1e8;
trainingOptions = rlTrainingOptions(...
    'MaxEpisodes',maxepisodes,...
    'MaxStepsPerEpisode',maxsteps,...
    'Verbose',false,...
    'Plots','training-progress',...
    'StopOnError','on',...
    'StopTrainingCriteria','AverageReward',...
    'StopTrainingValue',Inf,...
    'ScoreAveragingWindowLength',10);
% Plot the environment
%plot(env);
% Train the agent
trainingStats = train(agent,env,trainingOptions); % ← the error occurs here
% Simulate the trained agent
simOptions = rlSimulationOptions('MaxSteps',maxsteps);
experience = sim(env,agent,simOptions);

Answers (1)

Ronit on 7 Nov 2024, 6:34
Edited: Ronit on 7 Nov 2024, 7:23
The error seems to originate from the "reset" function within your reinforcement learning environment. The "reset" function must return two outputs: "InitialObservation" and "Info". This is necessary for the "sim" and "train" functions to properly initialize the environment at the start of each simulation or training episode.
Ensure that the calling code captures both outputs. The function call should be structured as follows:
[InitialObservation, Info] = reset(env);
Note: Ensure your "reset" function is implemented to return both "InitialObservation" and "Info" as specified.
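As a reference, here is a minimal sketch of how such a "reset" method might look inside a custom environment class derived from "rl.env.MATLABEnvironment" (your script suggests the class is named "Environment"). The "State" property and its initial value are hypothetical placeholders, so adapt them to your environment; the essential point is that the method declares and assigns the output that the toolbox call "observation = reset(env)" in your stack trace expects. A "reset" defined without any output arguments produces exactly the "Too many output arguments" error you see.
% Sketch only: reset method inside the classdef of a custom environment
% derived from rl.env.MATLABEnvironment. The State property is an assumption.
function InitialObservation = reset(this)
    % Reinitialize the (hypothetical) internal state, e.g. a 2-element vector
    this.State = zeros(2,1);
    % Assign the declared output; if no output is declared and assigned,
    % "observation = reset(env)" fails with "Too many output arguments".
    InitialObservation = this.State;
end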
Refer to the MATLAB documentation on the environment "reset" function for more details.
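Regarding the second part of your question (how to check a function's output arguments and their number): for a method of a custom class you can inspect the class metadata, and for an ordinary function file you can use "nargout" with the function name. The snippet below is a sketch that assumes your environment class is named "Environment" as in your script.
mc = ?Environment;                    % metaclass of the custom environment
resetInfo = mc.MethodList(strcmp({mc.MethodList.Name},'reset'));
resetInfo.OutputNames                 % names of the declared output arguments
numel(resetInfo.OutputNames)          % how many outputs the method declares
% For an ordinary function file on the path you could also use, e.g.:
% nargout('myResetFunction')          % hypothetical function name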
I hope it helps resolve your query!
