I ran into this problem once as well when training an AC agent. It can happen when an equation (probably in your step function) tries to calculate one of the following:
zero/zero, zero*infinity, infinity/infinity, infinity-infinity.
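As a quick sanity check, you can confirm at the command line that each of these expressions really does produce NaN. This is only a minimal sketch; the variable names are mine and not taken from your code:

```matlab
% Each of these indeterminate forms evaluates to NaN in IEEE 754 arithmetic.
a = 0/0;
b = 0*Inf;
c = Inf/Inf;
d = Inf - Inf;

% Logical row vector with one entry per expression above.
results = isnan([a, b, c, d]);
disp(results)
```

Keep in mind that a NaN propagates through later arithmetic, so a single one of these early in the step function can contaminate several NextObs entries.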
Try troubleshooting with something like this at the end of your step function:
if any(isnan(NextObs), 'all') % if any element in the NextObs matrix is NaN
    [row, col] = find(isnan(NextObs)) % no semicolon, so the row and column positions are displayed
end
Note that this also works when NextObs is a vector. It outputs the row and column positions of every NaN value (not just the first one) to the command line. You can then determine which NextObs value each position corresponds to and find where in your code that value is calculated.
Without looking at the code I can only give limited advice. Also make sure you have a "fallback" value when calculating your NextObs if your implementation requires it:
if something == 1
    NextObs = 2; % the regular calculations you have already implemented
else
    NextObs = -1; % return a number instead, to represent that the NextObs value doesn't apply for this step
end
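To make the fallback idea more concrete, here is a rough sketch of guarding a division that could otherwise evaluate 0/0. The names distance, speed, and timeToTarget are hypothetical and only for illustration, and -1 is just an assumed sentinel value; use whatever makes sense for your environment:

```matlab
% Hypothetical step-function fragment; all variable names are illustrative.
distance = 0;
speed = 0;

if speed ~= 0
    timeToTarget = distance / speed;   % regular calculation
else
    timeToTarget = -1;                 % fallback: avoids 0/0 producing NaN
end

NextObs = [distance; speed; timeToTarget];

% Fail fast if a NaN still slipped through somewhere.
assert(~any(isnan(NextObs), 'all'), 'NextObs contains NaN');
```

Whatever fallback number you pick, make sure it cannot be confused with a legitimate observation value, otherwise the agent cannot tell the two cases apart.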
I referenced two posts: