validateEnvironment() with a Python environment
I am following this tutorial: https://www.youtube.com/watch?v=oeLSGHdN4A0&list=PLUHjJ91-nf0T4rEJk8eLrMT3XSe_wJOhn&index=6&ab_channel=ChiDotPhi in which the presenter wraps a Python Gym environment in a MATLAB class. The code is copied from GitHub: https://github.com/sol0invictus/MAT-DL/blob/main/RL-in-MATLAB/mountain_car_1.m
classdef mountain_car_1 < rl.env.MATLABEnvironment
    properties
        open_env = py.gym.make('MountainCar-v0');
    end
    methods
        function this = mountain_car_1()
            ObservationInfo = rlNumericSpec([2 1]);
            ObservationInfo.Name = 'MountainCar Descreet';
            ObservationInfo.Description = 'Position, Velocity';
            ActionInfo = rlFiniteSetSpec([0 1 2]);
            ActionInfo.Name = 'Acceleration direction';
            this = this@rl.env.MATLABEnvironment(ObservationInfo,ActionInfo);
        end
        function [Observation,Reward,IsDone,LoggedSignals] = step(this,Action)
            result = cell(this.open_env.step(int16(Action)));
            Observation = double(result{1})';
            Reward = double(result{2});
            IsDone = double(result{3});
            LoggedSignals = [];
            if (Observation(1)>=0.4)
                Reward = 0;
                IsDone = 1;
            end
        end
        function InitialObservation = reset(this)
            result = this.open_env.reset();
            InitialObservation = double(result)';
        end
    end
end
When I instantiate this environment with:
a=mountain_car_1()
it detects the environment:
a = mountain_car_1 with properties:
open_env: [1×1 py.gym.wrappers.time_limit.TimeLimit]
That works. But when I then call:
validateEnvironment(a)
I get this error:
Error using rl.env.MATLABEnvironment/validateEnvironment
Unable to evaluate reset function.
Error in untitled (line 3)
validateEnvironment(a)
Caused by:
    Error using py.tuple/double
    Conversion of Python element at position 1 to type 'double' failed.
    All Python elements must be convertible as scalar to the requested type.
The environment code is copied verbatim, so is this a package, format, or version issue?
As far as I know, MATLAB is compatible with Python up to 3.10, so I reinstalled Python at that version:
pyenv
ans =
  PythonEnvironment with properties:
          Version: "3.10"
       Executable: "E:\Users\Onoe\Desktop\WPS4\sidechick\Python\python.exe"
          Library: "E:\Users\Onoe\Desktop\WPS4\sidechick\Python\python310.dll"
             Home: "E:\Users\Onoe\Desktop\WPS4\sidechick\Python"
           Status: Loaded
    ExecutionMode: InProcess
        ProcessID: "13544"
      ProcessName: "MATLAB"
I also checked calling the environment:
open_env = py.gym.make('MountainCar-v0');
open_env =
  Python TimeLimit with properties:
         action_space: [1×1 py.gym.spaces.discrete.Discrete]
             metadata: [1×1 py.dict]
            np_random: [1×1 py.numpy.random._generator.Generator]
    observation_space: [1×1 py.gym.spaces.box.Box]
          render_mode: [1×1 py.NoneType]
         reward_range: [1×2 py.tuple]
                 spec: [1×1 py.gym.envs.registration.EnvSpec]
            unwrapped: [1×1 py.gym.envs.classic_control.mountain_car.MountainCarEnv]
                  env: [1×1 py.gym.wrappers.order_enforcing.OrderEnforcing]
    <TimeLimit<OrderEnforcing<PassiveEnvChecker<MountainCarEnv<MountainCar-v0>>>>>
But the validateEnvironment() error persists. Is there a solution, or an alternative way to use OpenAI Gym from MATLAB?
1 Comment
Ganesh
2023-8-8
Edited: Ganesh
2023-8-8
The issue seems to arise from the open_env.reset() function. Here reset() returns a tuple of two values: the observation array and an info dictionary. The observation array on its own can be cast with double(), but the tuple as a whole cannot, because double() on a py.tuple requires every element to be a scalar. This may indeed be due to a version mismatch: older versions of the gym module returned only the observation from reset(). Kindly note that using
result = result{1};
as the second line of your reset() function should resolve this error, but it does not guarantee that the rest of the code runs correctly, as there may be other version-related issues.
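For anyone hitting the same error, here is a minimal sketch of a reset method that tolerates both return shapes. It assumes the change arrived with gym 0.26, where reset() began returning an (observation, info) tuple, and that your MATLAB release can convert a NumPy array with double(), as the step method in the question already does. I have not run it against every gym version, so treat it as a starting point rather than a definitive fix.

```matlab
% Sketch of a version-tolerant reset method for the mountain_car_1 class.
function InitialObservation = reset(this)
    result = this.open_env.reset();
    % Assumption: newer gym releases return a py.tuple (observation, info),
    % while older releases return the observation array directly.
    if isa(result, 'py.tuple')
        result = result{1};   % curly braces extract the element itself
    end
    % Convert the NumPy array to a 2x1 column to match rlNumericSpec([2 1]).
    InitialObservation = double(result)';
end
```

Note that parentheses, as in result(1), return a one-element py.tuple rather than the array inside it, so double() would fail in the same way. Curly braces are what pull the observation out.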
Answers (1)
Venu
2023-11-21
Edited: Venu
2023-11-22
I understand you are experiencing issues with the "reset" method of the "mountain_car_1" class in MATLAB.
The problem is how the value returned by the Gym environment's "reset" method is converted to a MATLAB array. The original "reset" method passes that value straight to double(), but the value is a Python tuple rather than an array, and double() cannot convert a tuple whose elements are not scalars.
Here's the problematic part of your original "reset" method:
function InitialObservation = reset(this)
    result = this.open_env.reset();
    InitialObservation = double(result)';
end
To correct this, we need to properly handle the conversion from the Python tuple to a MATLAB array. The updated "reset" method should first extract the NumPy array from the tuple, convert it to a Python list, and then convert that list to a MATLAB double array.
Here's the corrected "reset" method:
function InitialObservation = reset(this)
    result = this.open_env.reset();
    InitialObservation = double(py.array.array('d', result{1}.tolist()))';
end
This change ensures that the first element of the tuple (which contains the NumPy array of observations) is correctly processed and converted into a MATLAB array that matches the expected 2x1 column vector format defined in the "ObservationInfo".
Please update your "reset" method with the provided code to ensure compatibility and proper functionality within the MATLAB Reinforcement Learning Toolbox.
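The same API change may affect "step" as well. As a sketch, assuming gym 0.26 or later, where step() returns five values (observation, reward, terminated, truncated, info) instead of four, the method could combine the two end-of-episode flags. The five-element return is my assumption about the installed version, not something shown in the question, so it is worth checking numel(cell(open_env.step(int16(0)))) in your own session first.

```matlab
% Sketch of a step method that handles both the 4- and 5-element return.
function [Observation,Reward,IsDone,LoggedSignals] = step(this,Action)
    result = cell(this.open_env.step(int16(Action)));
    Observation = double(result{1})';   % 2x1 column: position, velocity
    Reward = double(result{2});
    if numel(result) == 5
        % Assumed newer API: (obs, reward, terminated, truncated, info).
        % The episode ends if either flag is set.
        IsDone = double(result{3}) || double(result{4});
    else
        % Older API: (obs, reward, done, info).
        IsDone = double(result{3}) ~= 0;
    end
    LoggedSignals = [];
    % Custom goal check kept from the original class.
    if Observation(1) >= 0.4
        Reward = 0;
        IsDone = true;
    end
end
```

Without this, result{3} would carry only the "terminated" flag under the newer API, so an episode cut off by the time limit would not be reported as done.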
Thank You
Venu