Error converting python DataFrame to Table

I have used the following commands to load in a python .pkl file.
fid = py.open("data.pkl");
data = py.pickle.load(fid);
T = table(data);
This loads a python DataFrame object. Newer versions of MATLAB have the ability to convert this object to a table using the table command, which I tried but encountered the below error:
Error using py.pandas.DataFrame/table
Dimensions of the key and value must be the same, or the value must be scalar.
What does this error mean? I'm guessing it's because the DataFrame object in the .pkl contains a couple nested fields. Most of the fields are simply 1xN numeric vectors, but a couple are 1xN objects which then have their own fields.
How can I convert this DataFrame object to something usable in MATLAB? I was given this datafile and did not generate it, and I am much more proficient in MATLAB than python, so I would rather solve this within MATLAB rather than having to create a python script or change how the file is created.

Accepted Answer

Umar, about 3 hours ago
Edited: Umar, about 3 hours ago
Hi @David,

Thanks for writing in. You've actually already diagnosed this correctly, so let me just confirm it and get you moving.

The table() conversion (available since R2024a) only handles one level of DataFrame nesting. Those 1×N object columns in your file that carry their own sub-fields push past that limit, and that's exactly what's throwing the dimension mismatch. Your flat numeric columns are completely fine; it's only the nested ones tripping it up.

Before anything else, just run this to see what you're working with:

fid = py.open("data.pkl", "rb");
data = py.pickle.load(fid);
py.print(data.dtypes);
py.print(data.head(int32(3)));

For your flat columns, pull them out directly:

flat_cols = {"col1", "col2", "col3"};
arrays = cellfun(@(c) double(data{c}.values), flat_cols,'UniformOutput', false);
T = array2table(cell2mat(arrays), 'VariableNames', flat_cols);

For the nested ones, you don't need a separate Python script. Call pandas.json_normalize inline from MATLAB. It flattens nested fields into dot-separated columns (e.g. sensor.value becomes a normal flat column), and after that table() will convert without issue:

records = data.to_dict("records");
flat_data = py.pandas.json_normalize(records);
T = table(flat_data);

If that still gives you trouble, take a look at the PandasToMatlab utility on File Exchange (https://www.mathworks.com/matlabcentral/fileexchange/111770-pandastomatlab). The df2t() function there handles more edge cases than the built-in path and works entirely in memory.

Full type-conversion details are in the docs here if you want to check what maps to what: https://www.mathworks.com/help/matlab/matlab_external/python-pandas-dataframes.html

Hope this helps!
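
To make the flattening step concrete, here is a minimal sketch on two made-up records. The field names t, sensor, value, and gain are hypothetical and not taken from the file. It assumes R2024a or later with pandas installed in the Python environment MATLAB is configured to use. Passing sep="_" to json_normalize is optional; it is used here so that the flattened column names are valid MATLAB identifiers.

```matlab
% Two hypothetical nested records standing in for rows of the real DataFrame.
rec1 = py.dict(pyargs("t", 0.1, ...
    "sensor", py.dict(pyargs("value", 1.5, "gain", 2.0))));
rec2 = py.dict(pyargs("t", 0.2, ...
    "sensor", py.dict(pyargs("value", 2.5, "gain", 2.0))));
records = py.list({rec1, rec2});

% Flatten nested fields; sep="_" yields names like sensor_value
% instead of the default sensor.value.
flat = py.pandas.json_normalize(records, pyargs("sep", "_"));

% Convert the now-flat DataFrame to a MATLAB table (R2024a or later).
T = table(flat);
disp(T.sensor_value)
```

With the real file, records would instead come from data.to_dict("records") as in the answer above.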

1 Comment

This is very helpful, thank you! I was able to use the json_normalize method to make it easily convertible to a table.
A couple other things: Would you mind formatting your answer? It is very difficult to read as is. Also, your line arrays = cellfun(@(c) double(data{c}.values), flat_cols,'UniformOutput', false); gives the error "Brace indexing is not supported for variables of this type."
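
One way around the brace-indexing error is to fetch each column with the DataFrame's own get method rather than data{c}. The sketch below is an assumption-laden illustration, not the answerer's code: it uses a small stand-in DataFrame with hypothetical column names col1 and col2, whereas with the real file data would come from py.pickle.load as in the question. It also reshapes each column to N-by-1 before adding it to the table, since double() on a 1-D NumPy array typically returns a row vector in MATLAB.

```matlab
% Stand-in DataFrame with two flat numeric columns (hypothetical names).
data = py.pandas.DataFrame(py.dict(pyargs( ...
    "col1", py.list({1, 2, 3}), ...
    "col2", py.list({4, 5, 6}))));

flat_cols = ["col1", "col2"];   % string array, reused as table variable names

T = table();
for name = flat_cols
    series = data.get(name);             % DataFrame.get returns the column as a Series
    values = double(series.to_numpy());  % numeric ndarray -> MATLAB double
    T.(name) = values(:);                % force N-by-1 so every variable has N rows
end
```

This only applies to columns with a numeric dtype; the nested object columns still need the json_normalize route.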


Release: R2025b

Asked: 2026-4-21, 18:28

Commented: about 6 hours ago
