Error converting Python DataFrame to Table

Question

0 votes

I have used the following commands to load a Python .pkl file.

fid = py.open("data.pkl");
data = py.pickle.load(fid);
T = table(data);

This loads a Python DataFrame object. Newer versions of MATLAB can convert this object to a table using the table command, which I tried, but I encountered the error below:

Error using py.pandas.DataFrame/table
Dimensions of the key and value must be the same, or the value must be scalar.

What does this error mean? I'm guessing it's because the DataFrame object in the .pkl contains a couple of nested fields. Most of the fields are simply 1xN numeric vectors, but a couple are 1xN objects which then have their own fields.

How can I convert this DataFrame object to something usable in MATLAB? I was given this data file and did not generate it, and I am much more proficient in MATLAB than Python, so I would prefer to solve this within MATLAB rather than create a Python script or change how the file is created.

Answer 1

Umar 2026-4-22

Edited: Umar 2026-4-22

1 vote

Hi @David,

Thanks for writing in. You've already diagnosed this correctly, so let me confirm it and get you moving. The table() conversion (available since R2024a) only handles one level of DataFrame nesting. The 1×N object columns in your file that carry their own sub-fields go past that limit, and that is what throws the dimension mismatch. Your flat numeric columns are fine; only the nested ones trip it up.

Before anything else, run this to see what you're working with:

fid = py.open("data.pkl", "rb");
data = py.pickle.load(fid);
py.print(data.dtypes);
py.print(data.head(int32(3)));

For your flat columns, pull them out directly:

flat_cols = {"col1", "col2", "col3"};
arrays = cellfun(@(c) double(data{c}.values), flat_cols, 'UniformOutput', false);
T = array2table(cell2mat(arrays), 'VariableNames', flat_cols);

For the nested ones, you don't need a separate Python script. Call pandas.json_normalize inline from MATLAB. It flattens nested fields into dot-separated columns (e.g. sensor.value becomes a normal flat column), and after that table() will convert without issue:

records = data.to_dict("records");
flat_data = py.pandas.json_normalize(records);
T = table(flat_data);

If that still gives you trouble, take a look at the PandasToMatlab utility on File Exchange (https://www.mathworks.com/matlabcentral/fileexchange/111770-pandastomatlab). The df2t() function there handles more edge cases than the built-in path and works entirely in memory.

Full type-conversion details are in the docs if you want to check what maps to what: https://www.mathworks.com/help/matlab/matlab_external/python-pandas-dataframes.html

Hope this helps!
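For readers wondering what "flattens nested fields into dot-separated columns" means in practice, here is a minimal pure-Python sketch of the idea. It is not pandas' actual implementation, and the helper name flatten_record and the sample rows (time, sensor.value, sensor.unit) are made up for illustration. Nested dicts are expanded recursively; anything else, including lists, is left as-is.

```python
def flatten_record(record, parent_key="", sep="."):
    """Flatten one nested dict into a single-level dict with dotted keys.

    Hypothetical helper for illustration only; it mimics the naming
    scheme described above, not pandas' internals.
    """
    flat = {}
    for key, value in record.items():
        # Build the dotted name, e.g. "sensor" + "value" -> "sensor.value"
        full_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
        if isinstance(value, dict):
            # Recurse into nested dicts so every leaf gets its own column
            flat.update(flatten_record(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


# Made-up rows standing in for the nested DataFrame contents
records = [
    {"time": 0.0, "sensor": {"value": 1.5, "unit": "V"}},
    {"time": 0.1, "sensor": {"value": 1.7, "unit": "V"}},
]

flat_records = [flatten_record(r) for r in records]
print(flat_records[0])
```

After flattening, every row has the same single-level keys, which is the shape a column-per-variable table expects.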

2 comments

David K 2026-4-22

This is very helpful, thank you! I was able to use the json_normalize method to make it easily convertible to a table.

A couple other things: Would you mind formatting your answer? It is very difficult to read as is. Also, your line arrays = cellfun(@(c) double(data{c}.values), flat_cols,'UniformOutput', false); gives the error "Brace indexing is not supported for variables of this type."

Umar 2026-4-23

Glad json_normalize worked out, David! Apologies for the messy formatting. Here's a cleaner version of the full answer.

Step 1. Inspect what you're working with:

fid = py.open("data.pkl", "rb");
data = py.pickle.load(fid);
py.print(data.dtypes);
py.print(data.head(int32(3)));

Step 2. For flat numeric columns only:

flat_cols = {"col1", "col2", "col3"};
arrays = cell(1, numel(flat_cols));
for i = 1:numel(flat_cols)
    arrays{i} = double(data{py.str(flat_cols{i})}.values);
end
T = array2table(cell2mat(arrays), "VariableNames", flat_cols);

Step 3. For nested columns (the one that solved your problem):

records = data.to_dict("records");
flat_data = py.pandas.json_normalize(records);
T = table(flat_data);

On the brace-indexing error: the issue is that {} is a MATLAB cell array operation. A py.pandas.DataFrame isn't a cell array, so MATLAB rejects it. The fix is wrapping the column name in py.str() so MATLAB passes a proper Python string key to the DataFrame. I've also swapped cellfun for a plain loop since it's more reliable across MATLAB versions.

That said, since json_normalize already handles both flat and nested columns in one shot, you probably won't need the flat-column path at all. Hope that clears things up!
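Step 3 starts from data.to_dict("records"), which returns one dict per row rather than one entry per column. A small pure-Python sketch of that reorientation may help if the layout is unfamiliar. The helper name columns_to_records and the sample columns are hypothetical, and this only illustrates the row-wise layout; it is not how pandas does it internally.

```python
def columns_to_records(columns):
    """Turn {column name: list of values} into a list of per-row dicts.

    Hypothetical helper showing the "records" layout: one dict per row,
    keyed by column name.
    """
    names = list(columns)
    # zip(*values) walks the columns in parallel, yielding one row at a time
    return [dict(zip(names, row)) for row in zip(*columns.values())]


# Made-up column-oriented data with one nested column
columns = {
    "time": [0.0, 0.1],
    "sensor": [{"value": 1.5}, {"value": 1.7}],
}

rows = columns_to_records(columns)
print(rows)
```

Each row dict still carries its nested sensor dict at this point; it is the json_normalize call in Step 3 that expands those into flat columns.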
