Excel sheet data upload "Invalid 'DataRange'. The column size must match the number of variables".

14 次查看（过去 30 天）

显示更早的评论

Chanille 2023-1-25

0
链接

此问题的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/1900735-excel-sheet-data-upload-invalid-datarange-the-column-size-must-match-the-number-of-variables

评论： Chanille 2023-1-31

采纳的回答： Walter Roberson

Error:

"Invalid 'DataRange'. The column size must match the number of variables".

I imported xlsx (excel) data into matlab then generated a script after which I copied into the beginning of a different script i wrote but I get the above error. Can you help me fix this? Why do I get this error?

2 个评论
显示无隐藏无

Mathieu NOE 2023-1-26

you didn't tell us what the error is

Chanille 2023-1-26

Sorry all:

I put it in the subject, this was the error. "Invalid 'DataRange'. The column size must match the number of variables".

请先登录，再进行评论。

请先登录，再回答此问题。

采纳的回答

Walter Roberson 2023-1-26

1
链接

此回答的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/1900735-excel-sheet-data-upload-invalid-datarange-the-column-size-must-match-the-number-of-variables#answer_1157165

在 MATLAB Online 中打开

opts.DataRange = "A2:AT78";

That requests that column "A" to "AT" be read in. However, M1_New_Macro1.xlsx only has data up to column "AS"

The headers in the xlsx file do not align with the content of the data. Observe that column A is titled "cut" and contains a file name, and column B is titled "Count" and contains integer numbers (plausibly counts of something.) Then look at column L "Slice" which is empty, then column M which named "Count" but contains file names. The Total Area in column C contains fractions but the Total Area in column N contains only integers (looking like a count)

It looks to me as if there are blank columns in the data but that the headers in row 1 do not take that into account.

When you look at the variable named in the code, it is not obvious to me that the variable names match up to the data in the file.

Starting in row 34 the file contains additional tables that are not in the same format as the data above that.

You should probably be using readcell() instead of readtable(), and extracting the data you need from the resulting cells.

12 个评论
显示 10更早的评论隐藏 10更早的评论

Walter Roberson 2023-1-29

在 MATLAB Online 中打开

filename = 'https://www.mathworks.com/matlabcentral/answers/uploaded_files/1274435/M1_New_Macro1.xlsx';

C = readcell(filename);

nmask = cellfun(@isnumeric, C);

spy(nmask)

L = bwlabel(nmask, 4); %4 connected

stats = regionprops(L, 'BoundingBox');

bb = ceil(vertcat(stats.BoundingBox)); %ceil is important here!

blobs = {};

bloblocs = {};

for row = 1 : size(bb,1)

sc = bb(row,1);

sr = bb(row,2);

ec = sc + bb(row,3) - 1;

er = sr + bb(row,4) - 1;

blobcols = sc:ec;

blob = cell2mat(C(sr:er, blobcols));

need4 = blob ~= round(blob,3);

col_needs_4 = any(need4, 1);

extract = blob(:,col_needs_4);

if ~isempty(extract)

blobs{end+1} = extract;

usedcols = blobcols(col_needs_4);

bloblocs{end+1} = [sr er usedcols];

end

bloblocs

bloblocs = 1×14 cell array

{[2 13 6 7 11]} {[19 30 6 7 11]} {[2 13 18 19 23]} {[19 30 18 19 23]} {[2 13 30 31 35]} {[19 30 30 31 35]} {[2 13 37 38 39 40 41]} {[19 30 38 39 40 41]} {[67 78 37 38 39 40 41]} {[2 13 44 45]} {[19 30 44 45]} {[35 46 44 45]} {[51 62 44 45]} {[67 78 44 45]}

The code is able to detect 14 different blocks of numeric data, and identify which columns within the blocks need 4 significant digits. The bloblocs output you see above, such as [2 13 6 7 11] means there is a block of numeric data starting from row 2 and ending at row 13, in which columns 6 7 and 11 contained 4+ digit numbers. In terms of the headings of your file those are density, norm/area, and ferret/minferret . You can see from [19 30 6 7 11] that rows 19 to 30 had a similar pattern of numeric values.

When you get to [2 13 37 38 39 40 41] that is again rows 2 to 13, columns AK to AO; the column headers are messed up by that point, but probably those are intended to be density columns.

Then you get the blobs such as [67 78 37 38 39 40 41] which is rows 67 to 78, AK to A0, and probably intended to be ferret/minferret values. You will also see that the end of those rows have blobs for Avg, std, N, where the std and N columns need at least 4 digits.

In each case, the contents of the columns that need 4 digits (only those columns) from each blob are stored into the cell array blobs -- blobs{K} is a numeric array with multiple columns, and bloblocks{K} tells you where that data was originally from inside the spreadsheet.

So we can now automatically identify where in the file there are blocks containing data that needs at least 4 digits, and we have extracted those sections into a cell array, and we have recorded the locations we took them from.

Now what? Should we assume that each blob will have exactly the same number of rows, and just horizontally concatenate them all together and do pca on that??

allblocks = horzcat(blobs{:});
size(allblocks)
ans = 1×2
    12    42

Chanille 2023-1-31

Thank you @Walter Roberson

请先登录，再进行评论。

类别

AI and Statistics Statistics and Machine Learning Toolbox Dimensionality Reduction and Feature Extraction

在 Help Center 和 File Exchange 中查找有关 Dimensionality Reduction and Feature Extraction 的更多信息

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by