Is this a bug of the detectImportOptions function?
9 次查看(过去 30 天)
显示 更早的评论
My goal is to determine the location of the variable names row.
Attached is a sample Excel data file. Metadata information could be stored at any cell above the column header row. As you can see, the variable names are located at #14, but the program mistakenly claims the answer is Row #4.
Below is my code:
opts = detectImportOptions("test.xlsx")
N = str2double(extract(opts.VariableNamesRange, digitsPattern));
disp(N);
3 个评论
dpb
2025-3-10
编辑:dpb
2025-3-10
While we can see the variable names are in row 14, I think it's expecting a lot for detectImportOptions to be able to parse any random text in the file ahead of the specific line, especially since it's an attempt to determine the file data structure by scanning only a portion of the file.
I wouldn't hold my breath about this one being resolvable in general given the previous other example we just looked at.
I expect this will end up requiring you to parse the file similarly as in the other thread and knowing something about the problem regime, have to locate the variable names line by searching for unique, known variable names.
It is, I think, going to be hit 'n miss about whether there's any other way with the present tools; you'll find some that work as desired and others that fail in various other manners besides this one example is my guess...
回答(1 个)
Madheswaran
2025-8-8
Hello Leon,
As far as I know, "detectImportOptions" uses a heuristic which may not always work well with small files that have little structure (e.g. inconsistent number of columns, missing delimiters, missing variable names, etc.). In this case, file is small and missing data in the first row, so the heuristic struggles to accurately guess the underlying structure of the data.
For now, if you know the row number where the headers are, you can manually set the "VariableNamesRange" property in the options object. This ensures MATLAB reads the correct row as headers, regardless of what "detectImportOptions" guesses.
Additionally, to skip the metadata and start at the right row, you can set "DataRange" as well.
opts = detectImportOptions("test.xlsx");
opts.VariableNamesRange = 'A14'; % Correct row
opts.DataRange = 'A15'; % Data starts after headers
% ... rest of your code
More information on the options provided by "detectImportOptions", please refer here: https://mathworks.com/help/matlab/ref/detectimportoptions.html#namevaluepairarguments
Hope it helps!
0 个评论
另请参阅
类别
在 Help Center 和 File Exchange 中查找有关 Spreadsheets 的更多信息
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!