How to figure out the number of rows for headers?

9 次查看(过去 30 天)
I have Excel files with unknown number of headers. For exmple, some files only have one row of header info. Another one could have 12 header lines.
I would need to know the accurate count of the number of rows of headerlines, so that my program would know the Excel Row# for a cetain data value. Right now, it only knows the relative Row # of the numerical portion of the Excel file.
What makes things more complicated is that my data also routinely contain columns that are made of text-strings.
Right now, I use readtable to read the Excel file. It basically assume the first Row is the headerline, and the rest of them are data. Here is my question. Is there a straightforward way to identify the number of headerlines in my case?
Thanks!
  1 个评论
Leon
Leon 2020-4-2
The below code works for flat csv or txt files. It does a good job of reading all of the headers. Is there an easy way to make it work for xlsx files as well?
header_end = false;
header_count = 0;
% preallocate 50 rows of header text
header = cell(50, 1);
%% Loop through header lines
while ~header_end
%% Get a whole line from the file
aline = fgetl(FID);
%% Use a regular expression to look for the data block header
tokens = regexp(aline, ...
['(\d+)' ... % capture a number
'\s+!\s+' ... % followed by an !
'(\w+\s+)+'],... % and capture a series of strings
'tokens', 'warnings');
%% Parse data block header if found
if ~isempty(tokens) && numel(tokens{:}) == 2
% parse numeric token using string to double
num_datum = str2double(tokens{1}(1));
% parse variable string
varNameStr = deblank(tokens{1}(2));
names = strsplit(varNameStr{:});
%names = names{1};
% stop looping
header_end = true;
else
% Increment header line count and add new line
header_count = header_count+1;
header{header_count} = aline;
end
end

请先登录,再进行评论。

回答(2 个)

Fangjun Jiang
Fangjun Jiang 2020-3-30
If you use [Num, Txt, Raw]=xlsread(), can you figure it out based on the numerical, text and raw data?

Josh Zagorski
Josh Zagorski 2020-3-30
[A DELIM NHEADERLINES] = importdata(...) returns the detected number of header
lines in the input ASCII file.
So, [Matrix_converted,'delimiter',No_headerlines = importdata('file.xlsx');
  5 个评论
Josh Zagorski
Josh Zagorski 2020-3-31
Do you have the correct filepath?
Otherwise, I glossed over your "text-strings" data comment - wondering if data needs to be double/numeric "data contains a double array."
Leon
Leon 2020-4-2
Yes, I do.
Just tried a different file, below is what I get:
>> [A,B,C]= importdata('G01.xlsx')
A =
struct with fields:
data: [1×1 struct]
textdata: [1×1 struct]
colheaders: [1×1 struct]
B =
NaN
C =
0

请先登录,再进行评论。

产品


版本

R2019b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by