Reading a text file and rows with data

Question

Damith 2016-6-14

Direct link to this question:

https://ww2.mathworks.cn/matlabcentral/answers/289870-reading-a-text-file-and-rows-with-data

Answered: Shameer Parmar 2016-6-17

Accepted answer: dpb

023002_Q_1997.zip

Hi,

I need to read the text file shown in the image below using a MATLAB script. I have tried the code below and would appreciate it if someone could guide me in making it work. I have attached the text file here.

Thanks in advance.

myFolder = 'C:\Users\Desktop\'
filePattern = fullfile(myFolder, '*.txt');
csvFiles = dir(filePattern);
fmt='%d %4d/%2d/%2d %2d:%2d %d %*[^\n]';
for i=1:length(csvFiles)
  fid = fopen(fullfile(myFolder,csvFiles(i).name));
  c=cell2mat(textscan(fid,fmt,'headerlines',18,'collectoutput',1,'delimiter','\t'));
  fid=fclose(fid);
end

1 comment

dpb 2016-6-14

Two problems, one easy, another "not so much"...

a) You can't mix types with 'collectoutput', and the floating-point value fails with the %d conversion, so you need to switch every %d to %f in the format string.

b) The file is NOT tab-delimited; it's the bane of C text input: blank-delimited with missing data. This is essentially impossible to deal with in C (and hence in Matlab, since it uses C formatted I/O routines derived from fscanf and friends). You can see this if you make the above correction and return the intermediate result from textscan --

>> fmt='%f %4f/%2f/%2f %2f:%2f %f %*[^\n]';
>> d=cell2mat(textscan(fid,fmt,'headerlines',18,'collectoutput',1));
>> whos d
Name      Size            Bytes  Class     Attributes
d         1x7                56  double              
>> d
d =
 1.0e+04 *
  2.3002    0.1997    0.0010    0.0001         0         0    0.0000
>> d(end)
ans =
  0.2928
>>

This shows it read the first record correctly; so why is there only one record???

>> frewind(fid) % we'll try again from the top...

Now read but find out where the file pointer is afterwards...

>> [d,n]=textscan(fid,fmt,1,'headerlines',18,'collectoutput',1);
>> n
n =
    463865
>>

??? That's an awfully big number for one record, isn't it!!!???

    C:\ML_R2012b\work> dir 023002_Q_1997.txt
   Volume in drive C is unlabeled      Serial number is BC9D:AAD0
   Directory of  c:\ml_r2012b\work\023002_q_1997.txt
023002_q_19►   463865   6/13/16  18:09
        463,865 bytes in 1 file and 0 dirs    466,944 bytes allocate
113,018,556,416 bytes free
C:\ML_R2012b\work>

What we see is that this is exactly the file size; with no delimiter found, the skip-to-end-of-line field ran all the way to the end of the file.
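A quick way to check which end-of-line characters a file actually contains is to read it as raw bytes and count the carriage returns and line feeds. The following is only a sketch under stated assumptions: it writes a small hypothetical demo file (not the attached data file) with CR-only line endings so that it runs on its own; in practice `fname` would point at the real file.

```matlab
% Hypothetical demo file with CR-only line endings (a stand-in for the real data file)
fname = [tempname '.txt'];
fid = fopen(fname,'w');
fwrite(fid, sprintf('line one\rline two\rline three\r'));
fclose(fid);

% Read the raw bytes and count the candidate end-of-line characters
fid = fopen(fname,'r');
bytes = fread(fid, inf, 'uint8=>char').';
fclose(fid);
nCR = sum(bytes == char(13));   % number of '\r' characters
nLF = sum(bytes == char(10));   % number of '\n' characters
fprintf('CR: %d, LF: %d\n', nCR, nLF);
delete(fname);
```

If the LF count comes back as zero, a format that skips to `'\n'` has nothing to stop at before the end of the file.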

Answer 1

dpb 2016-6-15

Direct link to this answer:

https://ww2.mathworks.cn/matlabcentral/answers/289870-reading-a-text-file-and-rows-with-data#answer_225658

Edited: dpb 2016-6-15

Well, since the file was malformed after all, here's how to clean it up and create a version which can then be read...

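% Read every data line as text (skipping the 18 header lines); char() pads to a common width
c=char(textread('023002_Q_1997.txt','%s','delimiter','\n','whitespace','','headerlines',18));
ix=[7 20:3:32 42];   % columns just past each numeric field, including the '/' and ':' positions
c(:,ix)=',';         % overwrite those columns with commas
% Transpose so the characters stream out record by record, then parse seven numeric fields
d=cell2mat(textscan(c(:,1:42).',repmat('%f',1,7),'collectoutput',1,'delimiter',','));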
>> whos d
Name         Size             Bytes  Class     Attributes
d         8832x7             494592  double              
>> d(1:10,:)
ans =
 1.0e+04 *
  2.3002    0.1997    0.0010    0.0001         0         0    0.0000
  2.3002    0.1997    0.0010    0.0001         0    0.0015       NaN
  2.3002    0.1997    0.0010    0.0001         0    0.0030       NaN
  2.3002    0.1997    0.0010    0.0001         0    0.0045       NaN
  2.3002    0.1997    0.0010    0.0001    0.0001         0       NaN
  2.3002    0.1997    0.0010    0.0001    0.0001    0.0015       NaN
  2.3002    0.1997    0.0010    0.0001    0.0001    0.0030       NaN
  2.3002    0.1997    0.0010    0.0001    0.0001    0.0045       NaN
  2.3002    0.1997    0.0010    0.0001    0.0002         0       NaN
  2.3002    0.1997    0.0010    0.0001    0.0002    0.0015       NaN
>>

NB: I trotted out the old standby textread to read the file as a cellstring array, since it does so without the extra fopen/fclose pair and has all the flexibility needed here.

The "magic numbers" were found by looking at the file in an editor and noting the columns following each numeric field, including the '/' and ':' for the date/time fields. These could have been done with string substitution or left as date/time fields but I just went ahead and turned the whole file into a csv file for simplicity.

Following that, I kept the character array through the terminating comma after the last (possibly missing) numeric field and passed that to textscan. The key "trick" here is to remember that storage is column-major in Matlab, so the array must be transposed to be processed row by row rather than down each column.

Also, there's a problem with the file format that the cast to char takes care of: records missing the final character are one character short, and those with a two-character trailing string are one character longer than the predominant length. So this isn't actually a fixed-length-record file, although the leading columns are fixed-width; another government contractor "feature", no doubt. :) It also means you can't just read the file as bytes and reshape; you must scan for record terminators and then fix up the lengths.

Anyway, this should be something that works for all files.
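The comma-insertion idea can be tried on a tiny made-up example. This is only a sketch: the three records, their column layout, and the `ix` positions are hypothetical and much simpler than the real file. It also flattens the transposed array into a row vector and drops the final comma before calling textscan, which is a slight variation on the code above.

```matlab
% Hypothetical fixed-width records: ID in columns 1-3, count in 5-6, value in 8-11.
% The value field of the second record is blank (missing).
c = ['101 12 3.50 ';
     '102  7      ';
     '103 25 0.75 '];
ix = [4 7 12];               % columns that follow each numeric field
c(:,ix) = ',';               % every record becomes "ID,count,value,"
txt = reshape(c.', 1, []);   % transpose first: MATLAB stores arrays column-major
txt(end) = [];               % drop the final trailing comma
d = cell2mat(textscan(txt, repmat('%f',1,3), 'CollectOutput',1, 'Delimiter',','));
disp(d)
```

The blank field between two commas is treated as empty and comes back as NaN, which matches the NaN entries in the last column of the output above.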

2 comments

Damith 2016-6-17

Thanks so much again. Appreciate it.

dpb 2016-6-17

Then ACCEPT the answer, please..

Answer 2

dpb 2016-6-14

Direct link to this answer:

https://ww2.mathworks.cn/matlabcentral/answers/289870-reading-a-text-file-and-rows-with-data#answer_225506

Edited: dpb 2016-6-15

Oh, one thing I also noted when checking on the delimiter: the EOL marker is '\r'. I wonder what would happen if we skip to it explicitly instead of the default '\n'?

>> fmt='%f %4f/%2f/%2f %2f:%2f %f %*[^\r]';
>> [d,n]=textscan(fid,fmt,'headerlines',18,'collectoutput',1,'endofline','\r')
d = 
  [2x7 double]
n =
      1171
>>

OK, it read the second record but croaked on the missing value... what if we create a specific format for it, too?

>> frewind(fid);
>> fmt1='%f %4f/%2f/%2f %2f:%2f %*[^\r]';  % format w/o the last float
>> [d,n]=textscan(fid,fmt1,11,'headerlines',18,'collectoutput',1,'endofline','\r')
d = 
  [11x6 double]
n =
      4080
>>

Aha! Now we're getting somewhere; all we have to do is to wrap the two calls in a loop --

d=cell2mat(textscan(fid,fmt,1,'headerlines',18,'collectoutput',1,'endofline','\r'));  % 1st record only
d=[d;[cell2mat(textscan(fid,fmt1,11,'collectoutput',1,'endofline','\r')) nan(11,1)]]; % next group
while ~feof(fid)
  d=[d;cell2mat(textscan(fid,fmt,1,'collectoutput',1,'endofline','\r'))];
  d=[d;[cell2mat(textscan(fid,fmt1,11,'collectoutput',1,'endofline','\r')) nan(11,1)]];
end

While I wouldn't normally grow an array dynamically like this, unless the file is extremely large this should be "fast enough". If it does bog down excessively before finishing, preallocate a large array, keep an index of the rows read, and store each block at the appropriate rows.
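The preallocation pattern might look roughly like the sketch below. It is not tied to the real file: the chunk size, the loop count, and the `rand` call standing in for each textscan result are all assumptions made so the example runs on its own.

```matlab
% Sketch of preallocate / fill / trim; the block source is a hypothetical stand-in
chunk = 50;  nCols = 7;
d = nan(chunk, nCols);        % preallocated storage
nRead = 0;                    % number of rows filled so far
for k = 1:5
    blk = rand(12, nCols);    % stand-in for one block returned by textscan
    n = size(blk,1);
    if nRead + n > size(d,1)  % out of room: grow by a whole chunk, not row by row
        d = [d; nan(chunk, nCols)];
    end
    d(nRead+1:nRead+n, :) = blk;
    nRead = nRead + n;
end
d = d(1:nRead, :);            % trim the unused tail
```

Growing in whole chunks keeps the number of reallocations small even when the final row count is not known in advance.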

5 comments (3 earlier comments hidden)

dpb 2016-6-15

Edited: dpb 2016-6-15

Well you can certainly count on the government to be there to "help"... :(

You can only eliminate rows by finding out which rows they are, which is back to line-by-line parsing.

All in all would likely be simpler to insert the delimiter; it's really quite simple to read a file as a 'blob' of characters.
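As a rough illustration of reading a file as one 'blob' and then splitting it into records, here is a sketch that uses `fileread` and a regular expression. The demo file and its contents are hypothetical stand-ins so the example runs on its own.

```matlab
% Hypothetical demo file with CR-only line endings
fname = [tempname '.txt'];
fid = fopen(fname,'w');
fwrite(fid, sprintf('rec 1\rrec 2\rrec 3\r'));
fclose(fid);

blob = fileread(fname);                       % whole file as one char row vector
recs = regexp(blob, '\r\n|\r|\n', 'split');   % split on any common line ending
recs = recs(~cellfun('isempty', recs));       % drop empty pieces (also drops blank lines)
delete(fname);
```

Each element of `recs` is then one record as text, ready for fixed-column editing such as inserting delimiters.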

Damith 2016-6-15

OK. Can you show me how to insert a delimiter and read the files?

Answer 3

Shameer Parmar 2016-6-17

Direct link to this answer:

https://ww2.mathworks.cn/matlabcentral/answers/289870-reading-a-text-file-and-rows-with-data#answer_225883

To read any text file line by line, try this code:

   count = 1;
   data = {};                      % one cell per line of the file
   fid = fopen('ascii_file.txt');
   tline = fgetl(fid);             % fgetl returns numeric -1 at end of file
   while ischar(tline)
       data{count,1} = tline;      % empty lines are stored as ''
       count = count + 1;
       tline = fgetl(fid);
   end
   fclose(fid);

replace 'ascii_file.txt' with your filename..

"data" will be the output: an N-by-1 cell array with one cell per line of the text file.
