How to put (tab delimited) text files together removing header text?

Hi, I have many text files in the following format:
Name of the file
Date
Other useless data
Column1 [unit] Column2 [unit] Column3 [unit] Column4 [unit] ...
0.025 6.8 9.4 9.5 ...
0.050 2.8 4.4 4.2 ...
0.075 3.3 7.4 6.1 ...
...
I would like to copy all the data from all the files into a single file. I am familiar with the command:
!copy a.txt+b.txt ab.txt
However, I would like to remove all header lines and have only the numerical data in the new file (and then put a new header line in the first row so that 'tdfread' can read it easily). I would like my output file to look like this
MyHeader1 MyHeader2 MyHeader3 MyHeader4 ...
0.025 6.8 9.4 9.5 ...
0.050 2.8 4.4 4.2 ...
0.075 3.3 7.4 6.1 ...
...
Another challenge is that there are several thousand files, so I would need an automated procedure to read the files one after another, or alternatively a way to select all the files in a folder to concatenate. Unfortunately they are not conveniently named, so I cannot, for example, construct the file names in a for loop. Any help is very much appreciated.
  1 Comment
dpb 2013-11-11
Is the number of header lines the same in each file?
As for obtaining all the files in a subdirectory, use
d=dir('*.txt');
and then iterate over d.name
This should be basically trivial if the headerlines are consistent; a little bit of a pain otherwise.
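As a minimal sketch of that suggestion (the folder variable `dataDir` is a hypothetical placeholder, not something from the original post), listing the files and iterating over the names could look like this:

```matlab
% Hypothetical folder; replace with the directory holding the measurement files.
dataDir = pwd;

% List every .txt file in the folder (the names need not follow any pattern).
d = dir(fullfile(dataDir, '*.txt'));

for k = 1:numel(d)
    % Build the full path so the loop works regardless of the current folder.
    thisFile = fullfile(dataDir, d(k).name);
    fprintf('%d: %s\n', k, thisFile);
end
```

Because `d` is a struct array, each name is reached as `d(k).name` inside the loop.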


Accepted Answer

dpb 2013-11-16
Edited: dpb 2013-11-16
So the answer is the same as originally given, then. Use something like:
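% fileList comes from dir(); nCols is the number of data columns
fmto=['%12.3f' repmat('%12.3f',1,nCols-1)];   % one row of output (see ERRATUM below)
fido=fopen(youroutputfilename,'w');
fprintf(fido,'%s\n', yourheadertext);          % new header line
for j=1:length(fileList)
  fid = fopen(fileList(j).name,'r');
  % skip the header lines; read n/a and N/A as empty (NaN)
  d=cell2mat(textscan(fid,'%f','headerlines', 6, 'treatasempty',{'n/a';'N/A'}));
  fid=fclose(fid);
  fprintf(fido,fmto,d');                       % format is recycled for each row
end
fido=fclose(fido);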
Adjust the various parameters to suit.
doc textscan % and friends
for more detail on the various options for empty values, and
doc fprintf % etc.
for detail of format strings to match your desired output formats. With a regular file format it is really pretty straightforward. The other respondent's use of save is somewhat less verbose at the cost of less control over the output format; your choice, depending on wants/needs.
ERRATUM:
Forgot the \n character for the output format...
fmto=['%12.3f' repmat('%12.3f',1,nCols-1) '\n'];
Also, if you do want the tab-delimited form retained, then the format needs the tab as well...
fmto=['%12.3f' repmat('\t%12.3f',1,nCols-1) '\n'];
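Putting the pieces together, here is a self-contained sketch rather than a definitive implementation. The scratch folder, the sample data, the file names `run1.txt`/`run2.txt`/`combined.dat`, and the counts `nHeader = 4` and `nCols = 4` are all assumptions made up for illustration; adjust them to the real files.

```matlab
% --- Create two small sample files (made-up data, illustration only) ---
workDir = tempname;            % hypothetical scratch folder
mkdir(workDir);
nHeader = 4;                   % assumed number of header lines per file
nCols   = 4;                   % assumed number of data columns

sample = {[0.025 6.8 9.4 9.5; 0.050 2.8 4.4 4.2], ...
          [0.075 3.3 7.4 6.1]};
for k = 1:numel(sample)
    fid = fopen(fullfile(workDir, sprintf('run%d.txt', k)), 'w');
    fprintf(fid, 'Name of the file\nDate\nOther useless data\n');
    fprintf(fid, 'Column1 [unit]\tColumn2 [unit]\tColumn3 [unit]\tColumn4 [unit]\n');
    fprintf(fid, '%.3f\t%.1f\t%.1f\t%.1f\n', sample{k}');   % transpose: write row by row
    fclose(fid);
end

% --- Concatenate the data, dropping the headers ---
fileList = dir(fullfile(workDir, '*.txt'));
outName  = fullfile(workDir, 'combined.dat');   % not *.txt, so dir() above never lists it

fmtIn  = repmat('%f', 1, nCols);                        % one %f per column
fmtOut = ['%.3f' repmat('\t%.3f', 1, nCols-1) '\n'];    % tab-delimited output row

fido = fopen(outName, 'w');
fprintf(fido, 'MyHeader1\tMyHeader2\tMyHeader3\tMyHeader4\n');
for j = 1:numel(fileList)
    fid = fopen(fullfile(workDir, fileList(j).name), 'r');
    c = textscan(fid, fmtIn, 'HeaderLines', nHeader, 'Delimiter', '\t', ...
                 'TreatAsEmpty', {'n/a', 'N/A'});
    fclose(fid);
    data = [c{:}];                  % nRows-by-nCols numeric matrix
    fprintf(fido, fmtOut, data');   % transpose so each row is written in order
end
fclose(fido);
```

Reading one `%f` per column keeps the rows intact, so a missing value becomes a NaN in its own column instead of shifting the values that follow. The resulting file has a single header line and should then be readable with tdfread.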

More Answers (3)

G A 2013-11-12
You can use this algorithm:
fid1=fopen('fileName1','w'); % open output file to write headers
fprintf(fid1,formatSpec,H1,H2,Hn); % write headers into the file
fclose(fid1); % close it so that save can append to it
fid2=fopen('fileName2'); % open file with the data
A = fscanf(fid2, '%f'); % read numerical data only from the file
fclose(fid2);
save('fileName1','A','-ascii','-tabs','-append'); % save takes a file name, not a file identifier
  2 Comments
dpb 2013-11-12
fid2=fopen('fileName2');%open file with the data
A = fscanf(fid2, '%f');%read from file numerical data only
The above will fail for these files with the header lines: fscanf with '%f' stops at the first text it cannot convert, so nothing is read.
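A small sketch of the difference, using a made-up two-column file with two header lines (the file contents and the header count are assumptions for illustration):

```matlab
% Write a tiny sample file: two header lines, then two rows of numbers.
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, 'Header line one\nHeader line two\n0.025\t6.8\n0.050\t2.8\n');
fclose(fid);

% Reading straight away: '%f' cannot match the header text, so A is empty.
fid = fopen(fname, 'r');
A = fscanf(fid, '%f');
fclose(fid);

% Discarding the header lines first lets the same kind of call succeed.
fid = fopen(fname, 'r');
for k = 1:2
    fgetl(fid);                      % read and throw away one header line
end
B = fscanf(fid, '%f', [2 Inf])';     % 2 values per row, transposed to rows-by-columns
fclose(fid);
```

Here `B` holds the numeric rows, while `A` stays empty because the scan stopped at the first header character.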
László Arany 2013-11-12
Also, I forgot to mention that the files are measurement results and each column contains data from a different sensor. When some of the sensors were damaged, unreachable or switched off, their columns contain n/a, N/A or NaN (etc.). Therefore I probably cannot read them as '%f'; I was trying to read them as strings.
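Reading as strings may not be necessary. As a sketch on made-up data (the three-column layout and single header line are assumptions), textscan's 'TreatAsEmpty' option lets '%f' fields accept such markers and return NaN for them:

```matlab
% Sample file with missing-sensor markers in different spellings.
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, 'Column1\tColumn2\tColumn3\n');
fprintf(fid, '0.025\t6.8\tn/a\n0.050\tN/A\t4.4\n0.075\tNaN\t7.4\n');
fclose(fid);

fid = fopen(fname, 'r');
% 'n/a' and 'N/A' are treated as empty numeric fields, which default to NaN;
% a literal NaN in the file is understood by '%f' without any extra option.
c = textscan(fid, '%f%f%f', 'HeaderLines', 1, 'Delimiter', '\t', ...
             'TreatAsEmpty', {'n/a', 'N/A'});
fclose(fid);
data = [c{:}];      % 3-by-3 numeric matrix with NaN where a sensor was missing
```

Any other marker that appears in the real files would need to be added to the 'TreatAsEmpty' list.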



László Arany 2013-11-12
In the meantime I managed to sort out the last part of the question. This is a simple way to reach all the files in a folder and then open them in a loop:
fileList = dir(PathToFolder);
fileList = fileList(~[fileList.isdir]); % remove folders inside target folder
L = length(fileList);
for j=1:L
  fid = fopen(fullfile(PathToFolder,fileList(j).name),'r'); % full path, so it works from any current folder
  ... % operations using the file
  fclose(fid);
end
  2 Comments
László Arany 2013-11-16
Hi dpb,
The number of lines to remove is the same for all files. Sorry, I did not know that comments and answers are separate, and I did not see your comment; that is why I wrote this. I added the new info to the question.
dpb 2013-11-16
OK...I'll go back and delete the previous one, and then you can clean up the unnecessary comments, leaving only a clean response in the database for somebody else's later use, perhaps. At least that's the hope in Answers; how different it really is from a conventional newsgroup in that regard, I have my doubts...



Alex Z. 2017-6-15
This can be done in EasyMorph using the Append transformation. The tool is free.
