how to concatenate multiple CSV files with mix of numeric / non-numeric into one final table

3 次查看(过去 30 天)
Hi all,
I have 60 CSV files that contain one column of text and the remaining 2 columns are numeric.
The first row in all CSV files contains 2 headers. The names of the CSV files have different lengths with this type of sequence: Test1.csv, Test2.csv, Test3.csv, Test4.csv, and so on. Below is the code that I'm using, but somehow I couldn't get rid of the 2nd header.
Code:
INfolder='C:\Users\SamanthaTzeeSan\Documents\MATLAB\Example_3';
cd(INfolder);
% get a list of the source files, which shall be concatenated
csvFiles=dir('Test*.csv');
numfiles=length(csvFiles); % total number of the source files
% create an empty target file to store the result
fid_out=fopen('concatenated_files3.csv','w');
%
for jj=1:numfiles % loop over all source files
fid_in=fopen(csvFiles(jj).name,'r'); % open the current source file
% read the file, interpret the bytes as characters, input the result in the
% variable, str
str=transpose(fread(fid_in,'*char'));
fclose(fid_in); % close the current source file
% keep the header lines of the first source file and remove it for the other
% source files
if not (jj==1)
% line breaks are indicated with either the two bytes "\r\n" or by single
% byte "\n". ix1 is the start position of the first group of "\n" and
% "\r" (any number and any order)
ix1=regexp(str,'[\r\n]++','once');
% keep the bytes from the position, ix1, to the end (strip off the
% header lines)
str=str(ix1:end);
end
%
% find the starting position, ix2, of the trailing group of "\n" and "\r"
ix2=regexp(str,'[\r\n]++$','once');
% there might not be any line breaks at the end of the file
if not (isempty(ix2))
str(ix2:end)=[]; %strip off the line breaks at the end
end
% write the remaining row of characters to the target file
fwrite(fid_out,str,'*char');
end
fclose('all'); %close the target file (and others)
Thanks, Sam
  2 个评论
Star Strider
Star Strider 2016-1-10
Can you use xlsread to read your .csv files? (You have to have Microsoft Excel installed.) If so, that would make your task significantly easier.
per isakson
per isakson 2016-1-10
  • It's much easier to help if you upload a couple of sample files.
  • Do you rather want a working solution or help to debug your code?

请先登录,再进行评论。

回答(1 个)

Walter Roberson
Walter Roberson 2016-1-11
textscan() each file with HeaderLines 2 and format '%[^,],%f,%f' and default delimiter (*not* ',')
Unless, that is, your first column of text can contain commas. In that case we need to know which escaping mechanism you are using. %q for the case of double-quote delimited text.
If you are sure the first column does not contain blanks then you can use a format of '%s%f%f' with 'Delimiter',','

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by