how to append multiple CSV files with mix of numeric / non-numeric into one final table

1 次查看(过去 30 天)
Hi all,
I have 60 CSV files that contain one column of text and the remaining 6 columns are numeric.
The first row in all CSV files contains a header with all the variable names: var1 var2 var3 ... var7. The names of the CSV files have different lengths with this type of sequence: x.csv, xx.csv, xxx.xx.csv, xxx-x.csv, and xx.x.csv.
I would like to have a script that can append all CSV files into one and then import as a table. I am using matlab 2013b that includes the table data type.
Thanks, Jeff

采纳的回答

per isakson
per isakson 2014-3-16
编辑:per isakson 2014-3-18
Assumptions:
  • the files have the same format and are "correct" - no checking needed
  • the first character of all file names is "x"
  • the concatenated file shall have one header
  • the files may have trailing empty lines, which shall not be included in the concatenated file.
function ccsm()
sad = dir( 'x*.csv' ); % 1
% Is the order of the files important?
fid_out = fopen( 'concatenated_files.csv', 'w' ); % 2
for jj = 1 : length( sad ) % 3
fid_in = fopen( sad(jj).name, 'r' ); % 4
str = transpose( fread( fid_in, '*char' ) ); % 5
fclose( fid_in ); % 6
if not( jj == 1 ) % 7
ix1 = regexp( str, '[\r\n]++', 'once' ); % 8
str = str(ix1:end); % 9
end
ix2 = regexp( str, '[\r\n]++$', 'once' ); % 10
if not( isempty( ix2 ) ) % 11
str(ix2:end) = []; % 12
end
fwrite( fid_out, str, '*char' ); % 13
end
fclose('all'); % 14
end
where there are four csv-files in the current directory with the content
rowhead, var1, var2, var3
d1, 0.5377, 0.3188, 3.5784
d2, 1.8339, -1.3077, 2.7694
d3, -2.2588, -0.4336, -1.3499
d4, 0.8622, 0.3426, 3.0349
.
Description
A file is a row of bytes on some storage media. The program interprets those bytes.
  1. get a list of the source files, which shall be concatenated
  2. create an empty target file to put the result
  3. loop over all source files
  4. open the current source file
  5. read the file, interpret the bytes as characters, put the result in the variable, str
  6. close the current source file
  7. we want to keep the header line of the first source file and remove it for the others
  8. line breaks are indicated with either the two bytes "\r" or by the single byte "". ix1 is the start position of the first group of "" and "\r" (any number and any order).
  9. keep the bytes from the position, ix1, to the end, i.e. strip off the header line.
  10. find the starting position, ix2, of the trailing group of "" and "\r"
  11. there might not be any line breaks at the end of the file
  12. strip off the line breaks at the end
  13. write the remaining row of characters to the target file.
  14. close the target file (and others, which might be open by mistake)
Fill in the details with the help of the Matlab documentation!
  3 个评论
per isakson
per isakson 2014-3-16
编辑:per isakson 2014-3-16
The function reads all files in the current directory, which match x*.csv, and writes their content to concatenated_files.csv.
Use readtable to read concatenated_files.csv. I don't have table. However, I believe it is simple to import a csv-file to a table with readtable, Create table from file
Jeff
Jeff 2014-3-17
Thanks actually I will be able to now get it into a table, not a problem.
Can I ask for some further explanation on what you are doing in this section? I am keen on trying to learn
for jj = 1 : length( sad )
fid_in = fopen( sad(jj).name, 'r' );
str = transpose( fread( fid_in, '*char' ) );
fclose( fid_in );
if not( jj == 1 )
ix1 = regexp( str, '[\r\n]++', 'once' );
str = str(ix1:end);
end
ix2 = regexp( str, '[\r\n]++$', 'once' );
if not( isempty( ix2 ) )
str(ix2:end) = [];
end

请先登录,再进行评论。

更多回答(0 个)

类别

Help CenterFile Exchange 中查找有关 Environment and Settings 的更多信息

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by