Multi-batch csv processing error
1 次查看(过去 30 天)
显示 更早的评论
I have a series of csv files where each file containts data that I want to extract located at data(1, [1 3]) (please see attached file). The data will then be exported to a one csv file named test.csv with there conrresonsding label (filename) at the first column, as seen below.
What I got so far is this code.
clear all; clc
path = '/Applications/trial';
list = dir(fullfile(path, '*.csv'));
for i = 1:numel(list)
F = fullfile(path, list(i).name);
data = readtable (F);
T = data(1, [1 3]);
energy = table2array(T);
dlmwrite('test.csv',energy,'-append')
end
However, when I tried exporting, it produces a series of number that is contained in each cell.
0 个评论
采纳的回答
Jorg Woehl
2021-3-16
编辑:Jorg Woehl
2021-3-16
Hi Denxybel, the main problem lies in the readtable statement. Althought the documentation states that "by default, readtable creates variables that have data types that are appropriate for the data values detected in each column of the input file", this does not work here:
data = readtable('DoseAtDemoSphereR1Q1C1.csv')
data =
1×14 table
x_ Results xFor scorer DoseAtDemoSphereR1Q1C1 Var6 Var7 Var8 Var9 Var10 Var11 Var12 Var13 Var14
________________________ __________________________ ________ __________________________ __________________________ _________________________ __________________________ _________ ____ _____ _____ _____ _____ _____
{'0.02753477916121483,'} {'2.753477962114824e-10,'} {'170,'} {'2.866090017161038e-05,'} {'2.866090045821939e-13,'} {'5.35358762496883e-07,'} {'4.827696467064431e-07,'} 0.0036571 NaN NaN NaN NaN NaN NaN
As you can see, most variables are read in as character vectors, especially the first and third column that you are interested in. Also, the comma has not been properly recognized as the delimiter in the csv file; instead, it appears as part of the data character vectors.
You will therefore have to provide more details to readtable so that it can parse the input file correctly. There are multiple ways for doing this, for example:
data = readtable('DoseAtDemoSphereR1Q1C1.csv', 'Format', '%f', 'Delimiter', ',')
data =
8×1 table
Var1
__________
0.027535
2.7535e-10
170
2.8661e-05
2.8661e-13
5.3536e-07
4.8277e-07
0.0036571
In addition (although not strictly necessary here), you could supply details about the header comments by adding 'CommentStyle','#' to the list of arguments.
Your output is now a 8-by-1 table, so to reduce it to a 2-by-1 table containing only the first and third value you would use
data([1 3],1)
Instead of converting your table to an array and then use dlmwrite (which is no longer recommended), you can simply write the table directly to a file using writetable. I would also suggest to first collect all data inside the loop and write them to a file all at once after the loop; this should be much more efficient than appending data to a file inside the loop.
Your final code could then look something like this:
path = '/Applications/trial';
list = dir(fullfile(path, '*.csv'));
% preallocate table and define variables
T = table('Size', [numel(list) 3],...
'VariableTypes', {'string','double','double'},...
'VariableNames', {'Name','Energy','Number of Particles'});
for i = 1:numel(list)
F = fullfile(path, list(i).name);
data = readtable(F, 'Format', '%f', 'Delimiter', ',');
T{i,1} = string(F);
T{i,2} = data{1,1};
T{i,3} = data{3,1};
end
% write data to file
writetable(T, 'test.csv');
更多回答(0 个)
另请参阅
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!