Multi-batch csv processing error

Question

0 个投票

trial.zip

I have a series of csv files where each file containts data that I want to extract located at data(1, [1 3]) (please see attached file). The data will then be exported to a one csv file named test.csv with there conrresonsding label (filename) at the first column, as seen below.

What I got so far is this code.

clear all; clc
path = '/Applications/trial';
list = dir(fullfile(path, '*.csv'));
for i = 1:numel(list)
    F = fullfile(path, list(i).name);
    data = readtable (F);
    T = data(1, [1 3]);
    energy = table2array(T);
    dlmwrite('test.csv',energy,'-append')
    
end

However, when I tried exporting, it produces a series of number that is contained in each cell.

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

请先登录，再进行评论。

请先登录，再回答此问题。

Follow Question

Answer 1

Jorg Woehl 2021-3-16

编辑：Jorg Woehl 2021-3-16

在 MATLAB Online 中打开

0 个投票

Hi Denxybel, the main problem lies in the readtable statement. Althought the documentation states that "by default, readtable creates variables that have data types that are appropriate for the data values detected in each column of the input file", this does not work here:

data = readtable('DoseAtDemoSphereR1Q1C1.csv')
data =
  1×14 table
               x_                        Results                xFor                scorer                DoseAtDemoSphereR1Q1C1                Var6                          Var7                 Var8       Var9    Var10    Var11    Var12    Var13    Var14
    ________________________    __________________________    ________    __________________________    __________________________    _________________________    __________________________    _________    ____    _____    _____    _____    _____    _____
    {'0.02753477916121483,'}    {'2.753477962114824e-10,'}    {'170,'}    {'2.866090017161038e-05,'}    {'2.866090045821939e-13,'}    {'5.35358762496883e-07,'}    {'4.827696467064431e-07,'}    0.0036571    NaN      NaN      NaN      NaN      NaN      NaN 

As you can see, most variables are read in as character vectors, especially the first and third column that you are interested in. Also, the comma has not been properly recognized as the delimiter in the csv file; instead, it appears as part of the data character vectors.

You will therefore have to provide more details to readtable so that it can parse the input file correctly. There are multiple ways for doing this, for example:

data = readtable('DoseAtDemoSphereR1Q1C1.csv', 'Format', '%f', 'Delimiter', ',')
data =
  8×1 table
       Var1   
    __________
      0.027535
    2.7535e-10
           170
    2.8661e-05
    2.8661e-13
    5.3536e-07
    4.8277e-07
     0.0036571

In addition (although not strictly necessary here), you could supply details about the header comments by adding 'CommentStyle','#' to the list of arguments.

Your output is now a 8-by-1 table, so to reduce it to a 2-by-1 table containing only the first and third value you would use

data([1 3],1)

Instead of converting your table to an array and then use dlmwrite (which is no longer recommended), you can simply write the table directly to a file using writetable. I would also suggest to first collect all data inside the loop and write them to a file all at once after the loop; this should be much more efficient than appending data to a file inside the loop.

Your final code could then look something like this:

path = '/Applications/trial';
list = dir(fullfile(path, '*.csv'));
% preallocate table and define variables
T = table('Size', [numel(list) 3],...
    'VariableTypes', {'string','double','double'},...
    'VariableNames', {'Name','Energy','Number of Particles'});
for i = 1:numel(list)
    F = fullfile(path, list(i).name);
    data = readtable(F, 'Format', '%f', 'Delimiter', ',');
    T{i,1} = string(F);
    T{i,2} = data{1,1};
    T{i,3} = data{3,1};
end
% write data to file
writetable(T, 'test.csv');

1 个评论
显示 -1更早的评论隐藏 -1更早的评论

Denxybel Montinola 2021-3-17

在 MATLAB Online 中打开

Thank you very much for the comprehensive explanation @Jorg Woehl! It is really helpful.

I have actually figured out how to spit the data and write it in a matrix form yesterday. This is what I got

clear all; clc
path = '/Applications/trial';
list = dir(fullfile(path, '*.csv'));
for i = 1:numel(list)
    F = fullfile(path, list(i).name);
    rawdata = readtable([F], 'Delimiter', 'comma', 'readvariable',false);
    data = table2array(rawdata(1,[1 3]));
    writematrix(data,'test.csv','WriteMode','append'); 
end

but with your help, everthing works great!

P.S. Under the name column, I think this should be this in order to write the filename not the whole path. but no biggies, it works really well!

T{i,1}=string(list(i).name);

请先登录，再进行评论。

Multi-batch csv processing error

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

采纳的回答

1 个评论
显示 -1更早的评论隐藏 -1更早的评论

更多回答（0 个）

类别

产品

版本

标签

Community Treasure Hunt

Multi-batch csv processing error

0 个评论 显示 -2更早的评论 隐藏 -2更早的评论

采纳的回答

1 个评论 显示 -1更早的评论 隐藏 -1更早的评论

更多回答（0 个）

类别

产品

版本

标签

另请参阅

Community Treasure Hunt

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

1 个评论
显示 -1更早的评论隐藏 -1更早的评论