Separate Excel file based on one single column in MATLAB

Question

Daphne Mariaravi 2017-6-25

https://ww2.mathworks.cn/matlabcentral/answers/346205-separate-excel-file-based-one-single-column-in-matlab

Answered: Guillaume 2017-6-28

TEST.csv

I have multiple .csv files that contain time series data; a sample file is attached. Is it possible to find every 5-minute interval in the HH:MM:SS column, cut out all the data up to that time, and save it as a separate file with the header row? This should repeat until the end of the time series, which runs up to 5 hours or so. Any suggestions on how to do this?

Joshua 2017-6-26

Edited: Joshua 2017-6-28

Daphne,

For a given file test.csv, you can load it into MATLAB using the command

data=load('test.csv')

From there you should have all the data in one big array called 'data' that will show up in your workspace. Then you can set up a for or while loop to separate the 'data' array into several smaller arrays. For example,

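% Note: load only works if test.csv is purely numeric (no header row, no text columns).
data = load('test.csv');
s = size(data);
five_min = [];
count = 1;
for i = 1:s(1)
    % Keep the row if the value in column 1 is a multiple of 5.
    if mod(data(i,1), 5) == 0
        five_min(count,:) = data(i,:);
        count = count + 1;
    end
end
fileID = fopen('filename.txt', 'w');   % open the output file for writing
% One %5d per column; the transpose makes fprintf write row by row.
fprintf(fileID, '%5d %5d %5d %5d %5d %5d\n', five_min');
fclose(fileID);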

will read in data from test.csv, copy every row whose first-column value is a multiple of 5 into the array five_min, and then write that array to the file 'filename.txt'. The mod function decides whether or not a number is a multiple of 5. Also, in the fprintf call, note that if you have x columns, you need x '%5d' specifiers (each one prints a value as an integer in a field at least 5 characters wide). This code is not perfect, but hopefully you get the gist of it. Let me know if you have any questions.
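To make the mod test and the format-string rule concrete, here is a small sketch on made-up numbers. The matrix below is hypothetical (column 1 is assumed to hold elapsed minutes), and it uses logical indexing instead of the loop, which is a substitution on my part rather than what the code above does. sprintf is used only so the result can be inspected; it follows the same format rules as fprintf.

```matlab
% Hypothetical numeric data: column 1 is elapsed minutes, columns 2-3 are readings.
data = [ 3 10 11;
         5 20 21;
         7 30 31;
        10 40 41];

% Logical indexing keeps every row whose first column is a multiple of 5.
isFiveMin = mod(data(:,1), 5) == 0;
five_min  = data(isFiveMin, :);

% Build one '%5d' per column so the format always matches the column count.
fmt = [repmat('%5d ', 1, size(five_min, 2) - 1), '%5d\n'];

% The transpose makes the values come out row by row.
txt = sprintf(fmt, five_min');
disp(txt);
```

Building the format with repmat avoids having to count the '%5d' specifiers by hand when the number of columns changes.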

Rik 2017-6-27

doc fopen will give you an idea of what 'first.txt' means and where you need to put them in your code.

Joshua 2017-6-28

Daphne,

I apologize, as I formatted my response incorrectly at first. I fixed the post so the code is all in the correct order. Also, first.txt was just the name of a random file, but in retrospect that name does not make any sense. I changed it to filename.txt, where you can put anything for 'filename'. Also, 'w' tells fopen to open the file for writing (creating it if needed and discarding any existing contents), as opposed to the default read-only access.

Answer 1

Guillaume 2017-6-28

https://ww2.mathworks.cn/matlabcentral/answers/346205-separate-excel-file-based-one-single-column-in-matlab#answer_272259

I don't know what all this conversation about fopen is for; it is probably the worst way of parsing your data. Using modern tools such as readtable makes a lot more sense.

alldata = readtable('test.csv');  %See note 2
timestamp = datetime(alldata.HH_MM_SS, 'InputFormat', 'HH:mm.s', 'Format', 'HH:mm:ss');  %see note 1
group = discretize(timestamp, minutes(5));
splitdata = splitapply(@(rows) {alldata(rows, :)}, (1:height(alldata))', group);
for fileidx = 1:numel(splitdata)
   writetable(splitdata{fileidx}, sprintf('split%02d.csv', fileidx));  %see note 3
end

Note 1: Your header implies that the column format is HH:MM:SS, yet the data in the column is of the form XX:YY.z, so it's really not clear if the format is actually hours:minutes.seconds or minutes:seconds.fractionofseconds. I assumed the first in the above. Adjust the 'InputFormat' if necessary.
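To illustrate the ambiguity in Note 1, the sketch below parses one made-up value both ways. The string '12:34.5' is a hypothetical sample, not taken from the attached file, and 'mm:ss.S' is my guess at the format string for the second interpretation.

```matlab
raw = '12:34.5';   % hypothetical sample value in the XX:YY.z form

% Interpretation 1: hours:minutes.seconds
asHourMin = datetime(raw, 'InputFormat', 'HH:mm.s', 'Format', 'HH:mm:ss');

% Interpretation 2: minutes:seconds.fractionofseconds
asMinSec = datetime(raw, 'InputFormat', 'mm:ss.S', 'Format', 'HH:mm:ss.S');

% timeofday strips the date part (which defaults to today) for comparison.
disp(timeofday(asHourMin));
disp(timeofday(asMinSec));
```

The two results differ by a factor of 60 in elapsed time, so the 5-minute bins land in completely different places depending on which format is right.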

Note 2: You can specify column format in the readtable call to directly read the HH:MM:SS column as datetime. I've not bothered here.
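One possible way to do what Note 2 describes is through import options. This is only a sketch: it assumes a release that has detectImportOptions (R2016b or later, as far as I know), it writes a tiny hypothetical CSV so it can run on its own, and it refers to the time column by position because the name readtable assigns to an 'HH:MM:SS' header can differ between releases.

```matlab
% Write a tiny hypothetical CSV so the sketch is self-contained.
csvfile = [tempname, '.csv'];
fid = fopen(csvfile, 'w');
fprintf(fid, 'HH:MM:SS,Value\n00:01.0,1.5\n00:06.0,2.5\n');
fclose(fid);

% Detect the layout, then force the first column to be read as datetime.
opts = detectImportOptions(csvfile);
timecol = opts.VariableNames{1};   % whatever name was given to the first column
opts = setvartype(opts, timecol, 'datetime');
opts = setvaropts(opts, timecol, 'InputFormat', 'HH:mm.s');   % same assumption as Note 1

alldata = readtable(csvfile, opts);
delete(csvfile);   % clean up the temporary file
```

With this, the separate datetime conversion line is no longer needed, since the first column of alldata already holds datetime values.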

Note 3: readtable will convert your header into valid variable names, slightly altering your headers. These slightly altered headers are what will be saved in the split files. If the original headers are absolutely required, it can be done with a slightly more complex for loop, but relying on the undocumented fact that the table VariableDescriptions property holds the original names of the columns:

columnnames = regexp(alldata.Properties.VariableDescriptions, '(?<='')[^'']+(?='')', 'match', 'once');
notmodified = cellfun(@isempty, columnnames);
columnnames(notmodified) = alldata.Properties.VariableNames(notmodified);
for fileidx = 1:numel(splitdata)
   xlswrite(sprintf('split%02d.csv', fileidx), [columnnames; table2cell(splitdata{fileidx})]);
end

As said, the fact that the original column names are saved in a property is not documented, so this may only work in some versions (tested with R2017a).
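For anyone who wants to see what the discretize and splitapply steps do without the attached file, here is a minimal sketch on an in-memory table. The timestamps and values are made up. The findgroups call is an addition of mine: splitapply expects group numbers that run from 1 to N with no gaps, so if some 5-minute bin contains no rows, renumbering the bins first avoids an error.

```matlab
% Hypothetical timestamps with a gap: nothing falls between minute 10 and 15.
t = datetime(2017, 6, 25, 0, [0 2 4 6 8 16]', 0);
value = (1:6)';
T = table(t, value);

% Assign each timestamp to a 5-minute bin.
bins = discretize(T.t, minutes(5));

% Renumber the occupied bins as consecutive group numbers 1..N.
group = findgroups(bins);

% One cell per group, each holding the matching rows of the table.
parts = splitapply(@(rows) {T(rows, :)}, (1:height(T))', group);

for k = 1:numel(parts)
    fprintf('part %d has %d rows\n', k, height(parts{k}));
end
```

Each cell of parts can then be passed to writetable exactly as in the loop of the answer.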
