Sorting data based on time

I have multiple CSV files, and each looks like this.
The first column is an ID and the second is the time (yyyymmddhhmmss). The date (yyyymmdd) does not change, but the hour does. Is it possible to create separate CSV (or txt) files, one per time window?
For example, for one ID, could the following files be created?
  • data rows between 00hr and 03hr
  • data rows between 03hr and 06hr
  • data rows between 06hr and 09hr, etc.
If there is no data for a given time period, it could either create no CSV file or create an empty one.
I'm still exploring this approach. Any suggestion is appreciated!

3 Comments

Can you provide a sample CSV file? I will help in suggesting a solution.
Sure.
As you can see, the sample starts with a timestamp of 2020-02-21 07:55:14, which means the first data row should belong to the CSV (or txt) file that stores data between 06hr and 09hr, and so on.
Thanks a lot! I've been racking my brain over this, but I haven't gotten where I need to be yet.
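As a small worked example of that mapping, the sketch below takes the timestamp quoted above and derives the 3-hour window it belongs to. The variable names are illustrative only and do not come from the thread.

```matlab
% Timestamp quoted in the comment above, in yyyymmddhhmmss form.
t = int64(20200221075514);
hhmmss = mod(t, 1000000);               % 75514: drops the yyyymmdd part
hh = idivide(hhmmss, int64(10000));     % 7: hour of day (idivide truncates by default)
binStart = 3 * idivide(hh, int64(3));   % 6: start hour of the 3-hour window
binEnd = binStart + 3;                  % 9: end hour of the window
fprintf('%02d:00 to %02d:00\n', binStart, binEnd);
```

Integer arithmetic is used here so that the 14-digit timestamp is handled without any rounding.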
Hi James, please check the code in the answer below.


Accepted Answer

Ameer Hamza, 2020-3-6
Edited: Ameer Hamza, 2020-3-6
You can use the readmatrix and writematrix functions to read and write CSV files. The following code partitions the data into 3-hour windows and writes one CSV file per window.
If you are using R2018b or earlier, use the commented lines instead.
data = int64(readmatrix('Sample.csv'));
% data = int64(csvread('Sample.csv')); % for R2018b and earlier
time = data(:,2);
% remove empty rows
data = data(time>0, :);
time = time(time>0);
time = mod(time, 1000000); % keep hhmmss; yyyymmdd is not important, so discard it
% Start of the first window in hhmmss format. Rows earlier than this also
% fall into the first file, so use 0 if the data can start before 06:00:00.
current_time = 60000;
while size(data, 1) > 0
    next_time = current_time + 30000; % 30000 represents 3 hours
    index = time < next_time; % remaining rows that fall before the end of this window
    partial_data = data(index, :);
    writematrix(partial_data, ...
        ['data-' num2str(current_time/10000) '-' num2str(next_time/10000) '.csv']);
    % dlmwrite(['data-' num2str(current_time/10000) '-' num2str(next_time/10000) '.csv'], ...
    %     partial_data, 'precision', '%i'); % for R2018b and earlier
    data(index,:) = []; % drop the rows that were just written
    time(index) = [];
    current_time = next_time;
end
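For the "create no CSV file" variant mentioned in the question, one possible alternative is to compute a window index per row and loop only over the windows that actually occur. This is a minimal sketch and not the code from the answer: the sample rows are made up, `outDir` is a hypothetical output folder, and the zero-padded file names are an assumed naming scheme.

```matlab
% Made-up rows [ID, yyyymmddhhmmss]; stands in for readmatrix('Sample.csv').
data = int64([101 20200221023000;
              101 20200221075514;
              101 20200221081000;
              101 20200221130500]);
outDir = tempname;                       % hypothetical output folder
mkdir(outDir);

hhmmss = mod(data(:,2), 1000000);        % drop the yyyymmdd part
hh = idivide(hhmmss, int64(10000));      % hour of day, 0..23
bin = idivide(hh, int64(3));             % 0 is 00-03hr, 1 is 03-06hr, ...

bins = unique(bin);                      % only windows that contain rows
for k = 1:numel(bins)
    rows = data(bin == bins(k), :);      % all rows in this window
    startHr = 3 * bins(k);
    name = sprintf('data-%02d-%02d.csv', startHr, startHr + 3);
    writematrix(rows, fullfile(outDir, name));
end
```

Because each row is assigned to its window directly, this version does not depend on where the first window starts, and windows without data produce no file.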

6 Comments

Thank you so much! This seems to work fine and I will work around this more.
Glad to help.
Hi,
I thought of starting a new thread, but continuing here seems like the better approach. I hope @Ameer sees this :)
First, this code has been a great help, so thank you very much.
However, when I run it on different data samples, the outcome seems to be a bit off.
The attached sample contains timestamps (column B) from 02hr to 13hr, which means I should be getting the following files.
  1. HH00123456789
  2. HH03123456789
  3. HH06123456789
  4. HH09123456789
  5. HH12123456789
However, the output files are different. I've tried several changes, but nothing has made progress so far.
(PS: Columns G and H are not similar; they carry totally different, random values. This is a dummy sample.)
Do you have any recommendations? TIA.
Edit: please ignore the file naming. I changed the following part.
writematrix(partial_data, ...
['data-' num2str(current_time/10000) '-' num2str(next_time/10000) '.csv']);
Hi James, can you recheck whether the file Sample_02.csv is correct? In the file you shared earlier, the timestamps in column 2 were in ascending order. In this file, however, the timestamps are in random order. My code assumed that the timestamps increase from top to bottom.
Even if the timestamp column is not in ascending order, my code will still generate files. It searches the entire column 2 for the timestamps in a specific interval and creates a new file. In that case, however, the order of values in the original file and in the new files will not be the same.
I have also attached the output files I created by running the original code. They contain the correct data from the original file; however, as already explained, the order is not preserved.
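If time-ordered output files are wanted regardless of the order in the input, one option is to sort the rows by the timestamp column before partitioning. The sketch below uses made-up rows and illustrative variable names; it is not part of the original code.

```matlab
% Made-up rows [ID, yyyymmddhhmmss] in random time order.
data = int64([101 20200221130500;
              101 20200221023000;
              101 20200221081000]);
sorted = sortrows(data, 2);   % ascending by column 2 (the timestamp)
disp(sorted(:,2));
```

Running the partitioning on `sorted` instead of `data` would give files whose rows are in ascending time order, at the cost of no longer matching the row order of the original file.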
Thank you, Ameer. I think that narrows the error down to something I did along the way, probably related to renaming the files. I can retrace my steps and fix it.
Thanks again, this has been a great help!
Glad to be of help.
