Comparing a million values from a CSV file takes too much time

Hello everyone,
I am quite new to this program and need some help with the following problem. I want to compare 1 million numbers to make sure that no two adjacent values are the same (n-1 ~= n). I wrote the code below and measured it with tic/toc: the elapsed time was 40944.541765 seconds, and that is for just one CSV file. I actually want the code to run for every CSV file in the folder, but that is quite complicated, so I am focusing on getting the calculation right for one CSV file first. How could I optimize this piece of code and make the calculation more accurate? Thank you.
data = csvread('data.csv',9); % Read the CSV, skipping the first 9 rows
a = zeros(1,999999); % Preallocate the result array
for i=1:999998
    t = data(i) ~= data(i+1); % true when element i differs from element i+1
    a(i) = t; % Save t in the array
    v=sum(a(:)==0); % Count the zeros (false entries) in the array
end
csvwrite('count.csv',v); % Write the count to a new CSV file

Accepted Answer

Bhaskar R 2022-9-17
I assume you want to count the nonzero differences between each value and the one that follows it.
This can be done without loops; the following may help:
tic
data = randi(100, [1, 999999]); % random data with the same length as yours
v = sum(diff(data) ~= 0);
toc
Elapsed time is 0.022935 seconds.
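The loop in the question is slow mainly because `sum(a(:)==0)` is recomputed over the whole array on every one of the 999998 iterations. As a minimal sketch (the short vector below is made-up data, not the real CSV contents), a running counter and the two `diff` variants can be compared on a small example:

```matlab
% Made-up sample standing in for the CSV column (assumption, not the real data)
data = [3 3 5 7 7 7 2 9];

% Loop version with a running counter instead of re-summing each iteration
nEqualLoop = 0;
for i = 1:numel(data)-1
    if data(i) == data(i+1)
        nEqualLoop = nEqualLoop + 1;   % adjacent pair is equal
    end
end

% Vectorized versions: diff(data) is data(2:end) - data(1:end-1)
nEqual   = sum(diff(data) == 0);   % adjacent pairs that are equal
nChanged = sum(diff(data) ~= 0);   % adjacent pairs that differ

fprintf('loop: %d, equal: %d, changed: %d\n', nEqualLoop, nEqual, nChanged);
```

`nEqual` matches the loop counter, while `nChanged` counts the opposite case, so which comparison to use depends on whether you want the repeated neighbours or the changes.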
  1 Comment
Rizky Alfi 2022-9-17
Thank you, sir. I had first done the calculation in Microsoft Excel, to make sure the MATLAB output is correct, using =a2<>a1 in column B and =COUNTIF(B1:B1000000;"false"). Your answer is insightful. I tried it, and the one adjustment I needed to make was to change
v = sum(diff(data) ~= 0);
to
v = sum(diff(data8a) == 0);
to get the same result as Microsoft Excel, because the COUNTIF counts the "false" entries, i.e. the adjacent pairs that are equal. I will accept your answer. Thank you.
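For the other part of the question, running the same count over every CSV file in a folder, one possible sketch uses `dir` and a short loop. The folder (`pwd`), the 9 skipped header rows and the assumption that the values sit in the first column are all taken from or guessed around the original snippet, so adjust them to the real files:

```matlab
folder = pwd;                              % assumption: the CSV files are in the current folder
files  = dir(fullfile(folder, '*.csv'));   % list every CSV file in that folder
counts = zeros(numel(files), 1);           % one result per file

for k = 1:numel(files)
    name = fullfile(folder, files(k).name);
    M = csvread(name, 9, 0);               % skip 9 header rows, as in the question
    x = M(:, 1);                           % assumption: values are in the first column
    counts(k) = sum(diff(x) == 0);         % adjacent pairs that are equal
    fprintf('%s: %d equal neighbours\n', files(k).name, counts(k));
end
```

The `counts` vector can then be written out with `csvwrite`, ideally to a different folder so the output file is not picked up as input on the next run.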

Release

R2016a