Can this for-loop code get faster in some way?

I have a large result table named resTbl.
For each row I need to take a timestamp (POSIX time) and construct a period interval vector (I use a constant, ts_length, to construct it).
Then I need to find all values in period_interval that occur in column 2 of the variable data, and sum the corresponding values in column 7 of data.
The code below works as I want it to:
for i = 1 : size(resTbl, 1)
    period_interval = resTbl(i,2) : 60000 : resTbl(i,2) + ts_length;
    [hd, he] = ismember(period_interval, data(:,2));
    resTbl(i,10) = sum(data(he(hd),7));
end
The problem is that it is slow because resTbl has many rows. Does anyone have a suggestion for making it faster?

Answers (1)

Maybe the ismember function can be replaced:
dt = 60000;
period_interval = 0 : dt : ts_length;
n = length(period_interval);
for i = 1 : size(resTbl, 1)
    cond = ~mod( data(:,2)-resTbl(i,2), dt );  % true where the offset is a multiple of dt (60000)
    mult = (data(:,2)-resTbl(i,2))/dt;         % offset expressed in steps of dt
    ind = (0 <= mult & mult <= n) & cond;      % (0 <= multiplier <= n) and a multiple of dt
    % [hd, he] = ismember(period_interval,data(:,2));
    resTbl(i,10) = sum(data(ind,7));
end

4 Comments

Thank you very much. Unfortunately, I am not sure how much faster this is; it looks almost the same.
What about this?
dt = 60000;
period_interval = 0 : dt : ts_length;
n = length(period_interval);
[D,R] = ndgrid(data(:,2), resTbl(:,2));   % 2D matrices: data timestamps x resTbl timestamps
cond = ~mod( D-R, dt );                   % true where the offset is a multiple of dt (60000)
mult = (D-R)/dt;                          % multiplier: offset expressed in steps of dt
ind = (0 <= mult & mult <= n) & cond;     % (0 <= multiplier <= n) and a multiple of dt
ndata = ind .* repmat( data(:,7), [1 size(resTbl,1)] );  % keep only the selected values
resTbl(:,10) = sum(ndata)';               % column sums, one per resTbl row
A pretty brilliant solution, I really have to admit!
It is faster, but unfortunately it is still fairly time consuming and VERY heavy on memory because of the matrices!
With my current memory, 32 GB, I cannot run the full resTbl set (3400 rows) against the dataset, which has about 2 million rows and 7 columns.
I tried with a smaller resTbl and data set, and for some reason I had to change n to n = length(period_interval)-1; to match my own results.
I am not even sure whether a better solution to this problem than yours exists. I would gladly hear from you again, but otherwise, thank you very much!
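The change to n mentioned above can be checked on a toy example. A likely explanation is that period_interval has n points, so the valid multipliers run from 0 to n-1, and an upper bound of n lets one extra timestamp through. The sketch below is only an illustration: the data values and the start timestamp t0 are made up, and it assumes the timestamps in data(:,2) are unique (ismember only returns the first match, whereas a logical mask would sum every duplicate).

```matlab
% Toy data (hypothetical values, chosen only for illustration)
dt = 60000;
ts_length = 3 * dt;                    % interval covers offsets 0, dt, 2*dt, 3*dt
data = zeros(6, 7);
data(:, 2) = (0:5)' * dt;              % unique timestamps, one per step of dt
data(:, 7) = [1; 2; 4; 8; 16; 32];     % values to sum
t0 = 0;                                % start timestamp of a single resTbl row

% Reference: the original ismember-based sum
period_interval = t0 : dt : t0 + ts_length;
[hd, he] = ismember(period_interval, data(:, 2));
ref = sum(data(he(hd), 7));

% Mask-based sum with the two candidate upper bounds
n = length(0 : dt : ts_length);        % number of points in the interval
mult = (data(:, 2) - t0) / dt;         % offset expressed in steps of dt
onGrid = ~mod(data(:, 2) - t0, dt);    % true where the offset is a multiple of dt
sumN      = sum(data(0 <= mult & mult <= n     & onGrid, 7));
sumNminus = sum(data(0 <= mult & mult <= n - 1 & onGrid, 7));
```

On this toy input, sumNminus agrees with ref, while sumN also picks up the timestamp one step past the end of the interval.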
2 million rows and 7 columns
Maybe time is the price to pay in this case.
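One possible middle ground between the row-by-row loop and the full 2D matrices is to process resTbl in blocks. A single double matrix of 2 million by 3400 elements needs about 54 GB, which explains why 32 GB is not enough, but a block of a few dozen resTbl rows stays well below that. The following is a minimal sketch, not a tested solution for the real data: blockSize is a hypothetical tuning parameter, the toy inputs are invented, it relies on implicit expansion (R2016b or later), and it assumes unique timestamps in data(:,2). The upper bound is written as d <= ts_length, which on multiples of dt is the same as a multiplier of at most n-1.

```matlab
% Toy inputs (hypothetical; replace with the real data and resTbl)
dt = 60000;
ts_length = 2 * dt;
data = zeros(5, 7);
data(:, 2) = [0; 1; 2; 3; 4] * dt;     % timestamps
data(:, 7) = [1; 2; 4; 8; 16];         % values to sum
resTbl = zeros(3, 10);
resTbl(:, 2) = [0; 1; 3] * dt;         % interval start per row

blockSize = 2;                         % resTbl rows handled per block (assumed knob)
t = data(:, 2);
v = data(:, 7);

for first = 1 : blockSize : size(resTbl, 1)
    rows = first : min(first + blockSize - 1, size(resTbl, 1));
    d = t - resTbl(rows, 2)';          % offsets, numel(t)-by-numel(rows)
    % inside the interval and on the dt grid
    ind = d >= 0 & d <= ts_length & mod(d, dt) == 0;
    % masked column sums, written as a matrix-vector product
    resTbl(rows, 10) = double(ind)' * v;
end
```

Peak memory is then governed by numel(t) * blockSize rather than numel(t) * size(resTbl,1), so blockSize can be raised until memory becomes the limit.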


Asked: 2019-10-5

Commented: 2019-10-6
