how to avoid for loop to increase speed
2 次查看(过去 30 天)
显示 更早的评论
I would like to calculate the mean value of vector A for every sample as defined in vector B. So, the first point of the resulting vector C is the mean of the first 9 (=B(1)) datapoints of A. The second point of C is the mean of the following 10 (=B(2) datapoints of A. Etc. The following code works, but takes time when processing large vectors:
A=rand(91,1); % vector with 91 random samples
B=[9,10,10,8,11,10,10,9,10,6]; % vector with the number of samples
C= zeros(length(B),1); % preallocate C
first=1;
for i = 1: length(C)
last = first+B(i)-1;
interval=(first:last);
C(i) = mean(A(interval));
first =last+1;
end
Is there a way to use B in an index of A, instead of using this for loop?
2 个评论
Matt J
2023-1-4
Shouldn't sum(B) equal length(A)?
A=rand(91,1); % vector with 91 random samples
B=[9,10,10,8,11,10,10,9,10,6]; % vector with the number of samples
sum(B)
length(A)
采纳的回答
Matt J
2023-1-4
编辑:Matt J
2023-1-5
A=rand(91,1); % vector with 91 random samples
B=[9,10,10,8,11,10,10,9,10,6]; % vector with the number of samples
G=repelem(1:numel(B), B);
n=min(numel(A),numel(G));
G=G(:); A=A(:);
C=accumarray(G(1:n),A(1:n))./accumarray(G(1:n),1); %NOTE: faster than accumarray(G,A,[],@mean)
4 个评论
Matt J
2023-1-5
编辑:Matt J
2023-1-5
Better to avoid two ACCUMARRAY() calls and specify the function instead:
That would be quite a bit slower, unfortunately. I very deliberately avoided it for that reason:
A=rand(900000,1); % vector with 91 random samples
B=ones(1,90000)*10; % vector with the number of samples
assert(sum(B)==numel(A))
G=repelem(1:numel(B), B);
n=min(numel(A),numel(G));
G=G(:); A=A(:);
tic;
C=accumarray(G(1:n),A(1:n))./accumarray(G(1:n),1);
toc
tic;
C=accumarray(G(1:n),A(1:n),[],@mean); % simpler and faster
toc
Stephen23
2023-1-5
"That would be quite a bit slower, unfortunately. I very deliberately avoided it for that reason:"
Aah, that is a shame. Perhaps a code comment would help to make that choice clear.
更多回答(2 个)
Mathieu NOE
2023-1-4
hello
had to change A to 93 samples so that it matches with sum(B) = 93
see my suggestion below. On this small data we can see an iùprovement of factor 4
Elapsed time is 0.004598 seconds (your code)
Elapsed time is 0.001204 seconds. (my code)
delta = 1.0e-14 *
0
-0.0222
-0.0111
-0.0333
-0.0555
0.1110
0.0777
-0.0222
-0.0444
-0.0222
wonder if that is going to be even better for larger vectors ?
A=rand(93,1); % vector with 93 random samples
B=[9,10,10,8,11,10,10,9,10,6]; % vector with the number of samples
C= zeros(length(B),1); % preallocate C
first=1;
tic
for i = 1: length(C)
last = first+B(i)-1;
interval=(first:last);
C(i) = mean(A(interval));
first =last+1;
end
toc
% alternative code
tic
As = cumsum(A(:));
Bs = cumsum(B(:));
As = As(Bs);
Cs = [As(1); diff(As)]./B(:);
toc
delta = C - Cs
plot(C,'-*b')
hold on
plot(Cs,'dr')
hold off
7 个评论
Matt J
2023-1-6
编辑:Matt J
2023-1-6
@Stephen23 already showed you earlier that you can use accumarray to apply any function to the blocks.
A = randi(5,1,9300); % vector with random integers ranging from 1 to 5
B=ones(1,930)*10; % vector the number of samples
tic
C= zeros(length(B),1); % preallocate C
first=1;
for i = 1: length(C)
last = first+B(i)-1;
interval=(first:last);
C(i) = mode(A(interval));
first =last+1;
end
toc
tic
G=repelem((1:numel(B)),B);
C=accumarray(G(:),A(:),[],@mode);
toc
It is to be expected that speed-up is more modest, unfortunately. Accumarray isn't as well optimized for arbitrary functions.
Matt J
2023-1-6
编辑:Matt J
2023-1-6
And instead of the mean, I am interested in the mode.
This method should offer speed-up for a generic function, provided that it can ignore NaNs and provided the blocks don't vary too greatly in length:
B = randi([5,10],1,9300);
A = randi(5,1,sum(B));
discrepancy = max( abs(loopMethod(A,B)-altMethod(A,B)),[],'all')
timeit(@() loopMethod(A,B))
timeit(@() altMethod(A,B))
function C=loopMethod(A,B)
C= zeros(1,length(B)); % preallocate C
first=1;
for i = 1: length(C)
last = first+B(i)-1;
interval=(first:last);
C(i) = mode(A(interval));
first =last+1;
end
end
function C=altMethod(A,B)
bmax=max(B);
I=(1:bmax)'<=B;
T=nan(size(I));
T(I)=A(:);
C=mode(T,1);
end
另请参阅
类别
在 Help Center 和 File Exchange 中查找有关 Matrix Indexing 的更多信息
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!