Faster alternate to all() function

Question

Abinesh G 2024-8-22

Direct link to this question:

https://ww2.mathworks.cn/matlabcentral/answers/2147179-faster-alternate-to-all-function

Commented: Abinesh G 2024-8-24

I am running a simulation with hundreds of thousands of loop iterations, and I notice that inside each iteration a call to all() consumes almost 95% of the time. I want a faster alternative to this.

So here is my algorithm:

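varname=rand([24665846,4]);   % semicolon suppresses printing the full matrix
% idxkeep is an index vector defined elsewhere (not shown here)
for i=1:1000000
    idx=idxkeep(i);
    ind1=all(varname(:,1:4)==varname(idx,1:4),2);   % rows equal to row idx in columns 1:4
    ind=find(ind1);
end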


Answer 1

Stephen23 2024-8-22

Direct link to this answer:

https://ww2.mathworks.cn/matlabcentral/answers/2147179-faster-alternate-to-all-function#answer_1503594

N = 10000;
varname = rand(246658,4)
varname = 246658x4
    0.4250    0.9945    0.8854    0.7153
    0.3372    0.4636    0.6927    0.0979
    0.8001    0.5691    0.8808    0.8928
    0.2207    0.6044    0.5163    0.3355
    0.4153    0.7160    0.5742    0.4115
    0.3663    0.2421    0.6953    0.7484
    0.8841    0.4988    0.9958    0.3616
    0.6892    0.5110    0.4978    0.6139
    0.2841    0.1725    0.5703    0.7684
    0.9457    0.2001    0.3747    0.8266
idxkeep = randi(size(varname,1),1,N);
tic
for i=1:N
    idx = idxkeep(i);
    idy = all(varname(:,1:4)==varname(idx,1:4),2);
    idz = find(idy);
end
toc
Elapsed time is 15.020756 seconds.
tic
for i=1:N
    idx = idxkeep(i);
    idy = ...
        varname(:,1)==varname(idx,1) & ...
        varname(:,2)==varname(idx,2) & ...
        varname(:,3)==varname(idx,3) & ...
        varname(:,4)==varname(idx,4);
    idz = find(idy);
end
toc
Elapsed time is 2.358080 seconds.
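
As a quick sanity check, here is a minimal sketch showing that the column-wise & form selects the same rows as the original all(...,2) form. The small matrix A and the chosen idx are assumptions made up for illustration; they are not the asker's data.

```matlab
% Small illustrative matrix with deliberate duplicate rows (assumed data).
A = [1 2 3 4; 5 6 7 8; 1 2 3 4; 1 2 3 9];
idx = 1;                          % row to match against

% Original form: builds an N-by-4 logical array, then reduces along each row.
maskAll = all(A(:,1:4) == A(idx,1:4), 2);

% Column-wise form: four N-by-1 comparisons combined with &.
maskAnd = A(:,1) == A(idx,1) & ...
          A(:,2) == A(idx,2) & ...
          A(:,3) == A(idx,3) & ...
          A(:,4) == A(idx,4);

rowsAll = find(maskAll);
rowsAnd = find(maskAnd);
disp(isequal(rowsAll, rowsAnd))   % both forms should pick the same rows
```

Row 4 differs from row 1 only in the last column, so it is excluded by both forms.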


Stephen23 2024-8-22

@Abinesh G: if my answer works for you then please remember to click the accept button!

Abinesh G 2024-8-24

sure. Thanks


Answer 2

Steven Lord 2024-8-22

Direct link to this answer:

https://ww2.mathworks.cn/matlabcentral/answers/2147179-faster-alternate-to-all-function#answer_1503659

Are you sure the longest time is spent in all? Reducing the size of the varname variable and the number of iterations a bit (so it doesn't time out in MATLAB Answers) and looking at the times for the == operation and the all call:

varname=rand([246658,4]);
n = 10000;
timingData = zeros(n, 2);
% You never defined the variable idxkeep, so defining it here using random data
idxkeep = randi(size(varname,1),1,n);
for i=1:n
    idx=idxkeep(i);
    tic
    x = varname(:,1:4)==varname(idx,1:4);
    timingData(i, 1) = toc;
    tic
    ind1=all(x,2);
    timingData(i, 2) = toc;
    ind=find(ind1);
end
format longg
seconds(sum(timingData, 1))
ans = 1x2 duration array
          12.990059 sec   3.09445199999994 sec

So it looks like the all call does not take the majority of the time. Most of the time is spent creating the logical array (named x in the modified example above.)

Now my choice of idxkeep is somewhat arbitrary. I'm guessing you have an alternate purpose for idxkeep. If you tell us in words not code what the purpose of this whole block of code is, we may be able to offer a more performant solution (perhaps one that avoids creating the large logical array.)
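
One possible way to avoid creating the large logical array on every iteration is to group identical rows once, before the loop, and then look up the group inside the loop. The sketch below is only an illustration under assumptions: the matrix A and the query vector idxkeep are made-up stand-ins, the data is assumed to contain no NaN values, and the cell array of groups may be memory-hungry at the asker's real data size.

```matlab
% Illustrative data with duplicate rows (assumed; stands in for varname).
A = [1 2 3 4; 5 6 7 8; 1 2 3 4; 5 6 7 8; 9 9 9 9; 1 2 3 4];
idxkeep = [1 2 5];                % hypothetical query rows

% g(k) is the group number of row k; rows with identical values in
% columns 1:4 share a group number.
[~, ~, g] = unique(A(:,1:4), 'rows');

% rowsInGroup{j} lists every row index belonging to group j. The sort
% keeps the indices ascending, matching what find() would return.
rowsInGroup = accumarray(g, (1:numel(g)).', [], @(v) {sort(v)});

results = cell(1, numel(idxkeep));
for i = 1:numel(idxkeep)
    idx = idxkeep(i);
    results{i} = rowsInGroup{g(idx)};   % same rows as find(all(A == A(idx,:), 2))
end
```

The loop body then does a cell lookup instead of a full pass over the matrix, at the cost of one up-front call to unique.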


Steven Lord 2024-8-22

So you want the rows with unique entries for the first four columns? What happens if you call unique with the 'rows' and 'stable' options? Or use groupsummary or grouptransform to perform your filtering on the rows based on unique combinations of the elements in the first four columns?

Abinesh G 2024-8-22

I have tried the unique function; in fact, idxkeep is the result from unique. But in my case I want the rows with repetitions, and I want to perform the filtering only on them.

Also, groupsummary will not work in my case, since I have to apply multiple filters (for example, each row has a region ID, and I have to check whether the regions share a boundary, among other things) to decide whether to retain or delete the rows with repetitions.

So I believe I cannot avoid the loop. As for the logical array, it would be better if I could reduce the time further. So far @Stephen23's solution is the fastest; I would be happy if it could be reduced further.
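
If the loop has to stay, one option is to restrict it to the combinations that actually repeat, using the third output of unique together with accumarray. This is a rough sketch with assumed data; myFilter is a hypothetical placeholder for the region and boundary checks described above and is not defined here.

```matlab
% Illustrative data (assumed); columns 1:4 play the role of varname(:,1:4).
A = [1 2 3 4; 5 6 7 8; 1 2 3 4; 9 9 9 9; 5 6 7 8; 1 2 3 4];

% 'stable' keeps combinations in order of first appearance; g maps each
% row of A to its combination in combos.
[combos, ~, g] = unique(A(:,1:4), 'rows', 'stable');

counts = accumarray(g, 1);            % number of rows per combination
repeatedCombos = find(counts > 1);    % combinations that occur more than once
isRepeated = counts(g) > 1;           % per-row flag: row belongs to a repeated combination

for j = repeatedCombos.'
    rows = find(g == j);              % all rows sharing combination j
    % keepRow = myFilter(A(rows,:));  % hypothetical custom filter, not defined here
end
```

Here g == j compares a single numeric column rather than four, and combinations that occur only once are never visited.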


Answer 3

arushi 2024-8-22

Direct link to this answer:

https://ww2.mathworks.cn/matlabcentral/answers/2147179-faster-alternate-to-all-function#answer_1503469

Hi Abinesh,

Here’s a possible approach that uses vectorized operations and logical indexing to improve performance:

% Assuming idxkeep is a pre-defined vector of indices
% Precompute the subset of varname
varname_subset = varname(:, 1:4);
% Preallocate for results if needed
results = cell(1, 1000000);
for i = 1:1000000
    idx = idxkeep(i);
    target_row = varname_subset(idx, :);
    
    % Compare using vectorized operations
    % Use implicit expansion (broadcasting) for comparison
    ind1 = sum(varname_subset == target_row, 2) == 4;
    
    % Find the indices
    ind = find(ind1);
    
    % Store results if needed
    results{i} = ind;
end

Hope this helps.


Abinesh G 2024-8-22

Hi Arushi,

Thank you for your quick response and for writing an improved script for my query. However, this approach does not solve my problem. When I ran the profiler, I noticed that this approach takes a little more time than the original. I am exploring faster alternatives, if any.
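
For anyone who wants to compare the variants from this thread on their own machine, here is a small timing sketch using timeit. The test matrix, its size, and the seed are assumptions chosen so that it runs quickly; actual timings will depend on the data size, the hardware, and the MATLAB release.

```matlab
% Assumed test data, much smaller than the 24,665,846-row matrix in the question.
rng(0);                                   % fixed seed for repeatability
A = randi(5, 200000, 4);                  % small integer range forces duplicate rows
idx = 1;

% The three variants discussed in this thread, wrapped as function handles.
fAll = @() find(all(A(:,1:4) == A(idx,1:4), 2));
fSum = @() find(sum(A(:,1:4) == A(idx,1:4), 2) == 4);
fAnd = @() find(A(:,1) == A(idx,1) & A(:,2) == A(idx,2) & ...
                A(:,3) == A(idx,3) & A(:,4) == A(idx,4));

% timeit runs each handle several times and returns a typical run time in seconds.
tAll = timeit(fAll);
tSum = timeit(fSum);
tAnd = timeit(fAnd);
fprintf('all: %.4g s, sum: %.4g s, column-wise &: %.4g s\n', tAll, tSum, tAnd);
```

Unlike a single tic/toc pair, timeit repeats the call, which should reduce the noise from first-run overhead.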
