Indexing a range of a 3d matrix incredibly slow?

I have a function that performs several different operations between a source and a destination matrix (both 3D), generally inserting a small source into a target region of a large destination. I need to run this function thousands to tens of thousands of times, so speed matters. However, profiling shows that virtually all of the runtime is spent on the line that performs the final operation, which involves a lot of range indexing in 3D. Example:
dest(dx,dy,dz) = source(sx,sy,sz) + dest(dx,dy,dz);
where all of the d* and s* variables are ranges. That line alone takes ~99% of the runtime per call (~6 seconds for 100 test runs). Is this sort of indexing operation inherently slow? I find the slow speed odd, especially since binarizing each of those same 3D regions is ~30 times faster.
Is there a huge inefficiency I'm not seeing, or some trick to make this process faster?
EDIT: the rest of the code, where the ranges are generated
[d1, d2, d3]=size(dest);
[s1, s2, s3]=size(source);
%dest
dx=max(1,coord(1)):min(d1,coord(1)+s1-1);
dy=max(1,coord(2)):min(d2,coord(2)+s2-1);
dz=max(1,coord(3)):min(d3,coord(3)+s3-1);
%source
sx=max(-coord(1)+2,1):min(d1-coord(1)+1,s1);
sy=max(-coord(2)+2,1):min(d2-coord(2)+1,s2);
sz=max(-coord(3)+2,1):min(d3-coord(3)+1,s3);
Note that the ranges are generated to find the overlap between the two arrays, given some location coord relative to the destination array, so that the indexed region lies fully inside both arrays and out-of-bounds problems are avoided.
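As a worked example of the overlap calculation described above, here is a minimal sketch with made-up array sizes and a made-up coord (none of these values come from the real data). The source hangs off the low edge of dest in the first dimension and off the high edge in the other two, so each pair of ranges is clipped but still has matching lengths.

```matlab
% Hypothetical test data: a small source that only partly overlaps dest.
dest   = zeros(6, 6, 4);
source = ones(3, 3, 2);
coord  = [0, 5, 4];   % where source(1,1,1) would land in dest coordinates

[d1, d2, d3] = size(dest);
[s1, s2, s3] = size(source);

% Overlap expressed in dest coordinates
dx = max(1, coord(1)):min(d1, coord(1)+s1-1);     % 1:2
dy = max(1, coord(2)):min(d2, coord(2)+s2-1);     % 5:6
dz = max(1, coord(3)):min(d3, coord(3)+s3-1);     % 4:4

% The same overlap expressed in source coordinates
sx = max(-coord(1)+2, 1):min(d1-coord(1)+1, s1);  % 2:3
sy = max(-coord(2)+2, 1):min(d2-coord(2)+1, s2);  % 1:2
sz = max(-coord(3)+2, 1):min(d3-coord(3)+1, s3);  % 1:1

% Each dest range must have the same length as its source range
assert(numel(dx) == numel(sx) && numel(dy) == numel(sy) && numel(dz) == numel(sz));

% The accumulation from the question, applied to the clipped overlap only
dest(dx, dy, dz) = source(sx, sy, sz) + dest(dx, dy, dz);
```

Only the 2-by-2-by-1 overlap is touched, so exactly four elements of dest become 1 and the rest stay 0.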
  3 Comments
KSSV 2022-6-13
Show us your code snippet, how you are indexing.
Carson Purnell 2022-6-13
Binarizing those volumes, as in
dbin = imbinarize(dest(dx,dy,dz)); sbin = imbinarize(source(sx,sy,sz));
uses the same ranges, and that line runs dramatically faster than the slow sum operation.


Answers (1)

Jan 2022-6-13
If you post the relevant part of the code and some matching input data, it is very likely that this forum can suggest some improvements. With the details given so far, the best answer I know is:
If the memory access is the bottleneck of your code, it is the bottleneck of your code.
dest(dx,dy,dz) = source(sx,sy,sz) + dest(dx,dy,dz)
This line first creates a copy of source(sx,sy,sz), then a copy of dest(dx,dy,dz), adds them, and inserts the result into dest. Indexing with vectors requires a range check for each element.
Are the ranges sx, sy, ... contiguous? Then this would be much faster: source(sx(1):sx(end), sy(1):sy(end), sz(1):sz(end)) etc.
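To illustrate the contiguity question, here is a minimal sketch with a small made-up array. The helper isContig is a hypothetical name introduced only for this example. It shows that, for contiguous ranges, indexing with first:last selects exactly the same elements as indexing with the stored vectors. It says nothing about speed, which has to be measured on the real data.

```matlab
% Hypothetical small array and ranges, for illustration only.
source = reshape(1:24, 4, 3, 2);
sx = 2:4;
sy = 1:2;
sz = 2:2;

% isContig is a hypothetical helper: a range is contiguous if it is
% non-empty and equal to first:last with unit step.
isContig = @(v) ~isempty(v) && isequal(v, v(1):v(end));
assert(isContig(sx) && isContig(sy) && isContig(sz));

% Indexing with the stored vectors ...
a = source(sx, sy, sz);
% ... and with colon expressions built from the end points
b = source(sx(1):sx(end), sy(1):sy(end), sz(1):sz(end));

assert(isequal(a, b));   % same elements, same shape
```

Note that v(1) and v(end) fail on an empty range, so if the source can lie completely outside dest, that case needs to be handled before switching to this form.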
  3 Comments
Jan 2022-6-13
Edited: Jan 2022-6-13
Try this:
[d1, d2, d3]=size(dest);
[s1, s2, s3]=size(source);
dxi = max(1, coord(1));
dxf = min(d1, coord(1)+s1-1);
dyi = max(1, coord(2));
dyf = min(d2, coord(2)+s2-1);
dzi = max(1, coord(3));
dzf = min(d3, coord(3)+s3-1);
%source
... the same here...
dest(dxi:dxf, dyi:dyf, dzi:dzf) = ... same for source and dest
This saves the time for creating the index vectors. I do not expect this to cause a dramatic speedup.
It would be useful to have a minimal working example that includes the bottleneck of your code. It is less useful if I invent such a piece of code, because my guesses about the details might be misleading. But let me start:
dest = rand(1000, 1000, 281);
source = rand(4000, 2000, 312);
coord = [17, 81, 90];
Something like that?!?
Carson Purnell 2022-6-13
I have been using these for testing, in a for loop of 100 iterations.
src = rand(20,20,10);
dest = zeros(500,500,100);
coord = [1 2 3];
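With these test values, a minimal timing harness could look like the sketch below. It is an assumption about how the pieces fit together, not the original function: the bound names (dlo, dhi, slo, shi) and the two variants are introduced here for illustration, and the source-side bounds are taken from the range formulas in the question. It runs the accumulation 100 times with stored index vectors and 100 times with scalar bounds, then checks that both give the same result.

```matlab
src   = rand(20, 20, 10);
dest  = zeros(500, 500, 100);
coord = [1 2 3];
nRuns = 100;

dsz = [size(dest, 1), size(dest, 2), size(dest, 3)];
ssz = [size(src, 1),  size(src, 2),  size(src, 3)];

% Scalar bounds of the overlap, one entry per dimension (hypothetical names)
dlo = max(1, coord);
dhi = min(dsz, coord + ssz - 1);
slo = max(2 - coord, 1);
shi = min(dsz - coord + 1, ssz);

% Variant A: precomputed index vectors, as in the question
dx = dlo(1):dhi(1);  dy = dlo(2):dhi(2);  dz = dlo(3):dhi(3);
sx = slo(1):shi(1);  sy = slo(2):shi(2);  sz = slo(3):shi(3);
destA = dest;
tic
for k = 1:nRuns
    destA(dx, dy, dz) = src(sx, sy, sz) + destA(dx, dy, dz);
end
tA = toc;

% Variant B: colon expressions with scalar bounds written inline
destB = dest;
tic
for k = 1:nRuns
    destB(dlo(1):dhi(1), dlo(2):dhi(2), dlo(3):dhi(3)) = ...
        src(slo(1):shi(1), slo(2):shi(2), slo(3):shi(3)) + ...
        destB(dlo(1):dhi(1), dlo(2):dhi(2), dlo(3):dhi(3));
end
tB = toc;

fprintf('index vectors: %.4f s, inline ranges: %.4f s\n', tA, tB);
assert(isequal(destA, destB));   % both variants must give the same array
```

The timings depend on the machine and MATLAB release, so the printed numbers are only meaningful relative to each other. For steadier measurements, each variant could be wrapped in a function and timed with timeit.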
