slow imwarp with large arrays

Nic Bac 2024-11-5,16:27
Commented: Umar on 2024-11-7, 15:12
Hello,
I have a piece of code that uses imwarp to transform images of varying size (up to 20k x 20k pixels), which works fine with input images up to around 500x500px:
tform = fitgeotform2d(input,spatial,"lwm",lwmZone);
[Imagenew, Rnew] = imwarp(refimg,Rimage,tform,"cubic",FillValues=0);
What I noticed is that the function seems to use only one core of the 72 available in my machine, even if the parallel pool has been enabled using parpool('threads').
This seems odd to me, since the imwarp help mentions that parallel options are available/supported.
What am I doing wrong? Is there a way to accelerate the execution time of imwarp with large arrays?
Thank you.

Accepted Answer

Joss Knight 2024-11-6,12:11
I think the documentation is just referring to using the GPU or the ability to process in the background using backgroundPool to free up your MATLAB to do other things.
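The background-processing option mentioned above can be sketched roughly as follows. The random image and the affine transform are hypothetical stand-ins (they are not from the question), and the sketch relies on imwarp being usable on backgroundPool as described above. Note that this only frees the MATLAB session while the warp runs; it does not make a single warp use more cores.

```matlab
% Sketch: run imwarp in the background so the MATLAB session stays responsive.
% img and tform are illustrative stand-ins, not the data from the question.
img   = rand(500, 500, "single");
tform = affinetform2d([1 0.1 5; 0.05 1 -3; 0 0 1]);

% Request one output from imwarp; name-value arguments are passed as pairs.
f = parfeval(backgroundPool, @imwarp, 1, img, tform, "cubic", "FillValues", 0);

% ... other work can happen here while the warp is computed ...

warped = fetchOutputs(f);   % blocks until the background task has finished
```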
You could look into blockproc, which can process images on a parallel pool, but it's mainly intended for filters rather than geometric transforms.
  4 Comments
Nic Bac 2024-11-7,9:42
Understood, then I indeed misinterpreted that sentence. I guess the only option I have is to split the array into smaller ones and run the code on those.
Thanks
Joss Knight 2024-11-7,10:01
I think if that option is available to you, then it means you can use blockproc to do it automatically. blockproc has automatic parallel support.
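As a rough illustration of blockproc's parallel option, here is a minimal sketch using a block-wise filter, which is the kind of operation blockproc is mainly intended for. The image, block size, border size, and the Gaussian filter are all assumptions chosen for illustration; applying the same pattern to a geometric transform needs extra care, because each output block generally depends on a different region of the input. The sketch assumes the Parallel Computing Toolbox is installed.

```matlab
% Sketch: block-wise processing on a parallel pool with blockproc.
% All sizes and the filter are illustrative choices.
img = rand(2048, 2048, "single");

% The function receives a block struct; the pixel data is in the .data field.
smoothBlock = @(blk) imgaussfilt(blk.data, 2);

% BorderSize adds overlap around each block to reduce seams; the border is
% trimmed from each result by default, so the output matches the input size.
out = blockproc(img, [512 512], smoothBlock, ...
    "BorderSize", [8 8], "UseParallel", true);
```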


More Answers (3)

Umar 2024-11-5,18:29

Hi @Nic Bac ,

As mentioned in the documentation,

https://www.mathworks.com/help/images/ref/imwarp.html

specifying the OutputView parameter can enhance performance by defining the output size and location. For large images, defining an appropriate output view can minimize unnecessary computations and speed up processing:

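   % The interpolation method is positional and must precede name-value arguments.
   % affineOutputView returns an imref2d; whether it accepts an "lwm" transform is not confirmed here.
   outputView = affineOutputView(size(refimg), tform);
   [Imagenew, Rnew] = imwarp(refimg, Rimage, tform, "cubic", ...
       OutputView=outputView, FillValues=0);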

If performance is still an issue, consider using other functions or techniques that may better utilize multi-core capabilities. For example, if you have access to a compatible GPU, leveraging GPU computing could significantly speed up your image transformations:

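     refimgGPU = gpuArray(refimg); % Transfer the image to the GPU
     % The spatial referencing object Rimage stays on the CPU and is passed as-is.
     % Note: per the follow-up in this thread, the GPU path does not support "lwm" transforms.
     [ImagenewGPU, Rnew] = imwarp(refimgGPU, Rimage, tform, "cubic", ...
         FillValues=0);
     Imagenew = gather(ImagenewGPU); % Transfer back to CPU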

For extremely large images or when dealing with multiple images, consider breaking down the image into smaller tiles or batches and processing them individually in parallel:

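   % Example of tiling approach (sketch). Assumes tiles and RimageTile are
   % cell arrays holding each tile and its imref2d spatial reference.
   numTiles = numel(tiles);
   Imagenew = cell(1, numTiles);   % preallocate outputs for parfor slicing
   Rnew = cell(1, numTiles);
   parfor i = 1:numTiles
       [Imagenew{i}, Rnew{i}] = imwarp(tiles{i}, RimageTile{i}, tform);
   end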
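One way to make the tiling idea concrete for a geometric transform is to tile the output rather than the input: every worker receives the full input image but computes only one band of the output, selected through the OutputView argument. This is a sketch under assumptions, not a method taken from this thread: the random image, the affine transform (standing in for the "lwm" one), the output grid, and the band count are all illustrative. With a process-based pool the input image is copied to every worker, which matters for memory at 20k x 20k.

```matlab
% Sketch: parallelize one warp by splitting the OUTPUT view into row bands.
% refimg, tform, and numBands are illustrative stand-ins.
refimg = rand(400, 600, "single");
Rimage = imref2d(size(refimg));                       % default referencing
tform  = affinetform2d([1 0.1 5; 0.05 1 -3; 0 0 1]);  % stand-in transform

% Full output view; chosen here to coincide with the input grid.
Rout = imref2d(size(refimg));

numBands = 4;
rowEdges = round(linspace(0, Rout.ImageSize(1), numBands + 1));
bands = cell(numBands, 1);

parfor k = 1:numBands
    nRows = rowEdges(k+1) - rowEdges(k);
    % World-coordinate limits of this band of output rows.
    yLim = Rout.YWorldLimits(1) + ...
        [rowEdges(k), rowEdges(k+1)] * Rout.PixelExtentInWorldY;
    Rband = imref2d([nRows, Rout.ImageSize(2)], Rout.XWorldLimits, yLim);
    bands{k} = imwarp(refimg, Rimage, tform, "cubic", ...
        OutputView=Rband, FillValues=0);
end

Imagenew = vertcat(bands{:});   % stitch the bands back together
```

Because each output pixel is computed from the input independently of the other output pixels, the stitched result should agree with a single imwarp call over Rout up to floating-point differences; it is worth checking this on a small image before scaling up.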

Please bear in mind that large images require significant memory resources. Ensure your system has enough RAM to handle the image sizes you are working with, and that you are using a recent version of MATLAB that supports parallel computing and GPU acceleration. Versions beyond R2021a have improved support for these capabilities. Also, use MATLAB's built-in profiler (profile on; ...; profile viewer;) to identify bottlenecks in your code and determine whether imwarp is indeed the limiting factor.

Hope this helps.

  2 Comments
Nic Bac 2024-11-6,8:47
Hello Umar, thank you for the detailed suggestions. I did try using gpu arrays, but unfortunately this option is not supported for the “lwm” case.
Perhaps I wasn’t fully clear on the principal issue I’m facing, which is the fact that imwarp uses only 1 core out of the many available, which shouldn’t be the case (at least as far as I understand it) since the MATLAB help does say that CPU acceleration is supported.
I initially thought that the issue was that the parallel pool was not enabled, but enabling it doesn’t change the fact that imwarp uses only 1 core – this is independent of the size of the array.
Umar 2024-11-7,15:12
Hi @Nic Bac,
Completely understand. In my opinion @Joss Knight provided some good suggestions.



埃博拉酱 2024-11-6,14:35
Edited: 埃博拉酱 2024-11-6, 14:44
There are two different levels of parallel acceleration in MATLAB, and you need to check if you are confusing them.
  1. Parallel pools, which rely on the Parallel Computing Toolbox, need to be explicitly specified using syntax such as parfor. Each parallel thread calculates independently and cannot share data.
  2. Automatic parallelization does not depend on the toolbox and does not need to be explicitly specified. During vectorized operations on large arrays, the main MATLAB process automatically invokes parallel computation. However, automatic parallelization is disabled when a parallel pool (1) is enabled.
This means that if you choose option 1, you have to manually split the data into independent chunks and distribute them across different worker processes, each of which can only use one CPU core. Conversely, if you want to take advantage of the automatic parallelization of option 2, you must not use any parallel pools, but compute entirely on the main process. In general, MATLAB will automatically apply parallelized calculations for you.
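To check whether a given call benefits from automatic parallelization on the main process, one option is to time it with the default thread limit and again with the limit set to one thread. The sketch below does this with maxNumCompThreads and timeit. The image and the affine transform are hypothetical stand-ins; an "lwm" transform may follow a different code path, so the measurement should be repeated with the real transform.

```matlab
% Sketch: compare imwarp timing with the default thread limit and with 1 thread.
% img and tform are illustrative stand-ins, not the data from the question.
img     = rand(1000, 1000, "single");
tform   = affinetform2d([1 0.1 0; 0.05 1 0; 0 0 1]);
warpFcn = @() imwarp(img, tform, "cubic", FillValues=0);

nDefault = maxNumCompThreads;        % current limit on computational threads
tDefault = timeit(warpFcn);

previous = maxNumCompThreads(1);     % restrict to one thread, keep old setting
tSingle  = timeit(warpFcn);
maxNumCompThreads(previous);         % restore the original limit

fprintf("threads = %d: %.3f s\n", nDefault, tDefault);
fprintf("threads = 1: %.3f s\n", tSingle);
```

If the two timings are close, the call is probably not using implicit multithreading to any significant degree for this input.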
If you ensure that the calculations are only in the main process, and MATLAB does not initiate automatic parallelization, it most likely means that the algorithm you are using cannot be parallelized. You can check out the LWM algorithm explained in the documentation. Given that you have already mentioned that your algorithm cannot be executed on a GPU, it is likely that this is because the algorithm is not logically designed for parallelization, and the parts of the image may have interdependent relationships. As far as I know, very few algorithms that can be executed in parallel cannot be accelerated by a GPU. If this is the case, the only accelerating method you have available is to process multiple different images simultaneously on different CPU cores. While each image itself uses single-core computation, you can still effectively utilize multi-core CPU if you have a large number of images.
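The multi-image strategy described above can be sketched with parfor, where each iteration warps one independent image. The images and the affine transform are hypothetical stand-ins, and the sketch assumes the Parallel Computing Toolbox is available (without it, parfor runs the loop serially).

```matlab
% Sketch: warp several independent images, one image per parfor iteration.
% The images and tform are illustrative stand-ins.
numImages = 6;
images = cell(1, numImages);
for k = 1:numImages
    images{k} = rand(200, 300, "single");
end
tform = affinetform2d([1 0.1 0; 0.05 1 0; 0 0 1]);

warped = cell(1, numImages);
parfor k = 1:numImages
    % Each call is single-image work; parallelism comes from the loop itself.
    warped{k} = imwarp(images{k}, tform, "cubic", FillValues=0);
end
```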

Matt J 2024-11-5,20:17
20k x 20k is an incredibly high resolution. Do you really need it, and if so, do you really need to use cubic interpolation, as opposed to computationally simpler linear interpolation? Also, what data type are these images? Are they integer type (uint8,uint16, ...) or floats?
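The cost of the interpolation method and data type can be measured directly before committing to one. The following is a minimal sketch with a hypothetical test image and affine transform (not the data from the question); actual timings depend on the machine and on the transform type.

```matlab
% Sketch: time imwarp for different interpolation methods and data types.
% The image and tform are illustrative stand-ins.
imgSingle = rand(1500, 1500, "single");
imgDouble = double(imgSingle);
tform     = affinetform2d([1 0.1 0; 0.05 1 0; 0 0 1]);

tCubicSingle  = timeit(@() imwarp(imgSingle, tform, "cubic"));
tLinearSingle = timeit(@() imwarp(imgSingle, tform, "linear"));
tCubicDouble  = timeit(@() imwarp(imgDouble, tform, "cubic"));

fprintf("cubic,  single: %.3f s\n", tCubicSingle);
fprintf("linear, single: %.3f s\n", tLinearSingle);
fprintf("cubic,  double: %.3f s\n", tCubicDouble);
```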
  1 Comment
Nic Bac 2024-11-6,8:48
Hello Matt J. Eventually I will need to scale to that size and use cubic interpolation; the images are either single or double. The size of the array is not my main problem, though. The main problem is that imwarp uses only 1 core out of the many available, and this is true even for small arrays.
In my case GPU acceleration is not possible, nevertheless the machine does have plenty of cores (72) and RAM (512GB) to do the job, but so far I have been unsuccessful in making imwarp use more cores efficiently.
