ifft2 on GPU array
I am trying to compute the ifft2 of multiple matrices. The simplest code snippet is:
gAs = gpuArray.rand(999, 519, 20);
gBs = gpuArray.rand(999, 519);
ifft2(gAs .* gBs, "symmetric");
Error using gpuArray/ifft2
An invalid array was used on the GPU.
I thought that I was using all the GPU memory. I tried using single-precision GPU arrays, but that did not help. However, I then tried the following code (bigger matrix) and it worked just fine.
gAs = gpuArray.rand(1000, 519, 2);
gBs = gpuArray.rand(1000, 519);
ifft2(gAs .* gBs, "symmetric");
I know that I can also loop over the slices of gAs, and that works, but I want to get some speedup by doing it in one call to ifft2.
I wanted to understand why this is happening and if there is a way in which I can pad the matrices so that I can still get the ifft2 of the original matrices.
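For reference, the slice-by-slice fallback mentioned above might look like the sketch below. It runs on the CPU with small stand-in arrays; the names A, K, X, out and the sizes are illustrative assumptions, not the original data. The spectra are built from real data so that the 'symmetric' flag is actually valid for them.

```matlab
rng(0);                                  % reproducible stand-in data
A = rand(9, 5, 4);                       % hypothetical stack of real images
K = rand(9, 5);                          % hypothetical real kernel
X = fft2(A) .* fft2(K);                  % one conjugate-symmetric spectrum per slice

% Loop over the third dimension, one ifft2 call per 2-D slice.
out = zeros(size(X), 'like', real(X));   % real output, same underlying class as X
for k = 1:size(X, 3)
    out(:, :, k) = ifft2(X(:, :, k), 'symmetric');
end

% Single batched call, for comparison on the CPU.
batched = ifft2(X, 'symmetric');
maxDiff = max(abs(out(:) - batched(:)));
```

On the GPU the inputs would be wrapped with gpuArray first; the loop trades the single batched call for one smaller call per slice.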
For reference:
>> gpuDevice()
ans =
CUDADevice with properties:
Name: 'Tesla V100-SXM2-32GB'
Index: 1
ComputeCapability: '7.0'
SupportsDouble: 1
DriverVersion: 11.2000
ToolkitVersion: 11
MaxThreadsPerBlock: 1024
MaxShmemPerBlock: 49152
MaxThreadBlockSize: [1024 1024 64]
MaxGridSize: [2.1475e+09 65535 65535]
SIMDWidth: 32
TotalMemory: 3.4090e+10
AvailableMemory: 3.3167e+10
MultiprocessorCount: 80
ClockRateKHz: 1530000
ComputeMode: 'Default'
GPUOverlapsTransfers: 1
KernelExecutionTimeout: 0
CanMapHostMemory: 1
DeviceSupported: 1
DeviceAvailable: 1
DeviceSelected: 1
3 Comments
Walter Roberson
2022-1-3
Sorry, I would have to boot into a different operating system to test (GPU computing is not supported on my macOS installation).
Accepted Answer
Matt J
2022-1-4
Edited: Matt J
2022-1-4
I think you should probably just omit the 'symmetric' flag. On the GPU (mine at least), it doesn't seem to make a big difference in performance:
A = gpuArray.rand(512,512,512);
gputimeit(@() ifft2(A,'symmetric') ) % 0.0706 seconds
gputimeit(@() ifft2(A) ) % 0.0753 seconds
Whether this is an indication of sub-optimal software design on MathWorks' part, I'm not sure. On the CPU, the 'symmetric' flag means the software does fewer flops, but on a parallel system like the GPU, the flop count is not what determines run time.
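A minimal sketch of this suggestion, run on the CPU with small stand-in arrays (the names A, K, X and the sizes are illustrative, not taken from the question). It assumes the spectrum really is conjugate symmetric, which holds here because it is built from FFTs of real data; only in that case does real(ifft2(X)) match the 'symmetric' result. For data that is not conjugate symmetric, the two calls can return different values.

```matlab
rng(0);                                % reproducible stand-in data
A = rand(8, 6, 3);                     % hypothetical stack of real images
K = rand(8, 6);                        % hypothetical real kernel shared by all slices
X = fft2(A) .* fft2(K);                % conjugate-symmetric spectra, one per slice

withFlag    = ifft2(X, 'symmetric');   % the form that errored on the GPU for some sizes
withoutFlag = real(ifft2(X));          % no flag; discard the round-off imaginary part

% Largest elementwise difference between the two results.
maxDiff = max(abs(withFlag(:) - withoutFlag(:)));
```

On the GPU the same pattern would apply after wrapping the inputs with gpuArray; whether it avoids the error for the 999-by-519-by-20 case would need to be checked on the affected machine.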
More Answers (1)
Matt J
2022-1-3
Edited: Matt J
2022-1-3
I think it's a bug, but one solution might be,
fn=@(z,d) ifft(z,[],d,'symmetric');
out = fn( fn(gAs .* gBs,1) ,2);
2 Comments
Matt J
2022-1-4
Edited: Matt J
2022-1-4
It seems I had a conceptual error. ifft(ifft(X,[],1,'symmetric'),[],2,'symmetric') is not a valid replacement for ifft2(X,'symmetric') unless X is symmetric about both the x and y axes.
However, it does seem like a bug that only certain array sizes work for gpuArray.ifft2(). The CPU version of ifft2() doesn't have that problem.
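The conceptual error can be checked numerically on the CPU. The sketch below uses a small stand-in spectrum (names and sizes are illustrative assumptions). It also tries a variant not proposed in this thread: a full complex ifft along dimension 1 followed by a 'symmetric' ifft along dimension 2. For a spectrum with 2-D conjugate symmetry, the intermediate result is conjugate symmetric along dimension 2, so that variant should agree with ifft2. Whether it sidesteps the gpuArray error is untested here.

```matlab
rng(0);                                      % reproducible stand-in data
X = fft2(rand(8, 6));                        % 2-D conjugate-symmetric spectrum

ref = ifft2(X, 'symmetric');                 % reference result

% 'symmetric' along both dimensions: the replacement identified above as invalid.
fn  = @(z, d) ifft(z, [], d, 'symmetric');
sep = fn(fn(X, 1), 2);

% Complex ifft along dim 1, then 'symmetric' only along dim 2 (assumed variant).
mixed = ifft(ifft(X, [], 1), [], 2, 'symmetric');

errSep   = max(abs(ref(:) - sep(:)));        % expected to be far above round-off
errMixed = max(abs(ref(:) - mixed(:)));      % expected to be at round-off level
```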