How to adapt pmtm for a gpuArray?

3 次查看(过去 30 天)
This would really speed up some signal processing that I am doing at work. Does anyone know how to easily adapt the function pmtm for use with a gpuArray? I know that the fastest way would be to use mex functions with cuda and not use either toolbox, but this is a bit beyond me at the moment.

采纳的回答

Daniel Harvey
Daniel Harvey 2018-5-24
I finally got around to readdressing this challenge. The following is a function that will evaluate a power spectral density using a thomson multitaper method on a 2D input matrix with signal in each column. It uses the czt method to compute the DFT, and the eigen method for estimating the average of dpss windows. It takes about 1/15 of the time as using pmtm() in a parfor loop (on my machine). This is a rather specific implementation, but hopefully it will be useful to someone.
function psd = mypmtm(xin,fs,bins_per_hz)
[m, n] = size(xin);%m=length of signal, n=# of signals
k=fs*bins_per_hz;
nfft = 2^nextpow2(m+k-1);%- Length for power-of-two fft.
[E,V] = dpss(m,4);
g=gpuDevice;
s=(m+nfft)*32*length(V);%how many bytes will be needed for ea. signal
ne=floor(g.AvailableMemory/s);%number of signals that can be processed at once with available memory
indx=[0:ne:n,n];%number of iterations that will be necessary
psd=zeros(k/2,n);%initialize output
for i=1:length(indx)-1
x=gpuArray(xin(:,1+indx(i):indx(i+1)));
w = exp(-1i .* 2 .* pi ./ k);
x=x.*permute(E,[1 3 2]); %apply dpss windows
%------- Premultiply data.
kk = ( (-m+1):max(k-1,m-1) )';
kk = (kk .^ 2) ./ 2;
ww = w .^ (kk); % <----- Chirp filter is 1./ww
nn = (0:(m-1))';
x = x .* ww(m+nn);
%------- Fast convolution via FFT.
x = fft( x, nfft );
fv = fft( 1 ./ ww(1:(k-1+m)), nfft ); % <----- Chirp filter.
x = x .* fv;
x = ifft( x );
%------- Final multiply.
x = x( m:(m+k-1), : , :) .* ww( m:(m+k-1) );
x = abs(x).^2;
x=x.*permute(V,[2 3 1])/length(V);%'eigen' method of estimating average
x=sum(x,3);
x=x(2:end/2+1,:)./fs;
psd(:,1+indx(i):indx(i+1))=gather(x);
end
end
  1 个评论
ForTex
ForTex 2019-9-4
Thank you for this!
Testing this out made me realize, that since the dpss function is the most computationally intensive here or in native pmtm, I wonder if that could be sped up with a gpu. If calculating a bunch of multitaper PSDs with the same length m, you could precalculate the dpss, and change the code to use E as input rather than calculating it every time you call the function.
I do however wonder, why there is a discrepancy of about 1/pi between mypmtm and pmtm?

请先登录,再进行评论。

更多回答(0 个)

类别

Help CenterFile Exchange 中查找有关 Fourier Analysis and Filtering 的更多信息

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by