Finding peaks in a very noisy signal

43 次查看(过去 30 天)
Hi, I have a large set of noisy signals, see examples below.
I want to calculate the mean frequency of the pulsations, and to do that I have to first numerically detect the peaks. I have an algorithm, but it relies heavily on empirical choice of parameters for the findpeaks function, and typically it takes me 3-4 tries before I get them right for a certain dataset.
Can someone point me in the right direction on how to make my algorithm more adaptive, or how to approach this differently? My code pasted below. Code in .mlx and test files added as an attachment.
filename = uigetfile("Test*.mat");
if ~(filename), return, end
load(filename,"dt","signal")
time = dt*(1:length(signal));
halfM = 0.5*mean(signal);
% First find maxima separated by a multiple of time step, of a minimum height of 1% of the max value in the whole dataset
[pks, locs] = findpeaks(signal, time,"MinPeakDistance",5*dt,"MinPeakProminence",0.01*max(signal));
% Then find minima to make sure closely spaced, sharp peaks are differentiated in the next step
[min_pks, min_locs] = findpeaks(-signal, time,"MinPeakDistance",5*dt,"MinPeakProminence",0.01*max(signal));
% Combine minimum and maximum points
pks = [pks -min_pks]; locs = [locs min_locs];
% Sort the points for further processing
[locs, idx] = sort(locs); pks = pks(idx);
% Custom values for minimum separation and maximum peak width
minSep = 5e-9; maxPW = 1e6;
% Plot the results with peak prominences and widths marked
figure
findpeaks(pks, locs, "MinPeakProminence", halfM, "MinPeakHeight", 2*halfM, "MinPeakDistance", minSep, "MaxPeakWidth", maxPW,"Annotate","extents");
% Plot the original signal with all peaks and valleys marked in small orange dots,
% and major peaks marked in large yellow dots
figure, plot(time, signal), hold on, scatter(locs, pks, Marker=".",LineWidth=1)
[pks,locs] = findpeaks(pks, locs, "MinPeakProminence", halfM, "MinPeakHeight", 2*halfM, "MinPeakDistance", minSep, "MaxPeakWidth", maxPW);
scatter(locs, pks, Marker="o", MarkerFaceColor="flat"), hold off
xlabel 'Time [\mus]', ylabel 'Signal power [a.u.]'

采纳的回答

Star Strider
Star Strider 2024-9-17
Your signals have broadband noiise, so a frequency-selective filter is not going to apply here. The Savitzky-Golay filter (or wavelet denoising) are best suited for signals with broadband noise, so I miplemented it here. Change the ‘framelen’ value to give the best result. (I chose a 3-order polynomiial, since that generally works best imn my experience, however you can experiiment with that as well.)
Try this —
files = dir('*.mat');
for k = 1:numel(files)
LD = load(files(k).name);
Fs(k) = 1/LD.dt;
s{k} = LD.signal;
L = numel(s{k});
t{k} = linspace(0, L-1, L)/Fs(k);
end
figure
tiledlayout(numel(files),1)
for k = 1:numel(files)
nexttile
plot(t{k}, s{k})
grid
xlabel('Time')
ylabel('Amplitude')
title(string(files(k).name))
end
sgtitle('Original')
figure
tiledlayout(numel(files),1)
for k = 1:numel(files)
nexttile
[FTs1,Fv] = FFT1(s{k},t{k});
plot(Fv, abs(FTs1)*2)
grid
xlabel('Frequency')
ylabel('Magnitude')
xlim([0 5E+8])
title(string(files(k).name))
end
sgtitle('Fourier Transforms')
figure
tiledlayout(numel(files),1)
for k = 1:numel(files)
nexttile
framelen = 1+5E+3;
s_filt = sgolayfilt(double(s{k}), 3, framelen);
plot(t{k}, s_filt)
grid
xlabel('Time')
ylabel('Amplitude')
title(string(files(k).name))
end
sgtitle('Savitzky-Golay Filtered Signal')
function [FTs1,Fv] = FFT1(s,t)
% Arguments:
% s: Signal Vector Or Matrix
% t: Associated Time Vector
t = t(:);
L = numel(t);
if size(s,2) == L
s = s.';
end
Fs = 1/mean(diff(t));
Fn = Fs/2;
NFFT = 2^nextpow2(L);
FTs = fft((s - mean(s)) .* hann(L).*ones(1,size(s,2)), NFFT)/sum(hann(L));
Fv = Fs*(0:(NFFT/2))/NFFT;
% Fv = linspace(0, 1, NFFT/2+1)*Fn;
Iv = 1:numel(Fv);
Fv = Fv(:);
FTs1 = FTs(Iv,:);
end
.
  9 个评论
Vitek Stepien
Vitek Stepien 2024-9-19
编辑:Vitek Stepien 2024-9-19
Star Strider, it was a pleasure working with you. I did some more tweaking with findpeaks, sorting the pulses, trying to pick only the "right ones" but I don't seem to make a good progress. This is for my work, but that particular piece of analysis was more interesting to me personally than important in the professional sense, so I'm putting it on hold and moving on for now. Thanks for sharing your knowledge, I definitely learned something new! If I ever pick it up and make a significant progress, I'll update this thread.
P.S. This data is all generated with a numerical model, so there is no instrumentation impact at all. All of the broadband noise comes from the dynamics of the modeled process itself.
Star Strider
Star Strider 2024-9-19
Thank you!
As always, my pleasure!
I iintend to continue working on my independent components analysis function, since I believe it can be helpful here, and in other appications. When I get it up and running successfully, I will update that analysis here as well.
I wish you well in your research.
If I can be of any further help, please let me know.

请先登录,再进行评论。

更多回答(1 个)

Vitek Stepien
Vitek Stepien 2024-9-18
Let me explain my work a bit, but first comments on the previous answers:
  • I often have very noisy signals to analyze, and I use a bunch of different filtering techniques, but I wasn't familiar with the S-G filter. It works really well, I'm very happy with the quality-preserving smoothing I get, so thank you for introducing me to it!
  • I was vaguely familiar with EMD but didn't use it for a long time. I don't think it helps me a lot here.
  • I use wavelet transform for other aspects of my work, but I'm not sure if it would here. I believe it produces qualitatively similar results to the EMD, but in a more continuous format.
Now to what I'm doing: I'm analyzing the output of a nonlinear proces called stimulated Brillouin scattering, which is an optical phenomenon that manifest itself in a generation of an optical wave that, in the context of my work, propagates in the system in the direction opposite to the original signal. This generated wave is seeded by noise and amplified in a nonlinear fashion, producing not-quite-random signal that is full of noise, with more or less sporadic pulses in power. This power vs time is the data in the four test files. In the figure below, I marked three such pulses with arrows.
The problem is that these pulsations are not quite periodic, and the signal is not really zero at other times - it fluctuates all the time. The distance between the first two peaks is about 50 nanoseconds, and between the next two is about 65 nanoseconds, so they are pretty close. Visually inspecting a whole bunch of datasets I see that there is a regularity to the pulsations, and so I wanted to estimate the regularity. See below a sample histogram for one of the datasets:
As you can see, I can kind of calculate the mean time between consecutive pulsations, and inverse that to get an estimate of mean pulsation frequency. But I'm not happy with how the pulsations are detected, because depending on the algorithm I use, some of the are overlooked, others are counted multiple times because they have a more complex structure, etc. See example of my old algorithm's output below:
Yellow dots mark the pulsations detected by the algorithm. Now "eyeballing", I would say there are only two large pulses here, marked with the red arrows. Obviously, many more were detected, including the really small one on the right side, which skews the analysis. I'm trying to improve the algorithm so that it doesn't count the high-frequency spikes, or the tiny peaks on the sides.

产品


版本

R2022b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by