Histogram of signal gap widths

Michal 2021-5-25
Edited: Michal 2021-5-26
I am looking for an efficient, vectorized algorithm to compute a histogram of gap (NaN) widths, defined as follows:
  1. signals are represented by an (Nsamples x Nsig) array
  2. gaps in a signal are encoded as NaNs
  3. gap width: the number of consecutive NaNs in the signal
  4. gap width histogram: the frequency of gaps of each width in a signal
The result must satisfy the following conditions:
[Nsamples,Nsig ]= size(signals)
isequal(size(signals),size(gapwidthhist)) % true
isequal(sum(gapwidthhist.*(1:Nsamples)',1),sum(isnan(signals),1)) % true
A compressed form of gapwidthhist (represented by two cell arrays, "gapwidthhist_compressed_widths" and "gapwidthhist_compressed_freqs") is required as well.
Example:
signals = [1.1 NaN NaN NaN -1.4 NaN 8.3 NaN NaN NaN NaN 1.5 NaN NaN; % signal No. 1
NaN 2.2 NaN 4.9 NaN 8.2 NaN NaN NaN NaN NaN 2.4 NaN NaN]' % signal No. 2
gapwidthhist = [1 1 1 1 0 0 0 0 0 0 0 0 0 0; % gap histogram for signal No. 1
3 1 0 0 1 0 0 0 0 0 0 0 0 0]' % gap histogram for signal No. 2
where the integer histogram bins (gap widths) are 1:Nsamples (here Nsamples = 14).
The corresponding compressed gap histogram looks like this:
gapwidthhist_compressed_widths = cell(1,Nsig)
gapwidthhist_compressed_widths =
1×2 cell array
{[1 2 3 4]} {[1 2 5]}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
gapwidthhist_compressed_freqs = cell(1, Nsig)
gapwidthhist_compressed_freqs =
1×2 cell array
{[1 1 1 1]} {[3 1 1]}
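For illustration, the compressed form can be derived from the full histogram one column at a time. This is only a minimal sketch using base MATLAB (find); the loop variable k and the temporary w are names introduced here, not part of the required interface:

```matlab
% Full gap width histogram from the example above (Nsamples = 14, Nsig = 2)
gapwidthhist = [1 1 1 1 0 0 0 0 0 0 0 0 0 0;    % signal No. 1
                3 1 0 0 1 0 0 0 0 0 0 0 0 0]';  % signal No. 2
Nsig = size(gapwidthhist, 2);

gapwidthhist_compressed_widths = cell(1, Nsig);
gapwidthhist_compressed_freqs  = cell(1, Nsig);
for k = 1:Nsig
    w = find(gapwidthhist(:, k));                 % widths that occur at least once
    gapwidthhist_compressed_widths{k} = w.';      % row vector of widths
    gapwidthhist_compressed_freqs{k}  = gapwidthhist(w, k).';  % matching counts
end
```

The loop runs over signals only (Nsig iterations), so its cost should be small compared with building the histogram itself.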
Typical problem dimensions:
Nsamples = 1e5 to 1e6
Nsig = 1e2 to 1e3
Thanks in advance for any help.

Answers (2)

Image Analyst 2021-5-25
If you have the Image Processing Toolbox and can use regionprops() to count the number and length of NaN regions, you can do this:
signals = [1.1 NaN NaN NaN -1.4 NaN 8.3 NaN NaN NaN NaN 1.5 NaN NaN; % signal No. 1
NaN 2.2 NaN 4.9 NaN 8.2 NaN NaN NaN NaN NaN 2.4 NaN NaN]' % signal No. 2
[numData, numSignals] = size(signals)
gapwidthhist = zeros(ceil(numData/2), numSignals);
for column = 1 : numSignals
    thisSignal = signals(:, column); % Extract this column.
    % Find lengths of all NaN runs
    props = regionprops(isnan(thisSignal), 'Area');
    allLengths = [props.Area];
    hc = histcounts(allLengths)
    % Load up gapwidthhist
    for k2 = 1 : length(hc)
        gapwidthhist(k2, column) = hc(k2);
    end
end
% Should be
% gapwidthhist = [1 1 1 1 0 0 0 0 0 0 0 0 0 0; % gap histogram for signal No. 1
% 3 1 0 0 1 0 0 0 0 0 0 0 0 0]' % gap histogram for signal No. 2
% What it is:
gapwidthhist
  4 Comments
Image Analyst 2021-5-25
Michal:
You're right. Try this:
signals = [1.1 NaN NaN NaN -1.4 NaN 8.3 NaN NaN NaN NaN 1.5 NaN NaN; % signal No. 1
NaN 2.2 NaN 4.9 NaN 8.2 NaN NaN NaN NaN NaN 2.4 NaN NaN; % signal No. 2
1 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN]' % signal No. 3
[numData, numSignals] = size(signals)
gapwidthhist = zeros(numData, numSignals);
for column = 1 : numSignals
    thisSignal = signals(:, column); % Extract this column.
    % Find lengths of all NaN runs
    props = regionprops(isnan(thisSignal), 'Area');
    allLengths = [props.Area]
    % One bin per integer width; the last bin is closed by inf
    edges = [1:max(allLengths), inf]
    hc = histcounts(allLengths, edges)
    % Load up gapwidthhist
    for k2 = 1 : length(hc)
        gapwidthhist(k2, column) = hc(k2);
    end
end
% What it is:
gapwidthhist'
Michal 2021-5-25
Well done, thanks! Your code is pretty fast even for large problems.
Still, I am looking for pure MATLAB code without any toolbox functions, because the end user has only base MATLAB.
There is no way to extract the source code of the core functionality, because the function "regionprops" calls some
internal built-in functions.

Michal 2021-5-26
Edited: Michal 2021-5-26
This is a much simpler MATLAB implementation, but it is still not optimal (and not vectorized):
signals = [1.1 NaN NaN NaN -1.4 NaN 8.3 NaN NaN NaN NaN 1.5 NaN NaN; % signal No. 1
NaN 2.2 NaN 4.9 NaN 8.2 NaN NaN NaN NaN NaN 2.4 NaN NaN; % signal No. 2
1 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN]'; % signal No. 3
signals
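[numData, numSignals] = size(signals);
gapwidthhist = zeros(numData, numSignals);
gaps = zeros(numData+1, numSignals); % gap width, stored at the sample that ends each gap
auxnan = isnan(signals);
for i = 1:numSignals
    c = 0; % length of the current run of NaNs
    for j = 1:numData
        if auxnan(j,i)
            c = c + 1;
        else
            gaps(j,i) = c; % run ended (c is 0 if there was no run)
            c = 0;
        end
    end
    gaps(numData+1,i) = c; % run that reaches the end of the signal
    % Zeros lie below the first bin edge and are not counted
    gapwidthhist(:,i) = histcounts(gaps(:,i), 1:numData+1);
end
gapwidthhist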
Any ideas on how to optimize (vectorize) this code to make it more efficient?
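One possible direction, offered as a sketch rather than a tuned solution: the run lengths can be computed for all columns at once with diff, find and accumarray, which are all part of base MATLAB. The names padded, d, startRow, startCol, endRow and gapWidths are introduced here for illustration, and the int8 cast is an assumption made only to keep the temporary arrays small:

```matlab
signals = [1.1 NaN NaN NaN -1.4 NaN 8.3 NaN NaN NaN NaN 1.5 NaN NaN;  % signal No. 1
           NaN 2.2 NaN 4.9 NaN 8.2 NaN NaN NaN NaN NaN 2.4 NaN NaN;   % signal No. 2
           1 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN]';   % signal No. 3
[numData, numSignals] = size(signals);

% Pad every column with a non-gap sample at both ends, so that each run of
% NaNs has exactly one start and one end inside the padded array.
padded = [zeros(1, numSignals, 'int8'); ...
          int8(isnan(signals)); ...
          zeros(1, numSignals, 'int8')];
d = diff(padded, 1, 1);   % +1 where a gap starts, -1 just after it ends

% find() scans column by column, so the k-th start and the k-th end
% always belong to the same gap.
[startRow, startCol] = find(d == 1);
[endRow, ~] = find(d == -1);
gapWidths = endRow - startRow;   % number of consecutive NaNs in each gap

% Count how often each (width, signal) pair occurs.
gapwidthhist = accumarray([gapWidths, startCol], 1, [numData, numSignals]);
```

For the problem sizes mentioned in the question, a full numData-by-numSignals array of doubles can get large. Since sparse adds up values at repeated indices, sparse(gapWidths, startCol, 1, numData, numSignals) should give the same counts in sparse storage; I have not benchmarked either variant at that scale.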

Release

R2021a
