How to remove jumps and bring the data down?

I need a way to deal with spikes and jumps in the data.
The spikes should be replaced with NaNs and the data after the jump should be brought down to the earlier data level.
Neither the 'hampel' function nor 'deleteoutliers' worked.
I have attached a sample data set.
Please let me know how to deal with such a data set.
Thank you.
1 comment
Greg Heath 2018-1-7
Edited: Greg Heath 2018-1-7
My favorite method of dealing with jumps is to
1. Standardize to zero-mean/unit-variance
2. Check points with first-differences greater than a threshold
3. Replace the outliers using the mean of the surrounding data points.
Hope this is helpful
Greg
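The three steps above could be sketched roughly as follows. This is only an illustration under stated assumptions, not Greg Heath's own code: the data is synthetic (not the attached spike_jump.mat), and threshold and halfWidth are hypothetical values that would need tuning. The sketch flags a point as an isolated spike only when the first difference is large on both sides of it; a large difference on one side only would instead suggest a step.

```matlab
% Synthetic example data (assumption: not the asker's spike_jump.mat).
x = sin(linspace(0, 4*pi, 200))';   % smooth column-vector signal
x(50)  = x(50)  + 8;                % isolated spike
x(120) = x(120) - 6;                % isolated spike

% 1. Standardize to zero mean and unit variance.
z = (x - mean(x)) / std(x);

% 2. Flag points whose first differences on BOTH sides exceed a threshold.
%    An isolated spike produces a large difference entering and leaving it.
threshold = 2;                      % hypothetical value, tune for your data
d = diff(z);
bigStep = abs(d) > threshold;
isSpike = false(size(x));
isSpike(2:end-1) = bigStep(1:end-1) & bigStep(2:end);

% 3. Replace each flagged point with the mean of its nearby unflagged points.
halfWidth = 3;                      % hypothetical neighborhood half-width
xClean = x;
for k = find(isSpike)'
    lo = max(1, k - halfWidth);
    hi = min(numel(x), k + halfWidth);
    neighbors = lo:hi;
    neighbors = neighbors(~isSpike(neighbors));  % exclude flagged points
    xClean(k) = mean(x(neighbors));
end
```

To get NaNs instead of replacement values, as the question asks, step 3 could simply be xClean(isSpike) = NaN.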


Answers (2)

Image Analyst 2017-12-24
You can try a modified median filter. This replaces the signal with its median-filtered value, computed over a window, at any point where the difference between the signal and the median-filtered version of the signal exceeds some threshold. Here is a demo:
clc; % Clear the command window.
close all; % Close all figures (except those of imtool.)
clear; % Erase all existing variables. Or clearvars if you want.
workspace; % Make sure the workspace panel is showing.
format short g;
format compact;
fontSize = 13;
% Read in signal.
s = load('spike_jump.mat')
spike_jump = s.spike_jump;
% Plot the signal.
subplot(4, 1, 1);
plot(spike_jump, 'b*-', 'LineWidth', 2);
grid on;
xlabel('x', 'FontSize', fontSize);
ylabel('spike_jump', 'FontSize', fontSize, 'Interpreter', 'none'); % 'none' keeps the underscore from rendering as a subscript
%------------------------------------------------------------------------------
% Set up figure properties:
% Enlarge figure to full screen.
set(gcf, 'Units', 'Normalized', 'OuterPosition', [0, 0.04, 1, 0.96]);
% Get rid of tool bar and pulldown menus that are along top of figure.
% set(gcf, 'Toolbar', 'none', 'Menu', 'none');
% Give a name to the title bar.
set(gcf, 'Name', 'Demo by ImageAnalyst', 'NumberTitle', 'Off')
% Take the median filter of the signal
windowWidth = 75;
filteredSignal = medfilt1(spike_jump, windowWidth);
% Plot the signal.
subplot(4, 1, 2);
plot(filteredSignal, 'b-', 'LineWidth', 2);
grid on;
xlabel('x', 'FontSize', fontSize);
ylabel('filteredSignal', 'FontSize', fontSize);
% Compute the absolute deviation from the median-filtered signal.
% (Named absDev rather than mad to avoid shadowing the built-in mad function.)
absDev = abs(spike_jump - filteredSignal);
% Plot the signal.
subplot(4, 1, 3);
plot(absDev, 'b-', 'LineWidth', 2);
grid on;
xlabel('x', 'FontSize', fontSize);
ylabel('Absolute Deviation from Median', 'FontSize', fontSize);
% h = histogram(absDev)
% Replace elements with absDev > 20 by the median-filtered value.
outputSignal = spike_jump; % Initialize
mask = absDev > 20;
outputSignal(mask) = filteredSignal(mask);
% Plot the signal.
subplot(4, 1, 4);
plot(outputSignal, 'b-', 'LineWidth', 2);
grid on;
xlabel('x', 'FontSize', fontSize);
ylabel('Repaired Signal', 'FontSize', fontSize);
2 comments
Venkata 2017-12-26
Edited: Venkata 2017-12-26
Thank you so much for your detailed explanation.
Is there any possibility to remove the jump and bring the later data down to their earlier data range?
Is there an approach that can be applied only to the data around the jump, if not to the entire data set?
Thank you.
Image Analyst 2017-12-26
Add these lines to the code:
% Find out indexes where the signal is more than 500.
stepIndexes = outputSignal > 500;
% Find the mean of those values.
meanStep = mean(outputSignal(stepIndexes))
% Assume the mean should be subtracted from all indexes where the value is more than 500.
outputSignal(stepIndexes) = outputSignal(stepIndexes) - meanStep;
% Plot the signal.
subplot(4, 1, 4);
plot(outputSignal, 'b-', 'LineWidth', 2);
grid on;
xlabel('x', 'FontSize', fontSize);
ylabel('Repaired Signal', 'FontSize', fontSize);
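Note that subtracting meanStep alone centers the elevated section on zero, which matches the earlier data only if the earlier data is itself centered near zero. A possible variation, sketched here on synthetic data, subtracts the difference between the two section means instead. The signal, the variable names offset and repaired, and the threshold of 500 are assumptions for illustration only.

```matlab
% Synthetic signal (assumption): baseline near 100, step up to near 700.
n = (1:300)';
signal = 100 + 5*sin(n/10);
signal(201:end) = signal(201:end) + 600;

% Same style of thresholding as above; 500 is specific to this example.
stepIndexes = signal > 500;

% Offset = mean of the elevated section minus mean of the baseline section.
offset = mean(signal(stepIndexes)) - mean(signal(~stepIndexes));

% Shift only the elevated section down by that offset.
repaired = signal;
repaired(stepIndexes) = repaired(stepIndexes) - offset;
```

After the shift, both sections have the same mean, so the later data sits at the earlier data level rather than at zero.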



Greg Dionne 2017-12-29
I tried this with hampel and findchangepts. Seemed to work okay.
% load data attached at top of question
load spike_jump
% get a crude estimate
yest = hampel(spike_jump,40,.5);
plot(yest)
% get breakpoint of shelf
idx = findchangepts(yest);
% correct baseline of left and right portions
% The transition takes a few samples so we will guard by 10 samples
yleft = spike_jump(1:idx) - mean(yest(1:idx-10));
yright = spike_jump(idx+1:end) - mean(yest(idx+10:end));
y = [yleft; yright];
% re-filter and plot
yf = hampel(y,120,.2);
plot(yf);
7 comments
Greg Dionne 2018-1-8
Hopefully the error signal introduced in the instrumentation will be separable enough from the field you are trying to measure. I think Image Analyst's sgolayfilt approach to first remove the trend is worth a try. Let us know how it goes.
If you have a recent version of MATLAB (R2017a or later), you can also try filloutliers, which implements part of Greg Heath's approach as well.
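As a minimal sketch of that suggestion, assuming R2017a or later: isoutlier can flag spikes against a moving median so they can be set to NaN (as the question asks), and filloutliers can replace them directly. The data here is synthetic, and the window length of 15 is a hypothetical value that would need tuning for the real signal. This handles spikes only; the jump still needs a separate baseline correction.

```matlab
% Synthetic data (assumption): slow ramp with two injected spikes.
x = linspace(0, 10, 200)';
x(60)  = x(60)  + 50;
x(150) = x(150) - 40;

window = 15;   % hypothetical moving-window length, tune for your data

% Option 1: flag outliers against a moving median, then blank them with NaN.
spikeMask = isoutlier(x, 'movmedian', window);
xNaN = x;
xNaN(spikeMask) = NaN;

% Option 2: replace outliers by linear interpolation between good neighbors.
xFilled = filloutliers(x, 'linear', 'movmedian', window);
```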
Greg Heath 2018-1-9
Edited: Greg Heath 2018-1-9
I agree with Image Analyst that the sliding median is more stable than the sliding mean. HOWEVER, the replacement mean to which I referred is the mean computed after outlier removal.
I didn't invent the technique. This was common back in the pre-desktop-computer days when calculating a median was much more of a pain in the butt than calculating an average.
Greg ( Another wannabe stable genius !)
( WHAAT? ... OPRAH for president? ... SHEESH ! )

