Outlier Detection and Removal [hampel]
HAMPEL(X,Y,DX,T,varargin) returns the Hampel-filtered values of the
elements in Y. The filter was developed to detect outliers in a time
series, but it can also be used as an alternative to the standard median
filter.
References
Chapters 1.4.2, 3.2.2 and 4.3.4 in Mining Imperfect Data: Dealing with
Contamination and Incomplete Records by Ronald K. Pearson.
Acknowledgements
I would like to thank Ronald K. Pearson for the introduction to moving
window filters. Please visit his blog at:
http://exploringdatablog.blogspot.com/2012/01/moving-window-filters-and-pracma.html
X,Y are row or column vectors with an equal number of elements.
The elements in Y should be Gaussian distributed.
Input DX,T,varargin must not contain NaN values!
DX,T are optional scalar values.
DX is a scalar which defines the half width of the filter window.
DX must be positive (DX > 0) and should have the same units as the
values in X.
T is a scalar which defines the threshold value used in the equation
|Y - Y0| > T*S0, where Y0 is the nominal (local median) value and S0 is
the local scale estimate.
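The threshold test can be sketched as follows. This is a minimal illustration of the textbook Hampel identifier, not the code in hampel.m: it assumes that Y0 is the median of Y inside the window |X - X(i)| <= DX and that S0 is 1.4826 times the median absolute deviation (MAD) in that window, the usual Gaussian-consistent scale estimate. The mask name inWin and the sample data are made up for the example.

```matlab
% Minimal sketch of the Hampel decision rule |Y - Y0| > T*S0.
% Assumptions (not taken from hampel.m): Y0 is the window median and
% S0 = 1.4826*MAD, the usual Gaussian-consistent scale estimate.
X  = (1:9)';                          % pseudo time
Y  = [1 2 3 4 100 6 7 8 9]';          % one obvious outlier at index 5
DX = 3;                               % window half width, same units as X
T  = 3;                               % threshold

n  = numel(Y);
Y0 = zeros(n, 1);                     % nominal data (local median)
S0 = zeros(n, 1);                     % local scale estimate
for i = 1:n
    inWin = abs(X - X(i)) <= DX;      % hypothetical name: window mask
    Y0(i) = median(Y(inWin));
    S0(i) = 1.4826 * median(abs(Y(inWin) - Y0(i)));
end

I     = abs(Y - Y0) > T * S0;         % logical index of flagged points
YY    = Y;
YY(I) = Y0(I);                        % replace flagged points by the median
```

With these values only the fifth element is flagged and replaced by its window median. Setting T = 0 would replace every point that differs from its local median, which is why the filter can stand in for a median filter.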
Standard Parameters for DX and T:
DX = 3*median(X(2:end)-X(1:end-1));
T = 3;
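As a quick check of the default half width, the sketch below evaluates the expression for an unevenly sampled X. The sample values are made up, and diff(X) is used as a shorthand for X(2:end)-X(1:end-1).

```matlab
% Default window half width: three times the median sample spacing.
X  = [0 1 2 4 5 6 10];              % made-up, unevenly sampled pseudo time
DX = 3 * median(diff(X));           % diff(X) equals X(2:end)-X(1:end-1)
T  = 3;                             % default threshold
```

Because the median spacing is used, a few large gaps in X do not widen the default window.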
varargin covers additional optional input. The optional input must be in
the form of 'PropertyName', PropertyValue.
Supported PropertyNames:
'standard': Use the standard Hampel filter.
'adaptive': Use an experimental adaptive Hampel filter. Explained under
Revision 1 details below.
Supported PropertyValues: Scalar value which defines the tolerance of
the adaptive filter. In the case of standard Hampel filter this value
is ignored.
Output YY,I,Y0,LB,UB,ADX are column vectors containing Hampel filtered
values of Y, a logical index of the replaced values, nominal data,
lower and upper bounds on the Hampel filter and the relative half size
of the local window, respectively.
NO is a scalar that specifies the Number of Outliers detected.
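How the outputs relate to one another can be sketched with hand-picked values. The sketch assumes, without confirmation from the help text, that the bounds are LB = Y0 - T*S0 and UB = Y0 + T*S0, so that a point is flagged exactly when it falls outside [LB, UB]. The scale estimates S0 and all numbers are illustrative only.

```matlab
% Sketch of how the outputs could relate, ASSUMING LB = Y0 - T*S0 and
% UB = Y0 + T*S0 (not confirmed by the help text).
Y  = [5.0 5.2 9.0 4.9]';            % made-up data
Y0 = [5.0 5.0 5.1 5.0]';            % made-up nominal data (local medians)
S0 = [0.2 0.2 0.2 0.2]';            % made-up local scale estimates
T  = 3;                             % threshold

LB = Y0 - T * S0;                   % lower bound
UB = Y0 + T * S0;                   % upper bound
I  = (Y < LB) | (Y > UB);           % points outside the bounds
NO = nnz(I);                        % number of outliers detected
YY = Y;
YY(I) = Y0(I);                      % outliers replaced by nominal data
```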
Examples
1. Hampel filter removal of outliers
X = 1:1000; % Pseudo Time
Y = 5000 + randn(1000, 1); % Pseudo Data
Outliers = randi(1000, 10, 1); % Index of Outliers
Y(Outliers) = Y(Outliers) + randi(1000, 10, 1); % Pseudo Outliers
[YY,I,Y0,LB,UB] = hampel(X,Y);
plot(X, Y, 'b.'); hold on; % Original Data
plot(X, YY, 'r'); % Hampel Filtered Data
plot(X, Y0, 'b--'); % Nominal Data
plot(X, LB, 'r--'); % Lower Bounds on Hampel Filter
plot(X, UB, 'r--'); % Upper Bounds on Hampel Filter
plot(X(I), Y(I), 'ks'); % Identified Outliers
2. Adaptive Hampel filter removal of outliers
DX = 1; % Window Half size
T = 3; % Threshold
Threshold = 0.1; % AdaptiveThreshold
X = 1:DX:1000; % Pseudo Time
Y = 5000 + randn(1000, 1); % Pseudo Data
Outliers = randi(1000, 10, 1); % Index of Outliers
Y(Outliers) = Y(Outliers) + randi(1000, 10, 1); % Pseudo Outliers
[YY,I,Y0,LB,UB] = hampel(X,Y,DX,T,'Adaptive',Threshold);
plot(X, Y, 'b.'); hold on; % Original Data
plot(X, YY, 'r'); % Hampel Filtered Data
plot(X, Y0, 'b--'); % Nominal Data
plot(X, LB, 'r--'); % Lower Bounds on Hampel Filter
plot(X, UB, 'r--'); % Upper Bounds on Hampel Filter
plot(X(I), Y(I), 'ks'); % Identified Outliers
3. Median Filter Based on Filter Window
DX = 3; % Filter Half Size
T = 0; % Threshold
X = 1:1000; % Pseudo Time
Y = 5000 + randn(1000, 1); % Pseudo Data
[YY,I,Y0] = hampel(X,Y,DX,T);
plot(X, Y, 'b.'); hold on; % Original Data
plot(X, Y0, 'r'); % Median Filtered Data
Cite As
Michael Lindholm Nielsen (2024). Outlier Detection and Removal [hampel] (https://www.mathworks.com/matlabcentral/fileexchange/34795-outlier-detection-and-removal-hampel), MATLAB Central File Exchange.
Platform Compatibility
Windows macOS Linux
Categories
- Signal Processing > Signal Processing Toolbox > Signal Generation and Preprocessing > Smoothing and Denoising
Acknowledgements
Inspired: Hampel filter in C++
Version | Release Notes
---|---
1.6.0.0 | (1) Corrected potential error in internal median function.
1.5.0.0 | (1) The elements in X and Y are now temporarily sorted for internal computations.
1.4.0.0 | (1) If X and Y have fewer than 2 elements, the output YY will be a copy of Y. No outliers will be detected and no error will be issued.
1.3.0.0 | (1) Changed a calculation in the adaptive Hampel filter. See details in file.
1.1.0.0 | (1) Replaced output S with lower and upper bounds on the Hampel filter.
1.0.0.0 |