Problem getting isoutlier to detect anything
I have what looks like a very easy problem but I cannot seem to solve it. I have a dataset (attached) that has some obvious (to the human eye) outliers.
I cannot get isoutlier to detect them in any way. My attempt is essentially this:
idx = isoutlier(x(:,2),'movmedian',w);
I have put the code in a for loop spanning all possible values of w. I get at most 3 outliers detected, when the window size is 3, and the points detected are not actually outliers.
Using movmean instead of movmedian detects no outliers for any value of w. I have also played with the threshold factor, but without luck. This seemed to me like a straightforward application of outlier detection. What am I missing?
Accepted Answer
Akira Agata
2018-8-10
One possible way to detect this type of outlier would be like this:
load('out1.mat');
% Assume the data follows a 2nd-order polynomial trend
p = polyfit(x(:,1),x(:,2),2);
y = polyval(p,x(:,1));
% Detect outliers in the detrended data
idx = isoutlier(x(:,2)-y);
% Show the result
plot(x(:,1),x(:,2))
hold on
plot(x(idx,1),x(idx,2),'ro')
legend({'Original data','Detected outlier'},'FontSize',14)
3 Comments
Chris Turnes
2018-8-14
You can also generalize this approach a little if your data doesn't fit a polynomial globally but does over large local regions: replace the polyfit portion with a call to smoothdata using the 'loess', 'lowess', or 'sgolay' method. For your data, you can get similar results with:
% Local weighted quadratic fit on a window of span 0.3
y = smoothdata(x(:,2), 'loess', 0.3, 'SamplePoints', x(:,1));
% Find outliers in the difference between the smoothed and original.
tf = isoutlier(y-x(:,2));
% Visualize the difference.
plot(x(:,1), x(:,2), x(tf,1), x(tf,2), 'o')
legend({'Original data','Detected outlier'},'FontSize',14)
BAPPA MUKHERJEE
2019-11-11
I am currently working on this topic. Could you please help me make the smoothdata function available? In my 2012 version of MATLAB it is reported as an undefined function.
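For readers on older releases: smoothdata and isoutlier were both introduced in R2017a, so they cannot simply be added to a 2012 installation. A minimal sketch of a workaround, assuming the polyfit detrending from the accepted answer is acceptable, is to compute the default isoutlier rule (more than three scaled median absolute deviations from the median) by hand. The synthetic data below is a hypothetical stand-in for out1.mat, and the variable names are illustrative only.

```matlab
% Hypothetical stand-in for the asker's data (out1.mat is not available here)
t = (1:200)';
y = 0.001*(t-100).^2 + 0.02*sin(t);   % smooth quadratic trend with a small ripple
y(90:100) = y(90:100) + 5;            % a run of consecutive outliers

% Detrend with a 2nd-order polynomial, as in the accepted answer
p = polyfit(t, y, 2);
r = y - polyval(p, t);

% Manual version of the default isoutlier rule:
% flag points more than 3 scaled MAD away from the median
c    = -1/(sqrt(2)*erfcinv(3/2));     % scaling constant, approximately 1.4826
dev  = abs(r - median(r));
smad = c*median(dev);
tf   = dev > 3*smad;

plot(t, y)
hold on
plot(t(tf), y(tf), 'ro')
hold off
```

This only uses polyfit, polyval, median, and erfcinv, which should be available in much older releases. It does not reproduce the moving-window methods of isoutlier.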
More Answers (2)
Ryan Takatsuka
2018-8-9
Outlier detection generally works best on single-point outliers, not several in a row. Your data has a large number of consecutive outliers, which have a significant effect on the local trend (the moving average) of the data.
To detect these outliers, you need a very large moving-average window to minimize the effect that the outliers have on it. The threshold also needs to be modified a bit. I used a window size of w = 50 and a threshold factor of 0.5. This detects the outliers, but it also falsely flags points at the beginning and end of the dataset because the moving average has such a large window.
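The effect can be illustrated with a small sketch on synthetic data (an assumption standing in for the attached dataset): the same moving-median call is applied to a series with one isolated spike and to a series with a run of 11 shifted points. All variable names here are hypothetical.

```matlab
% Noise-free quadratic trend, used for both test series
t = (1:200)';
trend = 0.001*(t-100).^2;

ySpike = trend;  ySpike(50)   = ySpike(50)   + 5;   % single-point outlier
yRun   = trend;  yRun(90:100) = yRun(90:100) + 5;   % 11 outliers in a row

% Same detector for both: moving median over an 11-point window
w = 11;
tfSpike = isoutlier(ySpike, 'movmedian', w);
tfRun   = isoutlier(yRun,   'movmedian', w);

fprintf('Isolated spike flagged: %d\n', tfSpike(50));
fprintf('Points flagged inside the run: %d of 11\n', nnz(tfRun(90:100)));
```

On this data the isolated spike is flagged, while the middle of the run is not: the window there contains only shifted points, so the local median moves with them and the deviation from it stays small.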
a = isoutlier(x(:,2), 'movmean', 50, 'ThresholdFactor', 0.5);
It also helps to plot the moving average that is used to calculate the outliers. This is shown in the image:
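A minimal sketch of how such a plot could be produced, assuming the same window and threshold factor as above: isoutlier can return the lower threshold, upper threshold, and center value (here the moving mean) as additional outputs. The synthetic data is a hypothetical stand-in for the asker's x(:,1) and x(:,2).

```matlab
% Hypothetical stand-in for the asker's data (out1.mat is not available here)
t = (1:200)';
y = 0.001*(t-100).^2;          % smooth quadratic trend
y(90:100) = y(90:100) + 5;     % a run of 11 consecutive outliers

% Extra outputs: lower threshold, upper threshold, and center (the moving mean)
[tf, lo, hi, ctr] = isoutlier(y, 'movmean', 50, 'ThresholdFactor', 0.5);

% Plot the data, the moving mean, and the threshold band, then mark the outliers
plot(t, y, '.-', t, ctr, 'k-', t, lo, 'k--', t, hi, 'k--')
hold on
plot(t(tf), y(tf), 'ro')
hold off
legend({'Data','Moving mean','Lower threshold','Upper threshold','Detected outlier'})
```

With the real data, t and y would be replaced by x(:,1) and x(:,2). Points outside the dashed band are the ones isoutlier flags, which makes the false detections near the ends of the series easy to see.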
1 Comment
BAPPA MUKHERJEE
2019-11-11
I am unable to plot the last figure. Could you please extend this code up to the plotting stage?
BAPPA MUKHERJEE
2019-11-11
I am currently working on this topic. Can anyone help me make the smoothdata function available? In my 2012 version of MATLAB it is reported as an undefined function.