How to remove the outliers

I have sequence data that I believe contains some outliers, which I have plotted in Excel with red shading. I have attached the Excel file with my data.
My question is: which MATLAB function can detect and delete the data in the red shading?
If anyone can help, I would appreciate it.
Thanks

Accepted Answer

Steven Lord 2019-7-11
Take a look at the filloutliers and rmoutliers functions on this documentation page.
  3 Comments
Jon 2019-7-11
Maybe you are running an old version of MATLAB that does not have the filloutliers function.
filloutliers was introduced in MATLAB release R2017a.
What version of MATLAB are you running? To find out, you can type the ver command.
In the future, it is good to use the code button in the MATLAB Answers toolbar for inserting code. That way it comes out nicely formatted and is easier to read, use, and copy.
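As a quick check, something like the following sketch reports whether the two functions are visible on the MATLAB path. The loop and the variable names are only illustrative; which returns an empty result for a name it cannot find, and version prints the release string.

```matlab
% Report which outlier helpers exist in the running installation.
names = {'filloutliers', 'rmoutliers'};
available = false(size(names));
for k = 1:numel(names)
    % which returns an empty char array when the name is not on the path
    available(k) = ~isempty(which(names{k}));
    fprintf('%s available: %d\n', names{k}, available(k));
end

% Print the version string of the running installation
disp(version)
```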
Skydriver 2019-7-11
Edited: Skydriver 2019-7-11
I use a 2013 version of MATLAB. Do you have any suggestion for how to remove outliers in this version, or to fill them with other values close to the neighboring data?


More Answers (1)

Jon 2019-7-11
Edited: Jon 2019-7-11
Since you do not have filloutliers and rmoutliers in your version of MATLAB, I would first recommend updating to a more recent version if possible, as there have been many advances since 2013.
If that is not possible, you can look at the documentation in the link that Steven provided.
It gives MATLAB's default definition of an outlier as:
Outliers are defined as elements more than three scaled MAD from the median. The scaled MAD is defined as c*median(abs(A-median(A))), where c=-1/(sqrt(2)*erfcinv(3/2)).
So you could easily implement this in your code. For example, if you had vectors x and y and you wanted to make a plot with the outliers removed, you could do the following:
% flag elements more than three scaled MAD away from the median
isOutlier = abs(y - median(y)) > -3/(sqrt(2)*erfcinv(3/2))*median(abs(y - median(y)));
plot(x(~isOutlier),y(~isOutlier))
I would recommend, though, implementing isOutlier as a small function, so you don't have to keep repeating this code.
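A minimal sketch of that idea, written as an anonymous function so it runs as a single script in older releases. The name isOutlierMAD and the synthetic data are assumptions for illustration only. It also shows one way to fill the flagged points from their neighbors with interp1 instead of deleting them.

```matlab
% Synthetic data (illustrative assumption): a sine wave with two spikes
x = (1:50)';
y = sin(x / 5);
y([10 30]) = [15 -12];

% Scaling constant from the documentation quote (about 1.4826)
c = -1 / (sqrt(2) * erfcinv(3/2));

% Hypothetical helper: true where a value lies more than
% three scaled MAD away from the median
isOutlierMAD = @(v) abs(v - median(v)) > 3 * c * median(abs(v - median(v)));

bad = isOutlierMAD(y);

% Option 1: drop the flagged points
xClean = x(~bad);
yClean = y(~bad);

% Option 2: keep every x and replace the flagged values by linear
% interpolation between the neighboring non-outlier points
yFilled = y;
yFilled(bad) = interp1(x(~bad), y(~bad), x(bad), 'linear');
```

Note that interp1 returns NaN for a flagged point at the very start or end of the data, because there is no neighbor on one side; those cases would need separate handling.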
Another simple way to remove outliers is to sort your data using the sort command and then remove the first and last n values from the sorted list, where you choose n according to how aggressive you want to be with the outlier removal.
For example, given vectors x and y and n = 5, you could implement this with something like:
n = 5;
[ySrt,iSrt] = sort(y);
% drop the n smallest and n largest values, then restore the original order
iKeep = sort(iSrt(n+1:length(y)-n));
plot(x(iKeep),y(iKeep))
Note that n/length(y) is the fraction of data that you are discarding as outliers at each end (top and bottom) of the sorted list. So you might want to choose n so that n/length(y) is approximately 0.025, and thus you would be keeping 100*(1 - 2*0.025) = 95% of your data and considering the extremes as outliers.
This method, although simple, assumes that you actually have outliers at the extremes. Otherwise you are just throwing away good data that happens to sit at the lower and upper ends of the sorted list.
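To pick n from a target fraction rather than by hand, something along these lines could work. The variable trimFraction and the synthetic data are assumptions for illustration.

```matlab
% Synthetic data (illustrative assumption): a cosine wave with two spikes
x = (1:200)';
y = cos(x / 20);
y([25 120]) = [40 -35];

trimFraction = 0.025;                  % fraction discarded at each end
n = round(trimFraction * length(y));   % number of points dropped per end

% Sort, drop the n smallest and n largest values, and put the surviving
% indices back in their original order so the plot is not scrambled
[~, iSrt] = sort(y);
iKeep = sort(iSrt(n+1:end-n));

xTrim = x(iKeep);
yTrim = y(iKeep);
keptPercent = 100 * numel(iKeep) / numel(y);
```

With 200 samples this drops 5 points at each end and keeps 95% of the data, including both spikes among the discarded points.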
  2 Comments
Skydriver 2019-7-12
Thank you, Steven Lord and Jon. It is working now.
Jon 2019-7-12
Edited: Jon 2019-7-12
Glad to hear it is working now. If you feel that the question is answered, it would be good to "accept" the answer so that anyone else with the same issue can see that a solution is available. If you are still waiting to see whether there are other approaches, then you should leave it open.
