Remove outliers in large data

My question is how to find outliers in MATLAB. Can you explain with code and an example? How many methods are there?

Accepted Answer

Try this:
% data = the variable name of your array
stdDev = std(data(:)) % Compute standard deviation
meanValue = mean(data(:)) % Compute mean
zFactor = 1.5; % or whatever you want.
% Create a binary map of where outliers live.
outliers = abs(data-meanValue) > (zFactor * stdDev);
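
Once you have the binary map, you can use it to drop or blank out the flagged values. Here is a minimal sketch of that step; the sample values and the names keptValues and dataWithNaN are made up for illustration only:

```matlab
% Hypothetical sample values with one obvious outlier at (2,1).
data = [101 104 103; 5 102 104; 106 103 105];
zFactor = 1.5;                          % same threshold as above
outliers = abs(data - mean(data(:))) > zFactor * std(data(:));

% Option 1: keep only the non-outlier values (result is a column vector).
keptValues = data(~outliers);

% Option 2: keep the array shape and mark outliers as missing.
dataWithNaN = data;
dataWithNaN(outliers) = NaN;
```

Option 1 loses the 2-D layout, so Option 2 is usually the safer choice when the row and column positions matter.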

4 Comments

Can you make this clearer by giving an example with code? Note that in big data some columns contain text, and those should be ignored. Thanks.
Try this:
data = 100 + randi(8,10,10)
% Make two outliers
data(2,2) = 5;
data(1,4) = 150
stdDev = std(data(:)) % Compute standard deviation
meanValue = mean(data(:)) % Compute mean
zFactor = 1.5; % or whatever you want.
% Create a binary map of where outliers live.
outliers = abs(data-meanValue) > (zFactor * stdDev)
Be aware that z (how many standard deviations a value is from the mean) is not that great at finding outliers, because a huge outlier inflates both the mean and the standard deviation, so smaller outliers such as data(2,2) = 5 can then go undetected. Just try the above code with data(1,4) = 9999999 to see what I mean. A better measure is the median absolute deviation: http://en.wikipedia.org/wiki/Median_absolute_deviation Or you can try deleteoutliers by Brett: http://www.mathworks.com/matlabcentral/fileexchange/3961-deleteoutliers
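
A median absolute deviation (MAD) version of the same test might look like the sketch below. The threshold of 3 and the 1.4826 scale factor (which makes the MAD comparable to a standard deviation for normally distributed data) are common conventions and are assumptions here, not something fixed by the code above; madFactor and scaledMAD are illustrative names.

```matlab
rng(0);                                 % fixed seed so the run is repeatable
data = 100 + randi(8, 10, 10);
% Make two outliers, one of them huge.
data(2,2) = 5;
data(1,4) = 9999999;

% Robust center and spread: median and median absolute deviation.
medValue = median(data(:));
madValue = median(abs(data(:) - medValue));
scaledMAD = 1.4826 * madValue;          % scaled to be comparable to a std dev

madFactor = 3;                          % or whatever you want
% Create a binary map of where outliers live.
outliers = abs(data - medValue) > madFactor * scaledMAD;

[rows, cols] = find(outliers)           % locations of the flagged elements
```

Because the median and the MAD barely move when a single value is huge, both data(2,2) and data(1,4) are flagged here. In R2017a and later, the built-in isoutlier function applies this kind of median-based rule by default.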
How can I check all the columns one by one? Some columns contain words and should be ignored. How many ways are there to find outliers in big text data? It is still not clear to me from the code.
I don't know how to define outliers in text (words). For example, in this comment that you're reading now, which of the words are "outliers" and why?
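
If the goal is just to skip the text columns and test each numeric column on its own, one possible sketch is below. It assumes the data is stored in a MATLAB table; the table T, its column names, and its values are hypothetical, and the MAD rule with a threshold of 3 is only one of several reasonable choices.

```matlab
% Hypothetical mixed table: two numeric columns and one text column.
T = table([101; 103; 102; 5; 104; 106], ...
          {'a'; 'b'; 'c'; 'd'; 'e'; 'f'}, ...
          [10; 11; 9; 10; 500; 12], ...
          'VariableNames', {'height', 'label', 'weight'});

% Find which columns are numeric; text columns are skipped.
isNum = varfun(@isnumeric, T, 'OutputFormat', 'uniform');
numericNames = T.Properties.VariableNames(isNum);

madFactor = 3;                          % or whatever you want
result = struct();                      % outlier row numbers per column
for k = 1:numel(numericNames)
    x = T.(numericNames{k});
    medValue = median(x);
    scaledMAD = 1.4826 * median(abs(x - medValue));
    result.(numericNames{k}) = find(abs(x - medValue) > madFactor * scaledMAD);
end
disp(result)
```

With these made-up values, row 4 is flagged in height and row 5 in weight, and the label column is never examined.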

More Answers (0)
