Deleting outliers by code

Question

Hi again,

I have a measurements matrix as follows:

105.993000000000  1.64178960306505e+17
106.007000000000  3.10346010252124e+16
106.046000000000  2.22784317607289e+17
106.051000000000  1.48978160280980e+17
106.061000000000  2.79186942297259e+17
106.076000000000  2.02039468852741e+17
106.080000000000  5.02562504223962e+17

The first column is the x value and the second is the y measurement.

I want to delete the rows in which the average of the neighboring y values is much bigger (or smaller) than the local y (in this example, I want to delete the second row).

How can I do it?

Thank you!

Answer 1

Image Analyst 2014-8-18

Try this, by Brett from the Mathworks:

http://www.mathworks.com/matlabcentral/fileexchange/3961-deleteoutliers

If you want something less sophisticated, try a modified median filter. Identify the outliers, for example by subtracting the median-filtered signal from the original signal, taking the absolute value, and thresholding the result. Then replace only the elements above the threshold with the median value.
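
A minimal sketch of that modified median filter, under several assumptions that are not from the thread: it uses movmedian (available from R2016a; medfilt1 from the Signal Processing Toolbox is an alternative), a window of 3 points, and it compares values on a log10 scale with a threshold of 0.5 decades because the data spans orders of magnitude. The names windowSize, threshold, and yClean are illustrative only.

```matlab
% y values from the sample matrix in the question
y = [1.64178960306505e+17; 3.10346010252124e+16; 2.22784317607289e+17; ...
     1.48978160280980e+17; 2.79186942297259e+17; 2.02039468852741e+17; ...
     5.02562504223962e+17];

windowSize = 3;     % assumed window length, not from the thread
threshold  = 0.5;   % assumed threshold in decades (about a factor of 3)

% Work on log10(y) because the data spans orders of magnitude (assumption)
logY = log10(y);

% Running median; at the ends movmedian shrinks the window to fit
medianLogY = movmedian(logY, windowSize);

% Outliers are points whose absolute deviation from the local median
% exceeds the threshold
isOutlier = abs(logY - medianLogY) > threshold;

% Replace only the flagged points with the local median value
yClean = y;
yClean(isOutlier) = 10 .^ medianLogY(isOutlier);
```

With these settings only the second sample point is flagged. Both the window length and the threshold would need tuning for the full data set.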

Ron 2014-8-18

This is the full matrix data (on log scale):

I marked the kind of dots that I want to remove. The algorithm I had in mind was:

if (y(i-1)+y(i+1))/2 is 10 times bigger/smaller than y(i), delete y(i)

Image Analyst 2014-8-18

Edited: Image Analyst 2014-8-18

Try this:

% Flag points more than 10 times larger or smaller than the neighbor average
outliers = y > (10 * averaged_y) | y < (0.1 * averaged_y);
% Remove outliers
y(outliers) = [];

averaged_y comes from conv(). By the way, I don't think your algorithm is very robust (just think about it and you'll realize why), but it might be okay for your specific set of data.
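
To make the conv() step concrete, here is a hedged sketch that applies the factor-of-10 rule to the sample matrix from the question. The kernel [0.5; 0; 0.5] and the names factor, yInterior, ratio, and rowsToDelete are assumptions for illustration. The first and last rows are always kept because they have only one neighbor.

```matlab
% Sample matrix from the thread: column 1 is x, column 2 is y
X = [105.993  1.64178960306505e+17
     106.007  3.10346010252124e+16
     106.046  2.22784317607289e+17
     106.051  1.48978160280980e+17
     106.061  2.79186942297259e+17
     106.076  2.02039468852741e+17
     106.080  5.02562504223962e+17];
y = X(:,2);

factor = 10;   % "much bigger/smaller" means a factor of 10 in the question

% Mean of the two nearest neighbors of each interior point.
% The zero in the middle of the kernel leaves out the point itself.
averaged_y = conv(y, [0.5; 0; 0.5], 'valid');

% Ratio of each interior point to the average of its neighbors
yInterior = y(2:end-1);
ratio = yInterior ./ averaged_y;
isOutlier = ratio > factor | ratio < 1/factor;

% Pad with false so the first and last rows are never deleted
rowsToDelete = [false; isOutlier; false];
X(rowsToDelete, :) = [];
```

On this 7-row sample nothing is removed with factor = 10, because the second row is only about 6 times smaller than the average of its neighbors. Lowering factor to 5 removes that row and no others.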

Answer 2

Star Strider 2014-8-18

It depends on how you define ‘much bigger (or smaller)’, and the number of neighboring elements you want to average over.

To delete the second row is easy enough (calling your matrix ‘X’ here):

X(2,:) = [];

Ron 2014-8-18

This is just a sample from a bigger matrix; I need the rows to be removed systematically.

"Much bigger/smaller" means by a power of 10, using only the closest neighboring points (y-1 and y+1).

Star Strider 2014-8-18

Edited: Star Strider 2014-8-18

I implemented a linear interpolation between (y-1) and (y+1), excluding (y), instead of an average of (y-1) and (y+1). Did you mean to include (y)?

The problem is that none of the points in your sample meets the ‘power-of-ten’ exclusion criterion: even row 2 is only about a factor of 6 below the value expected from its neighbors.

My contribution:

X = [105.993000000000  1.64178960306505e+17
106.007000000000  3.10346010252124e+16
106.046000000000  2.22784317607289e+17
106.051000000000  1.48978160280980e+17
106.061000000000  2.79186942297259e+17
106.076000000000  2.02039468852741e+17
106.080000000000  5.02562504223962e+17];
for k1 = 2:size(X,1)-1
    B(:,k1) = [[1 1]' [X(k1-1,1) X(k1+1,1)]']\[[X(k1-1,2) X(k1+1,2)]'];
    E(k1) = [1 X(k1,1)] * B(:,k1);      % Expected From Interpolation
    D(k1) = E(k1)  - X(k1,2);           % Difference
end
Ep = E(2:end);
Xp = X(2:end-1,1);
figure(1)
plot(X(:,1), X(:,2), '-xb')             % Plot Data
hold on
plot(Xp, Ep, '-+r')                     % Plot Interpolated Values
hold off
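
As a possible follow-up to the interpolation above, the sketch below computes the same expected values without a loop and then deletes rows by ratio. This is an illustration under assumptions, not part of the original answer: the threshold factor = 5 is chosen only so that the sample shows a deletion, and the names w, expected, ratio, rowsToDelete, and Xclean are hypothetical.

```matlab
% Sample matrix from the thread: column 1 is x, column 2 is y
X = [105.993  1.64178960306505e+17
     106.007  3.10346010252124e+16
     106.046  2.22784317607289e+17
     106.051  1.48978160280980e+17
     106.061  2.79186942297259e+17
     106.076  2.02039468852741e+17
     106.080  5.02562504223962e+17];
x = X(:,1);
y = X(:,2);

% Previous and next neighbors of every interior point
xPrev = x(1:end-2);  xNext = x(3:end);
yPrev = y(1:end-2);  yNext = y(3:end);

% Linear interpolation between the neighbors, evaluated at the interior x
w = (x(2:end-1) - xPrev) ./ (xNext - xPrev);
expected = yPrev + w .* (yNext - yPrev);

% Flag interior points that differ from the expected value by more than
% the chosen factor (assumed value; the question asks for 10)
factor = 5;
ratio = y(2:end-1) ./ expected;
rowsToDelete = [false; ratio > factor | ratio < 1/factor; false];

% Keep every row that was not flagged
Xclean = X(~rowsToDelete, :);
```

With factor = 5 only the second row is dropped. With the factor of 10 from the question, no row of this sample is removed.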

Answer 3

Guillaume 2014-8-18

Edited: Guillaume 2014-8-18

averages = (m(1:end-2, 2) + m(3:end, 2)) / 2; % average of the rows before and after each interior row
m([false; abs(averages - m(2:end-1, 2)) > tolerance; false], :) = []; % false padding keeps the first and last rows

That should do it.

Guillaume 2014-8-18

Sorry, should have been

m([false; abs(averages - m(2:end-1, 2)) > tolerance; false], :) = [];

I've edited my answer to correct all the typos.

Joseph Cheng 2014-8-18

Another method would be to use conv.

x = randi(100,1,20);
Nav = [1 0 1]/2;                 % averages the two neighbors, skips the center
test = conv(x,Nav,'valid');

Then subtract the result from the input (here x), taken from index 2 to end-1:

difference = x(2:end-1) - test;

Then threshold appropriately.
