Any algorithm to separate very high values from a data set?
Hello everybody,
I have a 10317x1 column vector that contains reflected voltage values.
Of the 10,317 reflected voltages, more than 95% are diffuse reflections, whose reflected voltage is very low or around the average. Only a small percentage (less than 5%) of the values are very large, because those are specular reflections.
The main goal is to split these two types of data. I am looking for an algorithm or a mathematical separation function that can give me a threshold separating the very high values from the rest of the values.
I have attached a normalised histogram of the data, which shows what my data looks like.
The circled region shows that very few values have a high amplitude.
What I can do is pick out the highest 500 values and separate them, but that would be a manual approach. What I am looking for is a mathematical or algorithm-based approach.
Accepted Answer
Akira Agata
2019-1-10
If you know the percentage of outliers (say, 5%), I think you can use the 95th percentile of the data set as the threshold, like this:
% Assuming x is your 10317x1 data array
th = prctile(x,95);
% Index of outlier
idx = x > th;
% Separate the data
xOutlier = x(idx);
xNormal = x(~idx);
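If the fraction of specular values is not known in advance, one possible alternative (not part of the answer above, just a sketch) is to derive the threshold from the data itself with a robust rule: the median plus k scaled median absolute deviations (MAD). The synthetic data, the variable names, and the factor k = 3 below are all assumptions for illustration and would need tuning on the real voltages.

```matlab
% Hypothetical synthetic data standing in for the 10317x1 voltage column:
% mostly low "diffuse" values plus a small group of large "specular" ones.
rng(0);                                    % make the example reproducible
x = [0.1 + 0.02*randn(9800,1); 0.8 + 0.05*randn(517,1)];

% Robust centre and spread: median and scaled median absolute deviation.
% The factor 1.4826 makes the MAD comparable to a standard deviation
% for normally distributed data.
med  = median(x);
sMAD = 1.4826 * median(abs(x - med));

% Assumed rule: flag values more than k scaled MADs above the median.
k  = 3;                                    % illustrative choice, not a fixed rule
th = med + k * sMAD;

% One-sided split: only the very high values are separated out.
idx   = x > th;
xHigh = x(idx);
xRest = x(~idx);

fprintf('threshold = %.4f, %d of %d values flagged\n', th, nnz(idx), numel(x));
```

Because the median and MAD are barely affected by a few percent of extreme values, the threshold stays tied to the bulk of the diffuse reflections rather than to a fixed count such as "the highest 500". In releases that have it, the built-in isoutlier(x,'median') applies a similar rule, though it flags values on both sides of the median.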