Filtering and Cleaning Data
Dear friends,
How can I clean this data?

Does anyone have a suggestion for me?
2 Comments
Mohammad Sami
2020-3-26
If you are running R2019b or later, try the interactive data-cleaning task in the Live Editor.
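For readers who prefer the command line, here is a minimal sketch of a comparable cleaning step using filloutliers (available since R2017a). The signal, the spike positions, and the window length of 25 are all made up for illustration, since the original data is not shown; this is not necessarily the code the Live Editor task would generate.

```matlab
% Synthetic stand-in for the poster's data (an assumption, not the real signal).
rng(1);                                          % fixed seed for repeatability
y = sin(linspace(0, 4*pi, 500))' + 0.05 * randn(500, 1);
y([50 200 333]) = [8 -9 10];                     % three artificial spikes

% Flag points far from the local moving median (window of 25 samples,
% an arbitrary choice) and replace them by linear interpolation.
[yFilled, wasOutlier] = filloutliers(y, 'linear', 'movmedian', 25);
```

The second output, wasOutlier, is a logical mask of the replaced samples, which makes it easy to plot what was changed before trusting the result.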
Camilo Costa
2020-3-26
Answers (3)
Peng Li
2020-3-26
0 votes
Technically, this is not a programming issue; it is a question about the algorithm. It all depends on what you mean by cleaning. Are the spikes what you want to filter out, or do you want to do something else? If the spikes are what you want to remove, the simplest way to clean the data is the so-called three-sigma criterion: anything beyond mean ± 3 × standard deviation is treated as an outlier. There are other tricks too. So, again, I believe this is about the algorithm, not about programming.
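A minimal sketch of the three-sigma criterion, assuming the signal is a numeric column vector. The data here is synthetic (random noise plus three artificial spikes), because the original signal is not shown.

```matlab
% Synthetic stand-in for the poster's signal (an assumption).
rng(0);                            % fixed seed so the sketch is repeatable
y = randn(1000, 1);                % baseline noise
y([100 400 750]) = [25 -30 40];    % three artificial spikes

mu    = mean(y, 'omitnan');        % sample mean, ignoring NaNs
sigma = std(y, 0, 'omitnan');      % sample standard deviation, ignoring NaNs

isSpike = abs(y - mu) > 3 * sigma; % three-sigma rule
yClean  = y(~isSpike);             % keep only the samples inside the band
```

In R2017a and later, isoutlier(y, 'mean') applies the same three-standard-deviation rule. Note that large spikes inflate both the mean and the standard deviation, so a median-based rule may be more robust on heavily contaminated data.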
Peng Li
2020-3-26
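A simple workaround:
b = DADOSUFCS2(:, 2);           % signal values (second column of the data)
bstd = movstd(b, 100);          % moving standard deviation, 100-sample window
thre = mean(bstd, 'omitnan');   % threshold: mean moving std (no toolbox needed, unlike nanmean)
bnew = b(bstd <= thre);         % keep samples whose local variability is at or below the threshold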

3 Comments
Camilo Costa
2020-3-26
Edited: Camilo Costa
2020-3-26
Peng Li
2020-3-26
Sorry, it's difficult for me to understand what you are asking. What I provided is a simple algorithm based on the moving standard deviation: any sample whose moving standard deviation is above a threshold is treated as an outlier in my example.
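To make the moving-standard-deviation idea concrete, here is a small sketch that keeps the result as a logical mask, so whole rows (for example a time column next to the values) stay aligned after filtering. The two-column matrix named data is a hypothetical stand-in for DADOSUFCS2, and the window of 100 samples and the mean-based threshold are only one possible choice.

```matlab
% Hypothetical two-column stand-in for the poster's matrix: [time, value].
rng(2);                                        % fixed seed for repeatability
t = (1:2000)';
v = 0.1 * randn(2000, 1);                      % quiet baseline
v(900:950) = v(900:950) + 5 * randn(51, 1);    % one artificial noisy burst
data = [t, v];

win      = 100;                                % window length (assumption)
localStd = movstd(data(:, 2), win);            % local variability per sample
thre     = mean(localStd, 'omitnan');          % threshold: mean moving std
keep     = localStd <= thre;                   % true where the signal is calm
cleaned  = data(keep, :);                      % drop whole rows, columns stay paired
```

Because the window spreads the effect of a burst, samples up to about half a window on either side of it are removed as well. A shorter window narrows that margin at the cost of a noisier estimate.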
Peng Li
2020-3-26
How do you know that they are not real? Do you have a specific criterion? If you do, then it is simple. If you don't, you may need to work more on the algorithm side, because no single algorithm is best for filtering every data set. You are the person who knows your data best.