How can I separate data into multiple groups?
Hi,
I have a CSV file with more than 50,000 rows (an extract is provided in the attached CSV file).
I need to group the data as highlighted in yellow in the attached file. The numbers in each group are either very close to each other (difference of less than 1) or they are multiples of the smaller number (with a tolerance of +/- 0.3).
How can I write the code so that it labels the highlighted groups as 1, 2, 3 and so on? Numbers that don't belong to a group should get the default group number 0.
Thanks for the help in advance.
2 Comments
Jan
2023-3-3
CSV-files are text files. There are no colored elements.
Can you import the file already? Then you could start from "I have a vector" or "matrix".
Answers (1)
Jan
2023-3-3
Edited: Jan
2023-3-3
data = [2416.015, 127.402, 382.165, 127.425, 127.3387, 127.406, 637.001, 127.405, 2240.913, ...
2257.54, 241.801, 3064.636, 441.559, 220.805, 220.799, 1204.011, 1547.622, 322.37, ...
322.43, 6482.511, 558.603, 279.301, 2234.423, 279.307, 279.31, 279.295, 3901.168, ...
3595.353, 90.315];
m = [true, abs(diff(data)) < 1]; % Distance is small
ini = strfind(m, [0, 1]); % Index where blocks are starting
p = zeros(size(data));
p(ini) = 1;
p = cumsum(p); % Count starts
m(ini) = true;
result = m .* p; % Use m as mask
format long g
disp([data.', result.'])
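The code above only applies the "difference of less than 1" rule between neighbours. A minimal sketch of how the "multiple of the smaller number" rule from the question might be added is given below. It rests on assumptions that are not confirmed in the thread: only directly adjacent values are compared, all values are positive, and the names lo, hi, k, err, linked, member and starts are made up for this illustration.

```matlab
data = [2416.015, 127.402, 382.165, 127.425, 127.3387, 127.406, 637.001, 127.405, 2240.913, ...
        2257.54, 241.801, 3064.636, 441.559, 220.805, 220.799, 1204.011, 1547.622, 322.37, ...
        322.43, 6482.511, 558.603, 279.301, 2234.423, 279.307, 279.31, 279.295, 3901.168, ...
        3595.353, 90.315];

% Compare each value with its right-hand neighbour (assumption: adjacency only)
a   = data(1:end-1);
b   = data(2:end);
lo  = min(a, b);                 % smaller value of each pair
hi  = max(a, b);                 % larger value of each pair
k   = round(hi ./ lo);           % nearest integer multiple (assumes positive data)
err = abs(hi - k .* lo);         % deviation from that multiple

% k == 1: "very close" rule (< 1); k > 1: "multiple" rule (+/- 0.3)
linked = (k == 1 & err < 1) | (k > 1 & err <= 0.3);

member = [linked, false] | [false, linked];   % element is linked to a neighbour
starts = member & ~[false, linked];           % member not linked to its predecessor
result = cumsum(double(starts)) .* member;    % group number, 0 outside of groups

format long g
disp([data.', result.'])
```

For the sample vector this labels elements 2 to 8 as group 1, 13 to 15 as group 2, 18 and 19 as group 3, and 21 to 26 as group 4. Two different multiples of the same base that sit next to each other (for example 3x followed by 5x) would not be linked by this pairwise test, so it is a starting point rather than a complete solution.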
8 Comments
Jan
2023-3-7
@Jayden Yeo: Yes, I've simplified my example. With the real data, taking the tolerances into account will increase the complexity even further.
If the description of the process is already this tricky, that is usually a hint that the view of the problem is too indirect or rests on overly complicated assumptions. Therefore I ask which real-world problem you want to solve. Maybe there is a simpler way to define the groups.