Remove duplicate rows in CSV file
Hello dear MathWorkers,
I have a dataset consisting of approximately 4 million records, and I want to remove the duplicate rows. Can anyone help me with how to do this? I am using MATLAB R2018a. Thanks in advance.
Accepted Answer
Alex Mcaulley
2019-7-23
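Since all the data is numeric, you can use:
% Read the numeric data from the spreadsheet
data = xlsread('kdd.xlsx');
% Keep one copy of each distinct row (the result is returned in sorted order)
datanew = unique(data,'rows');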
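The question title mentions a CSV file, while the snippet above reads an .xlsx workbook. A single Excel worksheet holds at most 1,048,576 rows, so a 4-million-row dataset would likely have to be read from the CSV directly. A minimal sketch of that, assuming the file contains only numeric, comma-separated values; the file names below are hypothetical temporary files, and the small sample matrix is there only so the example runs on its own:

```matlab
% Hypothetical file names (temporary files); replace with the real paths
inFile  = [tempname '.csv'];
outFile = [tempname '.csv'];

% Write a small numeric sample so this sketch is self-contained
csvwrite(inFile, [1 2 3; 4 5 6; 1 2 3]);

% Read the numeric CSV, drop duplicate rows, and write the result back out
data    = csvread(inFile);
datanew = unique(data, 'rows');
csvwrite(outFile, datanew);

% Read the output back to confirm what was written
check = csvread(outFile);
```

csvread and csvwrite are both available in R2018a. Note that csvwrite writes values with limited precision, so for real data with many significant digits dlmwrite with an explicit 'precision' option may be the safer choice.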
2 Comments
Shameer Parmar
2019-7-23
This is not working, because none of the data is duplicated. I don't find any duplicate entries in the sheet provided by Mohammad Alsajri.
With your command, 'data' and 'datanew' come out exactly the same.
Alex Mcaulley
2019-7-23
This code works!
I guess the Excel file provided by Mohammad is just a small portion of the dataset (4 million rows).
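One way to settle whether a given file contains duplicates at all is to count how many rows unique removes. The following is a minimal sketch, using a small made-up matrix in place of the real dataset; the variable names numRemoved and isDuplicate are illustrative, not from the thread:

```matlab
% Hypothetical sample standing in for the real data (rows 3 and 5 are repeats)
data = [1 2 3; 4 5 6; 1 2 3; 7 8 9; 4 5 6];

% 'stable' keeps the first occurrence of each row in its original order;
% ia holds the indices of the rows that were kept
[datanew, ia] = unique(data, 'rows', 'stable');

% Number of rows dropped as duplicates (0 means the file had none)
numRemoved = size(data, 1) - size(datanew, 1);

% Logical mask marking the rows that were dropped
isDuplicate = true(size(data, 1), 1);
isDuplicate(ia) = false;
```

If numRemoved is 0, then data and datanew hold the same set of rows, which would be consistent with the sample sheet simply containing no repeated rows.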
