Remove duplicate rows in CSV file

Hello dear MathWorkers,
I have a dataset consisting of approximately 4 million records, and I want to remove the duplicate rows (records). Can anyone help me with how to do this? I am using MATLAB R2018a. Thanks in advance.
  7 comments
madhan ravi 2019-7-24
Mohammed: Alex's solution should have solved your problem.


Accepted Answer

Alex Mcaulley 2019-7-23
Since all of the data is numeric, you can use:
data = xlsread('kdd.xlsx');
datanew = unique(data,'rows');
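As a small sketch of what `unique(...,'rows')` does, the snippet below uses a made-up matrix in place of the spreadsheet. By default the result comes back sorted; the `'stable'` option keeps the first occurrence of each row in its original order. The variable names and sample values are illustrative only and are not taken from the original file.

```matlab
% Toy numeric matrix standing in for the real dataset (values are made up).
data = [3 1 2;
        1 5 9;
        3 1 2;    % duplicate of row 1
        1 5 9;    % duplicate of row 2
        7 7 7];

% Default behavior: unique rows, sorted in ascending order.
sortedRows = unique(data, 'rows');

% 'stable': unique rows in order of first appearance.
% idx holds the indices of the rows in data that were kept.
[stableRows, idx] = unique(data, 'rows', 'stable');

disp(sortedRows);
disp(stableRows);
```

Whether the sorted or the stable form is preferable depends on whether the record order in the file carries meaning.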
  2 comments
Shameer Parmar 2019-7-23
This is not working, because none of the data is similar. I don't find any duplicate entries in the sheet provided by Mohammad Alsajri.
Using your command, 'data' and 'datanew' come out exactly the same.
Alex Mcaulley 2019-7-23
This code works!
I guess the Excel file provided by Mohammad is just a small portion of the dataset (4 million rows).
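One way to settle whether a given file contains exact duplicate rows at all is to compare row counts before and after `unique`. The following is a minimal sketch on a made-up matrix; `numRemoved`, `keptIdx`, and `isDuplicate` are hypothetical names chosen for illustration. Note also that `unique` treats `NaN` values as distinct, so rows containing `NaN` (which `xlsread` can return for empty or non-numeric cells) are never merged.

```matlab
% Toy matrix standing in for the spreadsheet contents (values are made up).
data = [1 2; 3 4; 1 2; 5 6; 3 4];

% Deduplicate, keeping first occurrences in their original order.
[datanew, keptIdx] = unique(data, 'rows', 'stable');

% Number of rows removed; 0 means the input had no exact duplicate rows.
numRemoved = size(data, 1) - size(datanew, 1);

% Logical mask marking the rows that were dropped as duplicates.
isDuplicate = true(size(data, 1), 1);
isDuplicate(keptIdx) = false;

fprintf('%d of %d rows were duplicates\n', numRemoved, size(data, 1));
```

If `numRemoved` is 0, then `data` and `datanew` holding the same rows is the expected result rather than a sign that the command failed.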


More Answers (0)
