Compare datasets and remove rows that are not common

Hello all, I have to compare two datasets (a = 89072 x 13 and b = 89268 x 37) that must contain the same rows, so I have to remove the rows that are not common to both. The two datasets have different numbers of columns, so I can't simply 'intersect' them. How can I remove these rows?
Thank you

4 Comments

"Same rows" by what definition of "same"?
I removed some observations (outliers) from dataset 'a' and now I have to eliminate these observations also from dataset b.
But there are 13 columns in one and 37 in the other, so you are going to have to define what "same" means. Once you've defined "sameness", we can help identify and remove the rows.
Dear Sean, the two datasets (a and b) initially had the same observations but different columns. The first dataset (a) holds the quantitative variables, while the second dataset (b) holds the qualitative variables. I removed some observations (outliers) from dataset 'a', and I have to remove the same observations from dataset 'b' because the two datasets must contain the same observations. Thank you


Accepted Answer

What do you mean by "I have to remove the rows that are not common"? If you mean reducing the other dataset (b) to the same number of rows as a, then you can use:
b(1:length(a),:)
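As a minimal sketch of what this kind of indexing does, the snippet below truncates one array to the row count of another. The toy matrices are hypothetical stand-ins for the two datasets, and `size(a, 1)` is used in place of `length(a)` because, for ordinary arrays, `length` returns the largest dimension rather than the number of rows.

```matlab
% Hypothetical toy matrices standing in for the two datasets.
a = ones(3, 2);          % 3 rows, 2 columns
b = reshape(1:20, 5, 4); % 5 rows, 4 columns

% Keep only as many leading rows of b as a has rows.
b_trunc = b(1:size(a, 1), :);

disp(size(b_trunc))
```

Note that this keeps the first rows of b by position only; it does not check that they are the same observations as the rows of a.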

4 Comments

Hi Kittu, I tried your command but it doesn't work. I get this error:
Error using getobsindices (line 71)
Observation index exceeds dataset dimensions.
Error in dataset/subsrefParens (line 16)
[obsIndices, numObsIndices] = getobsindices(a, s(1).subs{1});
Error in dataset/subsref (line 69)
[varargout{1:nargout}] = subsrefParens(a,s);
thank you
It seems the index exceeded the dataset dimensions, so maybe you have mistakenly swapped a and b. I assume you want to reduce the number of rows in b to the number of rows in a. I tested it on my system and it works!
I just solved the problem by running these commands:
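% Observation (row) names of each dataset
NDG_X = get(a,'ObsNames');
NDG_Xc = get(b,'ObsNames');
% Indices of the names present in both, in the order they appear in a
[~,ia,ib] = intersect(NDG_X,NDG_Xc,'stable');
% Keep only the common observations
a_ridotto = a(ia,:);
b_ridotto = b(ib,:);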
thanks anyway, Doriana
Yes, you're right, I had mistakenly swapped a and b. However, I could not use your command because 'a' and 'b' must contain the same observations, not just have the same size.
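For readers who want to try the name-based approach on something small, here is a minimal sketch that assumes both dataset arrays carry observation names. The observation names (r1 to r5), variable names, and values are invented for illustration, and the `dataset` type requires the Statistics and Machine Learning Toolbox.

```matlab
% Hypothetical toy data: observation names and values are made up.
names = {'r1'; 'r2'; 'r3'; 'r4'; 'r5'};
a = dataset({[10; 20; 30; 40; 50], 'x'}, 'ObsNames', names);
b = dataset({[1; 2; 3; 4; 5], 'g1'}, {[5; 4; 3; 2; 1], 'g2'}, 'ObsNames', names);

% Simulate outlier removal from a only: keep r1, r3 and r5.
a = a([1 3 5], :);

% Match the remaining observations by name, preserving the order in a.
[~, ia, ib] = intersect(get(a, 'ObsNames'), get(b, 'ObsNames'), 'stable');
a_common = a(ia, :);
b_common = b(ib, :);

% Check that both now hold the same observations in the same order.
same_obs = isequal(get(a_common, 'ObsNames'), get(b_common, 'ObsNames'));
disp(same_obs)
```

Because the match is made on names rather than positions, it does not matter which rows were removed from a or where they sat in b. If the data are stored as tables instead of dataset arrays, the row names would be read from `Properties.RowNames` rather than via `get(..., 'ObsNames')`.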
