Find unique rows in cell array with mixed data types

First I'm used to working in database structures where you have a table with a primary key linked to a table with foreign key. In matlab, I need to read an xls file where all the data is in one file. For example,
raw =
ID Type Customer Allergies SomeNumber Topping
1 pizza John NaN 9 Pepperoni
1 pizza John NaN 9 Mushrooms
1 pizza John NaN 9 Peppers
2 pizza Jill NaN NaN Olives
2 pizza Jill NaN NaN Canadian Bacon
I would then like to put this in a struct where the toppings are an array and the unique row values are represented one time. I tried using unique() but couldn't perform on a mixed cell array. Any ideas how to extract the unique information and then attach the non-unique elements? I ultimately want to construct a class with each entry so that I can loop through and output data to a different format.

 采纳的回答

I cannot recreate your problem. Here is what I did:
AA = {'A','B','A','C'};
BB = [1 2 1 1];
CC= [true,true,true,false];
Z=table(AA',BB',CC')
Var1 Var2 Var3
____ ____ _____
'A' 1 true
'B' 2 true
'A' 1 true
'C' 1 false
then simply:
unique(Z)
ans =
AA BB CC
___ __ _____
'A' 1 true
'B' 2 true
'C' 1 false
Correctly gives unique rows. IT is also possible I didn't understand your problem.

4 个评论

Thank you for your response, but I don't think it is that simple. The file I'm working with has NaN's, numbers, strings, so unique treats each NaN as unique. I get an error processing the numbers saying input needs to be a cell array, etc. Other methods, I can process the individual results as a column but I need the unique values grouped as a row. This is what I have thus far but get the error Input A must be a cell array of strings.
unk = unique(raw(cellfun(@x all(~isnan(x)),raw(:,1:n))));
If you work out an example, it might be more clear what exactly you are trying to achieve.
Why are NaNs important? Just replace them with something dummy:
AA = {'A','B','A','C'};
BB = [1 2 1 1];
CC= [true,true,true,false];
DD = [ NaN,NaN,NaN,NaN];
Z=table(AA',BB',CC',DD')
>> unique(Z)
ans =
Var1 Var2 Var3 Var4
____ ____ _____ ____
'A' 1 true NaN
'A' 1 true NaN
'B' 2 true NaN
'C' 1 false NaN
Z.Var4(isnan(Z.Var4))=0;
>> unique(Z)
ans =
Var1 Var2 Var3 Var4
____ ____ _____ ____
'A' 1 true 0
'B' 2 true 0
'C' 1 false 0
Also make sure you are using the table format.
Thank you for your time to comment. To clarify, I am trying to eliminate duplicates and find all the unique rows and the file I'm reading from has mixed values strings, values, empty cells. So when I eliminate the nans and convert cell2table, I still get the error message "Input A must be a cell array of strings". My first value is a number but I don't know what difference that makes but obviously, Unique() won't process both numeric and string values in either a table or cell array.
I converted all the double data types to char using the following:
charArr = cellfun(@num2str, raw, 'Un', 0 );
Then
unique (charArr, 'rows');
and get the correction 'rows' is not supported for cell array inputs.
So then I converted my cell array to Table
T = cell2Table(charArr);
unk = unique(T, 'rows');
and that worked I got
1 pizza John "" 9
2 pizza Jill "" ""
Now I can loop through the toppings using the ID number and put into a struct. So the issue was converting to all one data type then using a Table for finding unique rows.

请先登录,再进行评论。

更多回答(0 个)

类别

帮助中心File Exchange 中查找有关 Data Type Conversion 的更多信息

标签

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by