- Define the sample data as a cell array.
- Use 'arrayfun' to apply the 'mostCommon' function to each row of the data.
- Output the results using disp.
Finding mode of each row in an array of Strings
Currently I have an array with 3 columns and a lot of rows (about 50,000). Each value is a string, and I essentially want to compare the 3 values in each row and find the most common one.
Say my input table looked like the following:
Apple Banana Apple
Cherry Cherry Apple
Mango Mango Mango
My outputs would be
Apple
Cherry
Mango
Please let me know if you have any advice. I have tried mode, but it does not work for strings.
Accepted Answer
Naga
2024-8-12
Dear Manas,
I understand you have a large array with 3 columns and many rows, where each value is a string. You want to find the most common string in each row and output these values. Here's how you can do it in MATLAB.
% Sample data
data = {
'Apple', 'Banana', 'Apple';
'Cherry', 'Cherry', 'Apple';
'Mango', 'Mango', 'Mango'
};
% Apply the function to each row and store results
mostCommonValues = arrayfun(@(i) mostCommon(data(i,:)), 1:size(data, 1), 'UniformOutput', false);
% Display the results
disp(mostCommonValues);
% Function to find the most common element in a cell array row
function commonValue = mostCommon(cellRow)
[uniqueElements, ~, idx] = unique(cellRow);
counts = accumarray(idx, 1);
[~, maxIdx] = max(counts);
commonValue = uniqueElements{maxIdx};
end
This approach should work efficiently even for large datasets like the one you mentioned with 50,000 rows.
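If the per-row function calls become a bottleneck, one possible alternative is to call unique once on the whole array and take the row-wise mode of the resulting index matrix. This is only a sketch under the assumption that the data is the same cell array of character vectors as above; the variable names idxMatrix and mostCommonVec are illustrative and do not come from the original code.

```matlab
% Sample data, same as above
data = {
    'Apple',  'Banana', 'Apple';
    'Cherry', 'Cherry', 'Apple';
    'Mango',  'Mango',  'Mango'
};

% unique flattens the array in column-major order, so idx has one entry
% per element of data; uniqueElements is sorted alphabetically
[uniqueElements, ~, idx] = unique(data);

% Put the indices back into the original 3-column layout
idxMatrix = reshape(idx, size(data));

% mode along dimension 2 gives the most frequent index in each row;
% map those indices back to the corresponding text
mostCommonVec = uniqueElements(mode(idxMatrix, 2));

disp(mostCommonVec);
```

When a row has a tie, mode returns the smallest index, which here corresponds to the alphabetically first value because unique sorts its output.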
For more details, see the MATLAB documentation for the function 'arrayfun'.
Hope this helps you!
More Answers (2)
Steven Lord
2024-8-14
If these strings represent data from one of several values in a category, consider storing the data as a categorical array.
str = ["Apple" "Banana" "Apple"; "Cherry" "Cherry" "Apple"; "Mango" "Mango" "Mango"];
C = categorical(str)
What fruits (categories) are present in C?
whichFruits = categories(C)
Can we ask for the most common category in each row?
M = mode(C, 2)
Does this work even if there's a missing value in C?
C(2, 2) = missing
mode(C, 2)
Now in row 2, Apple and Cherry occur equally frequently, but Apple comes first in the list of categories so it's the mode. [Apple (pi) a la mode? ;)]
Can we figure out how many elements of each category are in each row?
[counts, fruit] = histcounts(C(1, :))
or:
counts = countcats(C(1, :)) % No second output, returns counts in categories() order
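To connect the counts back to the mode for every row at once, countcats can also be given a dimension argument. The following is a minimal sketch that rebuilds str and C without the missing value; the names modesFromCounts and modesDirect are illustrative only.

```matlab
str = ["Apple" "Banana" "Apple"; "Cherry" "Cherry" "Apple"; "Mango" "Mango" "Mango"];
C = categorical(str);

% One row of counts per row of C, one column per category,
% in the same order as categories(C)
counts = countcats(C, 2);

% Column index of the largest count in each row; max returns the first
% maximum, so ties go to the category listed first
[~, k] = max(counts, [], 2);

% Look up the category names and compare with mode
cats = categories(C);
modesFromCounts = string(cats(k));
modesDirect = string(mode(C, 2));

disp(isequal(modesFromCounts, modesDirect))
```

Both routes resolve ties in favor of the category that comes first in the category list, so they should agree on data like this.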
Voss
2024-8-12
str = ["Apple" "Banana" "Apple"; "Cherry" "Cherry" "Apple"; "Mango" "Mango" "Mango"]
N = size(str,1);
modes = strings(N,1);
for ii = 1:N
[u,~,idx] = unique(str(ii,:)); % idx indexes into the sorted unique values u
modes(ii) = u(mode(idx)); % look up in u, not in the original row
end
disp(modes)