Find frequency of words from different books

I have a cell array of data collected from 5 different books.
Each cell is itself a cell array that gives me the count of each word in one book (I used count{ii} = tabulate(text{ii})).
I need to create a single count for every word found across all 5 books. So, for example, for the word 'the', I have to sum its frequencies from all 5 cells.
I was thinking about using a table, but I can't get it to work.
Any ideas?

Accepted Answer

Voss 2024-3-25
Maybe this will help:
% example data:
counts = { ...
{'the' 464; 'project' 87; 'of' 253} ...
{'the' 300; 'of' 314; 'nothing' 17; 'project' 13} ...
{'the' 100; 'price' 99; 'of' 114; 'everything' 12; 'value' 88; 'nothing' 54} ...
}
counts = 1x3 cell array
{3x2 cell} {4x2 cell} {6x2 cell}
% concatenate the cell arrays in counts and convert into a table
T = cell2table(vertcat(counts{:}),'VariableNames',{'word','count'})
T = 13x2 table
         word         count
    ______________    _____
    {'the'       }     464
    {'project'   }      87
    {'of'        }     253
    {'the'       }     300
    {'of'        }     314
    {'nothing'   }      17
    {'project'   }      13
    {'the'       }     100
    {'price'     }      99
    {'of'        }     114
    {'everything'}      12
    {'value'     }      88
    {'nothing'   }      54
% use groupsummary to find the total counts
G = groupsummary(T,'word','sum')
G = 7x3 table
         word         GroupCount    sum_count
    ______________    __________    _________
    {'everything'}        1             12
    {'nothing'   }        2             71
    {'of'        }        3            681
    {'price'     }        1             99
    {'project'   }        2            100
    {'the'       }        3            864
    {'value'     }        1             88
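One caveat for real data: as far as I know, tabulate returns three columns for text input (value, count, percent), not the two columns used in the example data above. A minimal sketch of how the same approach could be adapted, assuming each book's cell has that three-column layout; the variable name counts3 and the percent values are made up for illustration:

```matlab
% Hypothetical tabulate-style data: each row is {word, count, percent}.
% The percent values are placeholders and are not used below.
counts3 = { ...
    {'the' 464 57.7; 'project' 87 10.8; 'of' 253 31.5} ...
    {'the' 300 46.6; 'of' 314 48.8; 'nothing' 17 2.6; 'project' 13 2.0} ...
    };

% keep only the word and count columns of each book
counts = cellfun(@(c) c(:,1:2), counts3, 'UniformOutput', false);

% concatenate, convert to a table, and sum the counts per word
T = cell2table(vertcat(counts{:}), 'VariableNames', {'word','count'});
G = groupsummary(T, 'word', 'sum');
```

The percent column is dropped because per-book percentages cannot be summed meaningfully across books; only the raw counts are.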
2 Comments
L 2024-3-25
That is exactly what I needed. What does the GroupCount column mean?
Voss 2024-3-25
You're welcome!
GroupCount is the number of times each word appears in the table T, so that would correspond to the number of books each word appears in. I don't think you need that information (it's automatically included by groupsummary), and you can remove it.
% example data:
counts = { ...
{'the' 464; 'project' 87; 'of' 253} ...
{'the' 300; 'of' 314; 'nothing' 17; 'project' 13} ...
{'the' 100; 'price' 99; 'of' 114; 'everything' 12; 'value' 88; 'nothing' 54} ...
};
% concatenate the cell arrays in counts and convert into a table
T = cell2table(vertcat(counts{:}),'VariableNames',{'word','count'});
% use groupsummary to find the total counts
G = groupsummary(T,'word','sum');
% remove GroupCount
G = removevars(G,'GroupCount')
G = 7x2 table
         word         sum_count
    ______________    _________
    {'everything'}        12
    {'nothing'   }        71
    {'of'        }       681
    {'price'     }        99
    {'project'   }       100
    {'the'       }       864
    {'value'     }        88
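If you then want the most frequent words first, one possible follow-up is to rename the summed column and sort on it. This is only a sketch: the column name total is my own choice, and renamevars assumes R2020a or later. G is rebuilt here from the totals shown above so the snippet runs on its own.

```matlab
% G as produced above (GroupCount already removed)
G = table({'everything';'nothing';'of';'price';'project';'the';'value'}, ...
          [12;71;681;99;100;864;88], 'VariableNames', {'word','sum_count'});

% give the summed column a shorter name ('total' is an arbitrary choice)
G = renamevars(G, 'sum_count', 'total');

% sort so that the most frequent words come first
G = sortrows(G, 'total', 'descend');
```

With the example data, the first three rows after sorting are 'the', 'of', and 'project'.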
