How to find mutual words in title field?
The aim is to write a program that scans the "title" field of an Excel file and identifies the items that have words in common in that field (the Excel file is attached). Can I use Text Analytics Toolbox for this, and if so, how should I write the program? Many thanks.
2 comments
Paolo
2018-7-6
Is it a requirement to use the text analytics toolbox? I am sure the problem can be solved easily without it too. What is the expected output?
Answers (2)
Sarah Palfreyman
2018-7-6
Yes, Text Analytics can help with this. It sounds like you are performing Topic Modeling, or grouping the data into sets of similar words.
https://www.mathworks.com/help/textanalytics/examples/analyze-text-data-using-topic-models.html
Christopher Creutzig
2018-11-26
If I read the question correctly, you are looking for a co-occurrence matrix. You can get one from a bag-of-words model by a matrix multiplication:
txt = ["my first title", "your first idea", "my second or minute"];
td = tokenizedDocument(txt);
bow = bagOfWords(td);
cnt = bow.Counts;
full(cnt*cnt.')
ans = 3×3
3 1 1
1 3 0
1 0 4
The first row says the first sentence has three words in common with itself (duh), and one each with the second and third sentences. The second and third strings do not have words in common, hence the zeroes.
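For anyone without Text Analytics Toolbox, here is a rough sketch of the same idea in base MATLAB. It is only an illustration, not the method from the answer above: the titles are the same sample strings (in practice they would come from the "title" column of the spreadsheet), and the variable names words and shared are made up for this example. It splits on whitespace only, so punctuation is not stripped. It also counts each distinct word once, whereas cnt*cnt.' multiplies word counts, so the two differ only when a title repeats a word.

```matlab
% Sample data (same strings as above); replace with the titles from your file.
titles = ["my first title", "your first idea", "my second or minute"];
n = numel(titles);

% Split each title into its set of distinct lower-case words.
words = cell(n, 1);
for i = 1:n
    words{i} = unique(split(lower(titles(i))));
end

% shared(i,j) = number of distinct words that titles i and j have in common.
shared = zeros(n);
for i = 1:n
    for j = i:n
        shared(i, j) = numel(intersect(words{i}, words{j}));
        shared(j, i) = shared(i, j);
    end
end

% List every pair (i < j) that shares at least one word.
[r, c] = find(triu(shared, 1));
for k = 1:numel(r)
    common = intersect(words{r(k)}, words{c(k)});
    fprintf("Titles %d and %d share: %s\n", r(k), c(k), join(common, ", "));
end
```

The upper triangle (triu with offset 1) is used so that each pair is reported once and the diagonal, a title compared with itself, is skipped.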