How to draw a co-occurrence network using only nouns in MATLAB Text Analytics Toolbox?
Hello,
I am having trouble performing text analytics in MATLAB.
I want to 1) draw a co-occurrence network diagram using only the 100 most frequent nouns, and 2) draw a frequency table/bar plot of the most frequent nouns.
My code is as follows. I have done the part-of-speech (POS) tagging, but I can't work out how to proceed from here.
Thanks in advance!
T = readtable('D:/OneDrive/evpostridereview.csv');
t.desc = T.review;
cleanedDocuments = tokenizedDocument(t.desc); % crashed once, but worked on the second try
cleanedDocuments = addPartOfSpeechDetails(cleanedDocuments);
% Remove a list of stop words then lemmatize the words. To improve
% lemmatization, first use addPartOfSpeechDetails.
cleanedDocuments = removeStopWords(cleanedDocuments); % ran successfully
stopwords =["전기차","하이브리드","현대","기아","아이오닉","쏘나타","카렌스","sm5","소나타","아이오","테슬라","를","의","이","중고차","휴게소","자동차"];
cleanedDocuments = removeWords(cleanedDocuments,stopwords);
cleanedDocuments = normalizeWords(cleanedDocuments,'Style','lemma'); % crashed once, then worked on retry
% Erase punctuation.
cleanedDocuments = erasePunctuation(cleanedDocuments); % succeeded on the first try
% Remove words with 2 or fewer characters, and words with 15 or more
% characters.
cleanedDocuments = removeShortWords(cleanedDocuments,2); % succeeded on the first try
cleanedDocuments = removeLongWords(cleanedDocuments,15);
tdetails = tokenDetails(cleanedDocuments);
head(tdetails)
% Extract nouns
nouns = tdetails.Token(tdetails.PartOfSpeech=='noun');
% Word cloud for nouns
figure
wordcloud(nouns)
title("EVPost EV Driving Review Word Cloud")
% Co-occurrence network
bag = bagOfWords(cleanedDocuments);
counts = bag.Counts;
cooccurence=counts.'*counts;
figure
G = graph(cooccurence,bag.Vocabulary,'omitselfloops');
LWidths = 5*G.Edges.Weight/max(G.Edges.Weight);
plot(G,'LineWidth',LWidths)
title("Co-occurrence Network")
% Set the center keyword
word = "디자인"
idx = find(bag.Vocabulary == word);
nbrs = neighbors(G,idx);
bag.Vocabulary(nbrs)'
H = subgraph(G,[idx; nbrs]);
LWidths = 5*H.Edges.Weight/max(H.Edges.Weight);
plot(H,'LineWidth',LWidths)
title("Co-occurrence Network - Word: """ + word + """");
2 Comments
Piyush Dubey
2023-6-26
The code seems algorithmically correct. Can you elaborate on what issue you are facing while creating the co-occurrence network?
Answers (1)
Saksham
2023-8-18
Hi 상원 음,
I understand that you already have code for a co-occurrence network and want to build the network from only the 100 most frequent nouns.
I also noticed that the code already extracts the nouns into the “nouns” variable. After the comment “% Co-Occurrence Network”, please pass the “nouns” variable to the “bagOfWords” function.
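One possible way to keep the per-document structure that a co-occurrence count needs is to build the document-by-noun count matrix directly from the “tokenDetails” table. The following is only a sketch under assumptions: the toy sentences stand in for T.review, the names “docNumber”, “vocabulary”, and “topN” are illustrative, and tokens tagged “proper-noun” are not included.

```matlab
% Toy documents standing in for T.review (assumption: the real data is not shown).
str = [
    "The battery charges quickly and the design looks modern."
    "The design of the dashboard and the battery range impressed me."
    "Charging stations near the highway improve the range."];
documents = tokenizedDocument(str);
documents = addPartOfSpeechDetails(documents);

% Keep only tokens tagged as nouns, remembering their source document.
tdetails = tokenDetails(documents);
isNoun = tdetails.PartOfSpeech == 'noun';
nounTokens = lower(tdetails.Token(isNoun));
docNumber = tdetails.DocumentNumber(isNoun);

% Documents-by-nouns count matrix (repeated index pairs are summed).
[vocabulary,~,wordIdx] = unique(nounTokens);
counts = sparse(docNumber,wordIdx,1,numel(documents),numel(vocabulary));

% Keep the topN most frequent nouns (100 in the question).
topN = min(100,numel(vocabulary));
[~,order] = sort(full(sum(counts,1)),'descend');
keep = order(1:topN);
counts = counts(:,keep);
vocabulary = vocabulary(keep);

% Nouns co-occur when they appear in the same document.
cooccurrence = counts.'*counts;
G = graph(cooccurrence,vocabulary,'omitselfloops');
LWidths = 5*G.Edges.Weight/max(G.Edges.Weight);
figure
plot(G,'LineWidth',LWidths)
title("Noun Co-occurrence Network")
```

With the real data, “documents” would be replaced by “cleanedDocuments” from the question, and the rest of the neighbor/subgraph code can then use “vocabulary” in place of “bag.Vocabulary”.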
To find the 100 most frequent nouns, you can count the frequency of each word and then filter the words accordingly. To learn more about counting word frequency, please follow the link below:
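As a rough illustration of the counting and filtering step, the sketch below builds a frequency table and bar plot from a string array of nouns using base MATLAB functions. The “nouns” values here are made-up sample data, and “topN”, “freqTable”, and “topTable” are illustrative names; in the question, “nouns” would come from tdetails.Token(tdetails.PartOfSpeech=='noun').

```matlab
% Hypothetical noun tokens standing in for the extracted "nouns" variable.
nouns = ["design";"battery";"range";"design";"battery";"design";"seat"];

topN = 3;  % the question asks for 100; 3 keeps the toy example readable

% Count occurrences of each distinct noun.
[uniqueNouns,~,idx] = unique(nouns);
nounCounts = accumarray(idx,1);

% Frequency table sorted from most to least frequent.
freqTable = table(uniqueNouns,nounCounts,'VariableNames',{'Noun','Count'});
freqTable = sortrows(freqTable,'Count','descend');
topTable = freqTable(1:min(topN,height(freqTable)),:);
disp(topTable)

% Bar plot of the most frequent nouns, in table order.
figure
bar(categorical(topTable.Noun,topTable.Noun),topTable.Count)
ylabel("Count")
title("Most Frequent Nouns")
```

If a “bagOfWords” model is already available, the toolbox function “topkwords” may be a shorter route to the same table.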
I hope the suggestion and resource above are useful to you.
Sincerely,
Saksham