Classification Accuracy is Degrading
I am classifying text based on news headlines and currently achieve an accuracy of more than 80%. I would like to improve it further.
The problem is that when I compute the same accuracy with synonyms added, using the code below:
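% words is a cell array of character vectors, one word per cell
Doc = actxserver('Word.Application');   % start Word as a COM server (Windows only)
% Look up the synonym information and meaning list for every word
X = cellfun(@(word) invoke( Doc,'SynonymInfo',word), words, 'UniformOutput', false);
Synonyms = cellfun(@(X) get(X,'MeaningList'), X, 'UniformOutput', false);
% Put each original word in front of its own meaning list
Synonyms = cellfun(@(X) [words{X}; Synonyms{X}], num2cell(1:numel(words)), 'UniformOutput', false);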
My accuracy drops sharply, to 40% or less.
Why is this happening?
Answers (1)
Walter Roberson
2014-4-24
Because words can be used in different ways, a word S can be a synonym of word A and of word B even though A and B are not synonyms of each other. This is especially likely if you happen to encounter words that are synonyms of one of the many meanings of a word such as "set" or "jack". For example, a "set" of cards or a "set" of dishes is a "collection", and to let glue "set" is to let it "cure", but "collection" and "cure" are not synonyms. Without synonyms you might have had "collection" and "cure" as distinct words, but when you add synonyms you add "set" to both, which links "collection" and "cure" and makes it more difficult to classify headlines that involve those words.
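The linking effect described above can be seen in a small sketch. It does not call Word at all: the synonym table is a hand-written, hypothetical stand-in for whatever 'MeaningList' would return, and the two words are only illustrative. It compares how many features the two words share before and after synonym expansion.

```matlab
% Toy synonym table (hypothetical, hand-written; NOT output from Word).
synTable = containers.Map();
synTable('collection') = {'set', 'group'};
synTable('cure')       = {'set', 'heal'};

% Two words that, in this made-up example, point to different classes.
wordA = 'collection';
wordB = 'cure';

% Features before expansion: only the words themselves.
featA = {wordA};
featB = {wordB};
overlapBefore = intersect(featA, featB);

% Features after expansion: each word plus its synonyms.
expA = [{wordA}, synTable(wordA)];
expB = [{wordB}, synTable(wordB)];
overlapAfter = intersect(expA, expB);

fprintf('Shared features before expansion: %d\n', numel(overlapBefore));
fprintf('Shared features after expansion:  %d\n', numel(overlapAfter));
disp(overlapAfter);
```

With this toy table the two words share nothing before expansion and share "set" afterwards, so a classifier that counts word occurrences would now see the same feature in headlines from both classes.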