Remove short words from documents or bag-of-words model
newDocuments = removeShortWords(documents,len)
newBag = removeShortWords(bag,len)
newDocuments = removeShortWords(documents,len) removes words of length len or less from documents.
newBag = removeShortWords(bag,len) removes words of length len or less from the bagOfWords object bag.
Remove the words with two or fewer characters from a document.
document = tokenizedDocument("An example of a short sentence");
newDocument = removeShortWords(document,2)

newDocument = 
  tokenizedDocument:

   3 tokens: example short sentence
Remove the words with two or fewer characters from a bag-of-words model.
documents = tokenizedDocument([ ...
    "an example of a short sentence"
    "a second short sentence"]);
bag = bagOfWords(documents);
newBag = removeShortWords(bag,2)

newBag = 
  bagOfWords with properties:

        NumWords: 4
          Counts: [2×4 double]
      Vocabulary: ["example"    "short"    "sentence"    "second"]
    NumDocuments: 2
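To see what the bag-of-words call changes, it can help to compare the model before and after. The following is a minimal sketch, assuming Text Analytics Toolbox is installed; the variable names numWordsBefore, numWordsAfter, and countsAfter are illustrative and not part of the function's interface.

```matlab
% Build the same two-document model used in the example above.
documents = tokenizedDocument([ ...
    "an example of a short sentence"
    "a second short sentence"]);
bag = bagOfWords(documents);

% Vocabulary size before removal: an, example, of, a, short, sentence, second.
numWordsBefore = bag.NumWords;

% Remove words with two or fewer characters ("an", "of", "a").
newBag = removeShortWords(bag,2);
numWordsAfter = newBag.NumWords;

% Counts is stored as a sparse matrix; convert it to a full matrix to
% read it. Rows correspond to documents, columns to newBag.Vocabulary.
countsAfter = full(newBag.Counts);
```

The columns for the removed words are dropped from Counts together with the vocabulary entries, so NumDocuments stays at 2 while NumWords shrinks.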
documents: Input documents, specified as a tokenizedDocument array.
bag: Input bag-of-words model, specified as a bagOfWords object.
len: Maximum length of words to remove, specified as a positive integer. The function removes words with len or fewer characters.
newDocuments: Output documents, returned as a tokenizedDocument array.
newBag: Output bag-of-words model, returned as a bagOfWords object.
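Because len is an inclusive bound, a word whose length equals len is removed. The sketch below illustrates that boundary on the sentence from the first example, assuming Text Analytics Toolbox is installed. It uses string to convert a scalar tokenizedDocument to a string array of its words; the names lengths, kept4, and kept5 are illustrative.

```matlab
document = tokenizedDocument("An example of a short sentence");

% Character counts per word: An(2) example(7) of(2) a(1) short(5) sentence(8).
lengths = strlength(string(document));

% With len = 4, "short" (5 characters) is longer than len and is kept.
kept4 = string(removeShortWords(document,4));

% With len = 5, "short" has exactly len characters and is removed.
kept5 = string(removeShortWords(document,5));
```

To keep words of a given length n, pass n-1 as len.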
Introduced in R2017b
removeWords | stopWords | removeLongWords | normalizeWords | tokenizedDocument | bagOfWords | bagOfNgrams