Remove numbers during preprocessing
I would like to remove numbers from text. I have this function or script for the preprocessing; how can I remove all numbers?
% Create co-occurrence network for only class1 and 0 5%
data = dataone.text;
%textdata = data.text;
data = randsample(data,100)
%data=data(1:100,1)
documents = preprocessText(data);
bag = bagOfWords(documents);
bag1 = removeInfrequentWords(bag,2);
counts = bag1.Counts;
cooccurrence = counts.'*counts;
G = graph(cooccurrence,bag1.Vocabulary,'omitselfloops');
Answers (1)
Ergin Sezgin
2022-9-30
Hello Rachele,
Try using the following code with your string array.
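words = ["stringOne", "stringTwo", "2022", "stringThree"];
doubleArray = str2double(words);   % NaN wherever a string is not a valid number
nanIdx = isnan(doubleArray);       % true for the non-numeric strings
wordsArray = words(nanIdx)         % keep only the non-numeric strings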
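Note that the code above drops only elements that are entirely numeric (it would also drop strings such as "1e3" or "Inf", which str2double parses as numbers), and it leaves digits embedded in longer text, like "abc123", untouched. If the goal is to strip digits from inside the text before tokenizing, one possible approach is a regular expression. This is a minimal sketch with made-up sample strings standing in for dataone.text; it is an alternative technique, not part of the original answer.

```matlab
% Hypothetical sample text; in the question this would be dataone.text
data = ["Order 66 was issued in 2022", "abc123def", "no digits here"];

% Delete every run of digits, wherever it appears in the text
noDigits = regexprep(data, '\d+', '');

% Collapse the extra whitespace left behind and trim both ends
noDigits = strtrim(regexprep(noDigits, '\s+', ' '));
```

The cleaned string array could then be passed on in place of data, for example documents = preprocessText(noDigits), assuming preprocessText (the asker's own function, not shown in the thread) accepts a string array.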
Good luck
2 Comments
Ergin Sezgin
2022-9-30
If the issue is with a char array, it's possible to remove all numbers from it by checking each element, either with an explicit loop or with vectorization. If there are multiple char elements in a container, the same method should also work once some additional steps are added. Could you please share some of the data?
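As a rough illustration of both options for a char array, and of the extra step for a container, here is a small sketch. The sample values and variable names are hypothetical, and the cell array is only an assumed container type since the actual data was not shared.

```matlab
txt = 'abc123 def 45';               % hypothetical sample char array

% Vectorized: build a logical mask of digit characters, keep the rest
isDigit = isstrprop(txt, 'digit');
vectorized = txt(~isDigit);

% Explicit loop: copy each character that is not in the range '0'..'9'
looped = '';
for k = 1:numel(txt)
    if ~(txt(k) >= '0' && txt(k) <= '9')
        looped(end+1) = txt(k); %#ok<AGROW>
    end
end

% Several char vectors in a cell array: apply the same mask per cell
c = {'a1b2', 'no digits', '2022'};
cleaned = cellfun(@(s) s(~isstrprop(s, 'digit')), c, 'UniformOutput', false);
```

The vectorized form is usually the shorter and faster of the two; the loop is shown only to make the per-character check explicit. An element that consists solely of digits becomes an empty char vector rather than being removed from the cell array.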