How do I count and save Twitter hashtags?
I am writing a script that analyzes the hashtags in tweets that I saved in a text file. So far I have managed to count the number of hashtags in the file:
fid = fopen('Tweets.txt');
% Read the file line by line; fgetl returns -1 (not a char) at end of file.
twitterStuff = {};
tline = fgetl(fid);
while ischar(tline)
    twitterStuff{end+1} = tline;
    tline = fgetl(fid);
end
fclose(fid);
numberOfTweets = numel(twitterStuff);
% Count every '#' character across all tweets.
numberOfHash = 0;
for i = 1:numberOfTweets
    numberOfHash = numberOfHash + numel(strfind(twitterStuff{i}, '#'));
end
Now, I want to find out what the specific hashtags are and save them into a cell array, but I don't really know how to do that.
2 Comments
Walter Roberson
2012-12-14
Is # by itself a hashtag? Is #this#that with no spaces two hashtags? Is #35 a valid hashtag? Is #? a valid hashtag?
Abim
2012-12-14
Accepted Answer
More Answers (2)
Sean de Wolski
2012-12-14
Edited: Sean de Wolski
2012-12-14
Using regular expressions:
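str = '#MATLAB is an awesome product by #MathWorks';
% The fourth output holds the matched text, i.e. a cell array of the hashtags.
% Note that \w* also matches a bare '#'; use \w+ to require at least one character.
[matchstart,matchend,~,hashtag] = regexp(str,'(\#(\w*))')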
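To apply the same idea to every tweet and end up with a single cell array, something like the following might work. This is only a sketch: the sample tweets are made up to stand in for the lines read from Tweets.txt, and the pattern '#\w+' encodes one possible answer to the questions in the comments above (a bare # is not a hashtag, #this#that counts as two, and #35 is accepted). Adjust the pattern if your definition differs.

```matlab
% Hypothetical sample data standing in for the lines read from Tweets.txt.
twitterStuff = {'#MATLAB is an awesome product by #MathWorks', ...
                'No tags in this one', ...
                'Edge cases: #this#that # #35'};

% '#\w+' requires at least one word character after '#', so a bare '#'
% is skipped (an assumption about what counts as a hashtag).
% With a cell array input, regexp returns one cell of matches per tweet.
tagsPerTweet = regexp(twitterStuff, '#\w+', 'match');

% Flatten the per-tweet results into a single 1-by-N cell array.
hashtags = [tagsPerTweet{:}];

% Unique tags and how often each one occurs.
[uniqueTags, ~, idx] = unique(hashtags);
tagCounts = accumarray(idx(:), 1)';
```

With the sample data, hashtags holds five entries ('#MATLAB', '#MathWorks', '#this', '#that', '#35'), and numel(hashtags) gives the total count without a separate loop over strfind.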