Create Word Cloud from String Arrays
This example shows how to create a word cloud from plain text by reading it into a string array, preprocessing it, and passing it to the wordcloud function. If you have Text Analytics Toolbox™ installed, then you can create word clouds directly from string arrays. For more information, see wordcloud (Text Analytics Toolbox).
Read the text from Shakespeare's Sonnets with the fileread function.
sonnets = fileread('sonnets.txt');
sonnets(1:135)
ans = 'THE SONNETS by William Shakespeare I From fairest creatures we desire increase, That thereby beauty's rose might never die,'
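As a minimal sketch of what fileread returns, the following writes a small temporary file and reads it back. The file contents and the variable names fname, fid, raw, and nNewlines are made up for illustration and are not part of the original example. The result is a single character row vector with the newline characters embedded in it, which is why the next step converts it to a string and splits it into lines.

```matlab
% Write a small throwaway text file (hypothetical contents)
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, 'THE SONNETS\n\nFrom fairest creatures\n');
fclose(fid);

% fileread returns the whole file as one character row vector
raw = fileread(fname);
delete(fname);                      % clean up the temporary file

disp(class(raw))                    % class of the returned value
nNewlines = sum(raw == newline);    % newline characters stay embedded in raw
```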
Convert the text to a string using the string function. Then, split it on newline characters using the splitlines function.
sonnets = string(sonnets);
sonnets = splitlines(sonnets);
sonnets(10:14)
ans = 5×1 string
" From fairest creatures we desire increase,"
" That thereby beauty's rose might never die,"
" But as the riper should by time decease,"
" His tender heir might bear his memory:"
" But thou, contracted to thine own bright eyes,"
Replace some punctuation characters with spaces.
p = ["." "?" "!" "," ";" ":"];
sonnets = replace(sonnets,p," ");
sonnets(10:14)
ans = 5×1 string
" From fairest creatures we desire increase "
" That thereby beauty's rose might never die "
" But as the riper should by time decease "
" His tender heir might bear his memory "
" But thou contracted to thine own bright eyes "
Split sonnets into a string array whose elements contain individual words. To do this, join all the string elements into a 1-by-1 string and then split on the space characters.
sonnets = join(sonnets);
sonnets = split(sonnets);
sonnets(7:12)
ans = 6×1 string
"From"
"fairest"
"creatures"
"we"
"desire"
"increase"
Remove words with fewer than five characters.
sonnets(strlength(sonnets)<5) = [];
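The preprocessing steps above can be tried on a small input to see what each one contributes. This is only a sketch: the two-line array txt and the variable name words are chosen here for illustration, while the calls themselves are the same ones used on the full text. Note that the length filter also discards any empty strings that splitting may produce, since an empty string has length zero.

```matlab
% Two sample lines (illustrative input, not the full sonnets file)
txt = ["From fairest creatures we desire increase,"; ...
       "That thereby beauty's rose might never die,"];

% Replace the listed punctuation characters with spaces
p = ["." "?" "!" "," ";" ":"];
txt = replace(txt, p, " ");

% Join the lines into one string, then split it into individual words
words = split(join(txt));

% Drop words with fewer than five characters (this also removes "")
words(strlength(words) < 5) = [];
disp(words)
```

The apostrophe is not in p, so beauty's is kept as a single word.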
Convert sonnets to a categorical array and then plot using wordcloud. The function plots the unique elements of C with sizes corresponding to their frequency counts.
C = categorical(sonnets);
figure
wordcloud(C);
title("Sonnets Word Cloud")
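To inspect the frequency counts behind the word sizes, one option is to tabulate the categories of the categorical array. The sketch below uses a short made-up word list, and the names words, names, counts, idx, and T are hypothetical. It shows one way to count categories with categories and countcats, and does not claim to be how wordcloud computes sizes internally.

```matlab
% Short illustrative word list (not taken from sonnets.txt)
words = ["beauty"; "sweet"; "beauty"; "heart"; "sweet"; "beauty"];
C = categorical(words);

names  = categories(C);    % unique words, as a cell array of character vectors
counts = countcats(C);     % number of occurrences of each category, same order

% Sort from most to least frequent and collect the result in a table
[counts, idx] = sort(counts, 'descend');
T = table(string(names(idx)), counts, 'VariableNames', {'Word', 'Count'});
disp(T)
```

Categories are case-sensitive, so words that differ only in capitalization are counted separately.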