I'd like any suggestions or guidance on how to create a word cloud using only MATLAB tools/functions.
"Count the words, throw away boring words, and sort by the count, descending. Keep the top N words for some N. Assign each word a font size proportional to its count. Generate a Java2D Shape for each word, using the Java2D API.

Each word "wants" to be somewhere, such as "at some random x position in the vertical center". In decreasing order of frequency, do this for each word:

    place the word where it wants to be
    while it intersects any of the previously placed words
        move it one step along an ever-increasing spiral

That's it. The hard part is in doing the intersection-testing efficiently, for which I use last-hit caching, hierarchical bounding boxes, and a quadtree spatial index (all of which are things you can learn more about with some diligent googling)."
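To make the quoted steps concrete, here is a rough sketch of how I imagine them translating to MATLAB. It is not the quoted author's implementation: the sample text, the stop-word list, the box model (width of about 0.6 times the font size per character), the spiral step sizes, and the start position at the origin are all my own assumptions. It also uses plain rectangle overlap tests instead of the caching, hierarchical bounding boxes, and quadtree mentioned in the quote.

```matlab
% Sketch: count words, drop stop words, keep the top N, then place one
% rectangle per word along a growing spiral until it no longer overlaps.
txt = 'the cat sat on the mat the cat ran to the mat and the dog saw the cat';
stopWords = {'the', 'on', 'to', 'and'};        % hypothetical stop list

words = regexp(lower(txt), '[a-z]+', 'match');
words = words(~ismember(words, stopWords));
[uniqWords, ~, idx] = unique(words);
counts = accumarray(idx(:), 1);
[counts, order] = sort(counts, 'descend');     % most frequent first
uniqWords = uniqWords(order);

N = min(5, numel(uniqWords));
fontSize = 12 * counts(1:N);                   % size proportional to count

% True if box a = [x y w h] overlaps any row of B (lower-left corner form).
overlaps = @(a, B) ~isempty(B) && any( ...
    a(1) < B(:,1) + B(:,3) & B(:,1) < a(1) + a(3) & ...
    a(2) < B(:,2) + B(:,4) & B(:,2) < a(2) + a(4));

boxes = zeros(N, 4);
for k = 1:N
    w = 0.6 * fontSize(k) * numel(uniqWords{k});   % assumed text width
    h = fontSize(k);                               % assumed text height
    t = 0;
    while true
        r = 2 * t;                                 % Archimedean spiral radius
        cand = [r * cos(t) - w / 2, r * sin(t) - h / 2, w, h];
        if ~overlaps(cand, boxes(1:k-1, :))
            break
        end
        t = t + 0.1;                               % one step along the spiral
    end
    boxes(k, :) = cand;
end
```

Each row of `boxes` is then the placement for the corresponding entry of `uniqWords`. The brute-force overlap test is quadratic in the number of words, which is probably fine for a few hundred words but is exactly the part the quote says needs a smarter data structure.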
My ultimate goal is to recreate the design of the above image (source: UPenn, "Personality, Gender, and Age in the Language of Social Media: The Open-Vocabulary Approach"). I think I need a way to create text objects that can be moved around a plot. Each word is associated with two values: one determines the font size, the other determines the font color. I'd also like to extend the design to include mini word clouds around the perimeter of the central word cloud. Any suggestions?
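For the text objects themselves, this is the kind of thing I have in mind, as a minimal sketch. The words, the two value vectors, the font-size scaling, and the choice of the `jet` colormap are placeholders I made up. `text`, `set`, and `get` with the `Position` and `Extent` properties are standard graphics calls, and `Extent` should give the measured bounding box that a placement loop could use for overlap tests.

```matlab
% Sketch: one text object per word, sized by one value and colored by another.
words    = {'alpha', 'beta', 'gamma'};   % placeholder words
sizeVal  = [3 2 1];                      % placeholder value driving font size
colorVal = [0.1 0.5 0.9];                % placeholder value driving font color

fig = figure('Visible', 'off');
axis([0 1 0 1]);
axis off
hold on

% Map colorVal linearly onto the rows of a 64-entry colormap.
cmap = jet(64);
cIdx = 1 + round((colorVal - min(colorVal)) / (max(colorVal) - min(colorVal)) * 63);
fontSize = 8 + 6 * sizeVal;              % assumed scaling, tune to taste

h = cell(1, numel(words));
for k = 1:numel(words)
    h{k} = text(0.5, 0.5, words{k}, ...
        'FontSize', fontSize(k), ...
        'Color', cmap(cIdx(k), :), ...
        'HorizontalAlignment', 'center');
end

% Move a word after creation and read back its bounding box in data units.
set(h{1}, 'Position', [0.3 0.7 0]);
ext = get(h{1}, 'Extent');               % [left bottom width height]
```

The mini word clouds around the perimeter could presumably reuse the same approach with a different center point per cloud, but I have not worked that part out.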