Introduction to Text Analysis with Voyant

Voyant (https://voyant-tools.org/) is an online tool that allows anyone to experiment with text analysis. No programming required! It’s all presented as a graphical webpage. You can try Voyant out with provided corpora, or you can submit your own text(s) as links, by copying and pasting, or by uploading files.

Once you’ve selected a corpus or supplied your own, you’ll be taken to a page that visualizes the text(s) in multiple ways. You can see an example below, based on the WPA Slave Narratives.

Voyant displaying analyses of the WPA Slave Narratives

Perhaps the first thing you’ll notice is the word cloud panel titled “Cirrus” in the upper left of the page. This word cloud shows the most common words in the corpus, with more frequent words drawn larger. You can use the “Terms” slider at the bottom to change the number of words displayed, and there are more options if you hover near the question mark and click the switch-shaped button that appears. If your corpus is made up of more than one text, you can also use the “Scale” drop-down (next to the “Terms” slider) to limit the word cloud to just one text rather than the aggregate of the corpus. The “Terms” view (at the top of the pane) provides a more numeric display of the data, showing the word counts and graphs of their relative frequency in your corpus’s texts.
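If you are curious what sits behind a word cloud, the underlying idea is a simple word count. The sketch below is only an illustration, not Voyant's code: the sample text, the tiny stopword list, and the `top_terms` helper are all assumptions made up for this example.

```python
import re
from collections import Counter

# Hypothetical sample text; in Voyant this would be your own corpus.
text = "The river rose and the river fell, and the road by the river was long."

# A tiny, assumed stopword list of very common words to leave out.
STOPWORDS = {"the", "and", "was", "by", "a", "of"}

def top_terms(text, n=3):
    """Return the n most frequent non-stopword terms with their counts."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return counts.most_common(n)

print(top_terms(text, 1))  # the single most frequent term and its count
```

A word cloud then draws each term at a size that grows with its count.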

The other graphical display you’ll notice is in the top right corner. The “Trends” pane starts by showing the relative frequency of the top words across the corpus. Each column is a text and each data point is the relative frequency of a word in that text. This view reveals whether a word’s usage is fairly stable across a corpus or whether it varies a lot.
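Relative frequency can be sketched as a word's count divided by the total number of words in a text. The example below assumes that plain definition (Voyant may scale or tokenize differently), and the two-text corpus and the `relative_frequency` helper are hypothetical.

```python
import re

# Hypothetical two-text corpus, for illustration only.
corpus = {
    "text_a": "The river rose. The river fell.",
    "text_b": "The road was long and the night was cold by the river.",
}

def relative_frequency(term, text):
    """Occurrences of term divided by the total number of words in text."""
    words = re.findall(r"[a-z']+", text.lower())
    return words.count(term) / len(words) if words else 0.0

# One value per text, like one data point per column in a trends graph.
for name, text in corpus.items():
    print(name, round(relative_frequency("river", text), 3))
```

Dividing by the text's length is what makes short and long texts comparable: "river" appears twice in six words in the first text but only once in twelve words in the second.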

Part of Voyant’s power is that it links together different methods of text analysis. If you click a word in the Cirrus word cloud, it will appear in the Trends graph. The word is also highlighted in the upper center “Reader” pane, which allows you to read one of the texts. Along the bottom, the Reader pane shows where that word appears across the corpus’s texts. The texts are represented as different colored blocks with the appearances of the word graphed across them. Clicking a block displays the corresponding text.

Selecting a word in the Reader makes it appear in the Trends graph just like clicking on a word in Cirrus did. However, picking a word in Reader also activates the bottom right “Contexts” panel where all the appearances of that word are displayed along with the words on either side of it. This view helps you understand each occurrence’s usage.
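This kind of display is often called a keyword-in-context (KWIC) view. As a rough sketch of the idea, and not a description of how Voyant builds it, the hypothetical `contexts` function below collects a few words on either side of each occurrence; the sample sentence and the `window` parameter are assumptions.

```python
import re

def contexts(text, term, window=3):
    """Return (left, keyword, right) for each occurrence of term."""
    words = re.findall(r"[\w']+", text)
    rows = []
    for i, word in enumerate(words):
        if word.lower() == term.lower():
            left = " ".join(words[max(0, i - window):i])
            right = " ".join(words[i + 1:i + 1 + window])
            rows.append((left, word, right))
    return rows

# Hypothetical sample text.
sample = "The river rose in spring. By summer the river fell again."
for left, keyword, right in contexts(sample, "river", window=2):
    print(f"{left:>12} | {keyword} | {right}")
```

Lining up every occurrence this way makes it easier to compare how the same word is used in different places.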

Finally, the bottom left “Summary” panel contains an overview of your corpus. It tells you which of your texts are the longest and shortest, the average number of words per sentence in each text, and the most frequent words, among other data points. Perhaps the most interesting part of this panel is the “Distinctive words” section, which lists the words that appear disproportionately more in each text compared to the rest of the corpus.
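One simple way to think about "disproportionately more" is to compare a word's rate in one text with its rate in all the other texts. The sketch below uses that ratio with add-one smoothing; it is an assumed, simplified measure chosen for illustration, and Voyant's own scoring may well differ. The corpus and the function names are hypothetical.

```python
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

def distinctive_words(corpus, name, n=3):
    """Rank words in corpus[name] by how over-represented they are versus the rest."""
    target = Counter(tokenize(corpus[name]))
    rest = Counter()
    for other, text in corpus.items():
        if other != name:
            rest.update(tokenize(text))
    target_total = sum(target.values())
    rest_total = sum(rest.values())
    scores = {}
    for word, count in target.items():
        # Add-one smoothing so words absent from the rest do not divide by zero.
        target_rate = (count + 1) / (target_total + 1)
        rest_rate = (rest[word] + 1) / (rest_total + 1)
        scores[word] = target_rate / rest_rate
    return sorted(scores, key=scores.get, reverse=True)[:n]

# Hypothetical three-text corpus.
corpus = {
    "text_a": "the river rose and the river fell and the river ran",
    "text_b": "the road was long and the road was cold",
    "text_c": "the night was long and the night was quiet",
}
print(distinctive_words(corpus, "text_a", 1))
```

Note that "the" is just as frequent as "river" in the first text, but because it is common everywhere it scores low, which is the point of a distinctiveness measure.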

Voyant gives you a lot of ways to explore texts. Try using its many views with their different options, and experiment with different texts and corpora!
