This tutorial is an introduction to creating a data analysis workspace with Voyant and the Caselaw Access Project API. Voyant is a computational analysis tool for text corpora.

Import a Corpus

Let’s start by retrieving all full cases from New Mexico:

https://api.case.law/v1/cases/?jurisdiction=nm&full_case=true

Copy and paste that API call into the Add Texts box and select Reveal. Here’s more on how to create your own CAP API call.
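
If you'd rather assemble that call in code than type it by hand, here's a minimal Python sketch. It only rebuilds the URL shown above from its two query parameters. The `build_cases_url` helper name is ours for illustration, not part of the CAP API, and nothing here sends a request.

```python
from urllib.parse import urlencode

# Base endpoint taken from the API call shown above.
CASES_ENDPOINT = "https://api.case.law/v1/cases/"


def build_cases_url(jurisdiction, full_case=True):
    """Return a cases URL for one jurisdiction (hypothetical helper)."""
    params = {
        "jurisdiction": jurisdiction,
        # The API call above spells the flag as lowercase "true".
        "full_case": "true" if full_case else "false",
    }
    return CASES_ENDPOINT + "?" + urlencode(params)


url = build_cases_url("nm")
print(url)
```

The printed URL can be pasted into the Add Texts box exactly like the handwritten one.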

Create Stopwords

You’ve just created a corpus in Voyant! Nice 😎. Next we’re going to add stopwords, words that Voyant leaves out of its analysis, to minimize noise in our data.

In Voyant, hover over a section header and select the sliding bar icon to define options for this tool.

Image: blue sliding bar icon with the tooltip "define options for this tool".

From the Stopwords field shown here, select Edit List. Scroll to the end of default stopwords, and copy and paste this list of common metadata fields, OCR errors, and other fragments:

id
url
name
name_abbreviation
decision_date
docket_number 
first_page
last_page
citations
volume 
reporter 
court 
jurisdiction
https
api.case.law
slug
tbe
nthe

Once you’re ready, select Save and then Confirm.

Your stopwords list is done! Here’s more about creating and editing your list of stopwords.
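
To see why stopwords cut down on noise, here's a small Python sketch of the general idea: split text into words, then drop any word that appears on the list. This is an illustration only and makes no claim about how Voyant tokenizes or filters internally. The `remove_stopwords` helper and the sample sentence are made up for this example, and the set holds just a few entries from the list above.

```python
import re

# A few entries from the stopword list above (metadata fields and OCR fragments).
EXTRA_STOPWORDS = {"id", "url", "name", "court", "jurisdiction", "https", "tbe", "nthe"}


def remove_stopwords(text, stopwords=EXTRA_STOPWORDS):
    """Lowercase the text, split it into words, and drop the stopwords."""
    tokens = re.findall(r"[a-z_]+", text.lower())
    return [token for token in tokens if token not in stopwords]


# Hypothetical snippet with two OCR fragments mixed in.
print(remove_stopwords("Tbe court held nthe motion"))
```

Only the words that carry meaning for the analysis are left once the fragments and field names are filtered out.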

Data Sandbox

Now let’s explore the corpus. Voyant has out-of-the-box tools for analysis and visualization that you can try in your browser. Here are some examples!

Summary: “The Summary provides a simple, textual overview of the current corpus, including (as applicable for multiple documents) number of words, number of unique words, longest and shortest documents, highest and lowest vocabulary density, average number of words per sentence, most frequent words, notable peaks in frequency, and distinctive words.”

Here’s our summary for New Mexico case law.
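
A few of the numbers in a summary like this are easy to approximate yourself. The Python sketch below counts words and unique words, and assumes vocabulary density means unique words divided by total words. Voyant's own tokenization may differ, so expect the figures to be close rather than identical. The `summarize` helper and the sample sentence are hypothetical.

```python
import re
from collections import Counter


def summarize(text, top_n=3):
    """Return basic corpus statistics for a single string of text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens)
    return {
        "words": total,
        "unique_words": len(counts),
        # Assumed definition: unique words / total words.
        "vocabulary_density": len(counts) / total if total else 0.0,
        "most_frequent": counts.most_common(top_n),
    }


stats = summarize("The court held the motion")
print(stats)
```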

TermsBerry: “The TermsBerry tool is intended to mix the power of visualizing high frequency terms with the utility of exploring how those same terms co-occur (that is, to what extent they appear in proximity with one another).”

Here’s our TermsBerry.

Collocates Graph: “Collocates Graph represents keywords and terms that occur in close proximity as a force directed network graph.”

Here’s our Collocates Graph.
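
Under the hood, a collocates view needs counts of which words appear near each other. Here's a rough Python sketch of that counting step, assuming "close proximity" means within a fixed number of words. The `collocate_edges` helper, the window size, and the sample tokens are all hypothetical, and Voyant's actual method may differ. Each counted pair could serve as a weighted edge in a network graph.

```python
from collections import Counter


def collocate_edges(tokens, window=2):
    """Count unordered word pairs that occur within `window` words of each other."""
    edges = Counter()
    for i, word in enumerate(tokens):
        # Look ahead only, so each nearby pair is counted once.
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                edges[tuple(sorted((word, other)))] += 1
    return edges


tokens = ["court", "held", "motion", "court", "denied"]
edges = collocate_edges(tokens, window=2)
for pair, weight in edges.most_common():
    print(pair, weight)
```

A larger window links more distant words and produces a denser graph, so it's worth trying a few values.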

Today we created a data analysis workspace with Voyant and the Caselaw Access Project API.

To see how words are used in U.S. case law over time, try Historical Trends. Share what you find with us at info@case.law.