Import a Corpus
Let’s start by retrieving all full cases from New Mexico:
Copy and paste that API call into the Add Texts box and select Reveal. Here’s more on how to create your own CAP API call.
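If you'd rather build the call in code, here's a minimal sketch. It assumes the CAP API exposes a `v1/cases/` endpoint that accepts `jurisdiction` and `full_case` query parameters, and that `nm` is the slug for New Mexico. Check all three against the API documentation before relying on them. The helper `build_cap_query` is a hypothetical name used only for illustration.

```python
from urllib.parse import urlencode

# Assumption: base endpoint for the CAP API cases resource.
CAP_CASES_ENDPOINT = "https://api.case.law/v1/cases/"


def build_cap_query(jurisdiction: str, full_case: bool = True) -> str:
    """Return a URL requesting cases for one jurisdiction (hypothetical helper)."""
    params = {
        "jurisdiction": jurisdiction,          # assumed parameter name
        "full_case": str(full_case).lower(),   # assumed parameter name; "true" or "false"
    }
    return f"{CAP_CASES_ENDPOINT}?{urlencode(params)}"


# Assumption: "nm" is the jurisdiction slug for New Mexico.
url = build_cap_query("nm")
print(url)
```

The printed URL is what you would paste into the Add Texts box.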
You’ve just created a corpus in Voyant! Nice 😎. Next we’re going to create stopwords to minimize noise in our data.
In Voyant, hover over a section header and select the sliding bar icon to define options for this tool.
From the Stopwords field shown here, select Edit List. Scroll to the end of the default stopwords, then copy and paste this list of common metadata fields, OCR errors, and other fragments:
id url name name_abbreviation decision_date docket_number first_page last_page citations volume reporter court jurisdiction https api.case.law slug tbe nthe
Once you’re ready, select Save and Confirm.
Your stopwords list is done! Here’s more about creating and editing your list of stopwords.
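To see what a stopword list does to the text, here's a rough sketch in Python. It is not how Voyant works internally. The tokenizer is a simple regular expression chosen for illustration, the sample sentence is made up, and only a few entries from the list above are included.

```python
import re

# A few entries from the stopword list above; Voyant's default list is not reproduced here.
EXTRA_STOPWORDS = {
    "id", "url", "name", "decision_date", "docket_number",
    "volume", "reporter", "court", "jurisdiction", "tbe", "nthe",
}


def remove_stopwords(text, stopwords):
    """Lowercase the text, split it into letter/underscore tokens, and drop stopwords."""
    tokens = re.findall(r"[a-z_]+", text.lower())
    return [token for token in tokens if token not in stopwords]


# Made-up sample mixing metadata fragments, an OCR error ("tbe"), and real content.
sample = "id 123 court New Mexico Supreme Court tbe water rights appeal"
kept = remove_stopwords(sample, EXTRA_STOPWORDS)
print(kept)
```

The metadata labels and the OCR fragment drop out, leaving the words that carry meaning for the analysis.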
Summary: "The Summary provides a simple, textual overview of the current corpus, including (as applicable for multiple documents) number of words, number of unique words, longest and shortest documents, highest and lowest vocabulary density, average number of words per sentence, most frequent words, notable peaks in frequency, and distinctive words."
Here’s our summary for New Mexico case law.
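A few of the Summary numbers are easy to approximate by hand. The sketch below assumes vocabulary density means unique words divided by total words, and it uses a simple tokenizer and a made-up sample sentence, so its counts won't match Voyant's exactly.

```python
import re
from collections import Counter


def summarize(text, top_n=3):
    """Compute rough word counts, vocabulary density, and the most frequent words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(tokens)
    return {
        "words": len(tokens),
        "unique_words": len(counts),
        # Assumption: density = unique words / total words.
        "vocabulary_density": len(counts) / len(tokens) if tokens else 0.0,
        "most_frequent": counts.most_common(top_n),
    }


# Made-up sample text, not taken from the corpus.
sample = "The court held that the water rights belong to the state. The state appealed."
summary = summarize(sample)
print(summary)
```

Notice that "the" tops the most-frequent list here, which is exactly the kind of noise a stopword list removes.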
TermsBerry: "The TermsBerry tool is intended to mix the power of visualizing high frequency terms with the utility of exploring how those same terms co-occur (that is, to what extent they appear in proximity with one another)."
Collocates Graph: "Collocates Graph represents keywords and terms that occur in close proximity as a force directed network graph."
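Both TermsBerry and Collocates Graph are built on the idea of words appearing near each other. Here's a small sketch of that idea: count which words fall within a fixed window around a keyword. The window size, the tokenizer, and the sample text are assumptions for illustration, and Voyant's own settings may differ.

```python
import re
from collections import Counter


def collocates(text, keyword, window=2):
    """Count words that appear within `window` tokens of each occurrence of `keyword`."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != keyword:
            continue
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:  # skip the keyword occurrence itself
                counts[tokens[j]] += 1
    return counts


# Made-up sample text, not taken from the corpus.
sample = "water rights appeal; water rights dispute; property rights appeal"
near_rights = collocates(sample, "rights", window=1)
print(near_rights.most_common())
```

In a network graph, each keyword and its collocates become nodes, and counts like these determine how strongly they are linked.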
Today we created a data analysis workspace with Voyant and the Caselaw Access Project API.