Getting Started with Caselaw Access Project Data

Today we’re sharing new ways to get started with Caselaw Access Project data using tutorials from The Programming Historian and more.

The Caselaw Access Project makes 360 years of U.S. case law available as a machine-readable text corpus. In developing a research community around the dataset, we’ve been creating and sharing resources for getting started.

Our gallery collects tutorials and an examples repository for working with our data, alongside research results, applications, fun stuff, and more:

Return Cases from 100 Years Ago Today with the CAP API
Retrieve Cases by Citation with the CAP Case Browser
Get Opinion Author
Creating a Data Analysis Workspace with Voyant and the CAP API

The Programming Historian shares peer-reviewed tutorials for computational workflows in the humanities. Here is a group of their guides for working with text data, from processing to analysis:

Cleaning Data with OpenRefine
Reshaping JSON with jq
Basic Text Processing in R
Getting Started with Topic Modeling and MALLET
Corpus Analysis with Antconc
Analyzing Documents with TF-IDF

We want to share and build ways to start working with Caselaw Access Project data. Do you have an idea for a future tutorial? Drop us a line to let us know!