Exploring Caselaw Interfaces

Courts and the legal publishers that serve them, by necessity, are creatures of habit. A case's fundamental structure hasn't changed much, whether published early in the 19th century or during the COVID pandemic. Even when publishers started taking their wares online, they didn't stray far from their well-worn model. In many ways, that's a good thing. I imagine legal research and writing would be much more arduous if fundamental case elements were as inconsistent as citation schema over the years.

But we think these cases have undiscovered uses beyond informing legal arguments. We know that NLP (Natural Language Processing) folks have already made use of the API and bulk download tools we built at http://case.law. Still, the most frequently accessed pages on our website are individual case pages reached by visitors from Google. What are their needs? Historical research? Family history? ... leisure? Even if the fundamental structure of a case is necessarily immutable, are there opportunities for novel interfaces to bring these works to new audiences?


The first step I took was to assemble a list of actions that people perform on collections of things.

a hand-scribbled list of verbs

Among these ideas, I was most interested in enhancing people's ability to cut through the endless walls of text we serve up to find what they're looking for. This is a more cut-and-dried topic for an interface exploration, so I spent most of my time there.

I am also interested in humanizing the stories behind these cases through narrative. Too often, the technical analysis of these legal documents overshadows the fact that they describe real events in real people's lives. Not only have the subjects of these cases often endured gruesome, traumatic events, but the trials themselves are often traumatic. While I only lightly touched on this direction here, I'd very much like to explore it in the future.

The Results

Topic Explorer

Topic Explorer is a simple idea based on data or a data interface that does not exist. What if you could find the number of cases that contain a specific word and then get a list of the most frequently used important words in those cases?

an inverted triangle cut into sections each with search terms and results

At that point, you could add that word to your search.

an inverted triangle cut into sections each with search terms and results

Or hide it to expose more words.

an inverted triangle cut into sections each with search terms and results

Exclude it from your search to go in a different direction.

an inverted triangle cut into sections each with search terms and results
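The interaction above boils down to a filter-and-count loop. Here is a minimal sketch in Python, assuming a tiny in-memory corpus and a hypothetical stop-word list standing in for whatever "important words" ends up meaning. The function and variable names are made up for illustration and are not part of any existing CAP interface.

```python
from collections import Counter
import re

# Hypothetical stop-word list; a real "important word" filter would be richer.
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "in", "was", "for", "by", "that"}


def tokenize(text):
    """Lowercase a case's text and return its set of distinct words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def explore(cases, include, exclude=(), hidden=(), top_n=5):
    """Return (match_count, top_words) for cases containing every included
    term and none of the excluded ones. Hidden words are left out of the
    suggestions but do not change which cases match."""
    include, exclude = set(include), set(exclude)
    skip = STOP_WORDS | include | set(hidden)
    counts = Counter()
    matches = 0
    for text in cases:
        words = tokenize(text)
        if include <= words and not (exclude & words):
            matches += 1
            # Each case contributes at most one count per word.
            counts.update(words - skip)
    return matches, counts.most_common(top_n)


# Toy corpus, invented for this example.
cases = [
    "The tenant sued the landlord for breach of the lease.",
    "The landlord evicted the tenant after the lease expired.",
    "The railroad was liable for the damaged cargo.",
]
matches, top_words = explore(cases, include=["tenant"])
```

Adding a word to the search means appending it to `include`, hiding it means passing it in `hidden`, and going in a different direction means passing it in `exclude`.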

Trace Topic

Though it grows from the same interest in exploring a topic, this approach is a bit different. Within a case, you could highlight a word and then see how frequently that word appears in cases that cite to the case you're reading, and in cases that cite to those cases. From there, you could drill down from that topic into its different usages within related cases.

a picture of a document with one word highlighted, and a number of documents around it.

The color of the case represents the relevance of the term in that search, or whatever else you want it to be, really.
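One way to picture the data behind this view is a breadth-first walk through citing cases. The sketch below assumes a hypothetical `cited_by` mapping (case id to the ids of cases that cite it) and a `texts` mapping of case id to full text. Both are stand-ins, not real CAP structures.

```python
import re


def trace_topic(term, start, cited_by, texts, max_depth=2):
    """Walk outward from `start` through citing cases, breadth first, and
    record how often `term` appears in each case, grouped by depth."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    seen = {start}
    frontier = [start]
    result = {}
    for depth in range(1, max_depth + 1):
        next_frontier = []
        for case_id in frontier:
            for citing in cited_by.get(case_id, []):
                if citing not in seen:  # visit each case only once
                    seen.add(citing)
                    next_frontier.append(citing)
        result[depth] = {c: len(pattern.findall(texts[c])) for c in next_frontier}
        frontier = next_frontier
    return result


# Invented example: B and C cite A, and D cites both B and C.
cited_by = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
texts = {
    "A": "The easement ran with the land.",
    "B": "The easement was valid.",
    "C": "No easement; the easement lapsed.",
    "D": "Title passed to the buyer.",
}
trace = trace_topic("easement", "A", cited_by, texts)
```

The per-case counts could then drive the color of each document in the picture.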

Clandestine Conversation

This completely different approach to digging into a specific topic involves trying to facilitate conversation among readers. Maybe someone could annotate a highlighted passage with an invitation to discuss it.

Enter the text: a screenshot of a portion of text with a bit highlighted, and a small box pointing to it in which some text is entered in an input field

Users see a symbol: a screenshot of a portion of text with a bit highlighted, and a small "i" icon next to it

They click on it and get the invitation: a screenshot of a portion of text with a bit highlighted, and a small box pointing to it in which some text invites a user to converse about the highlighted text

Ratings and Reviews

Maybe people have feelings about cases best expressed through star ratings and reviews? Frankly, they probably don't, but it seemed like too familiar an idiom to ignore.

a screenshot of a caselaw viewing toolbar interface with a "Ratings and review" section added, like on an ecommerce site.

If you haven't had a chance to check out our trends viewer, I highly recommend you drop what you're doing and play for a little while. Like Google's Ngram viewer, it will tell you the frequency with which a word appears in cases over time. You can even split it up by jurisdiction! However, if you want to see how something trends in ALL jurisdictions, it's a little tough to read.

Rather than having all years and jurisdictions visible, I represented jurisdictions on a map and added a year scrubber control. You can get the precise numbers for that year from the list on the right.

a map of the United States on which the states are varying in opacity based on some data, a timeline above it, and a data table to the right

3D Timeline Explorer

Our developer Anastasia is working on a very cool legally-focused storytelling interface we call Timeline. Its users can create timelines that include cases, important dates and events, and narrative. Inspired by some of the new proximity conferencing tools, such as gather.town, I designed an interface in which someone could explore one of these timelines in a 3D environment.

Users access different bits of media when moving their sprite over different hot spots on the timeline.

a 3d cartoon depiction of a hallway with a timeline on the floor, marked with various hot spot symbols for sounds, movies or articles

Since we are primarily a caselaw database, court cases would probably get special treatment. Each case could have a virtual courtroom with different hot spots for different participants in the process.

a 3d cartoon depiction of a courtroom marked with various hot spot symbols for sounds, movies or articles

Sound of an Opinion

Like Topic Explorer, Sound of an Opinion would require data we don't yet have. Using pre-made or algorithmically-created sound clips, we would convey the emotional tone and other measurable facets of an opinion based on text sentiment analysis. In my simplistic demo, I correlate positivity/negativity with instrumentation and scale, verb density with the drumline volume, and adjective density with the drumline complexity. The sound clips were created in Logic Pro X using Apple Loops and their algorithmic drum beat creator.

a screenshot of a sound tile board

Check out this live ProtoPie demo (it will not work in Safari).

Next Steps

While few, if any, of these ideas will be fully realized, unencumbered, blue-skies thinking is time well spent around here. We've already started investigating the feasibility of generating and serving sentiment analysis data through our API. Do any of these ideas excite you? Do you have any ideas of your own you think belong here? Reach out and let us know!

This Is Just Amazing

The other day, I noticed this on the side of the house.

Category 5 cable with broken jacket

That is near the bottom of the run of Cat 5 Ethernet cable I installed over twenty years ago, from the cable modem and router in the basement through a window frame, up the side of the house and into the third floor through another hole in a window frame. What I found amazing was not so much that the cable, neither shielded nor rated for the out-of-doors, had lasted so long in such an amateurish installation, but that all of our Zoom meetings for the last eight months had passed through these little wires.

The really amazing part, beyond the near-magic of all that audio and video flying through little twists of copper, is the depth of dependency: at each end of that cable is hardware that changes voltages on the wires, operating system drivers for interacting with the hardware, the networking stacks of the operating systems that offer network interfaces to software, the software itself, the systems of authentication and authorization that the software uses to permit or deny access—a cascade of protocols, standards, devices, programming languages, and codebases that become the (mostly) seamless experience of the discussion we have at ten each morning. Or, a moment later, the experience of confirming that the city has accepted the ballot I mailed.

Starry-eyed delight in an amazing machine is clearly not sufficient, with as good a view as we now have of the broken dream of a liberatory Internet. We have to have an acute awareness of the system accidents implicit in our tools and the societal technologies that are connected to them. I believe the delight is necessary, though—without it, I don't see how we can ever learn to treat computers as anything other than an apparatus of control. There's hope, if a grimy cable with a broken jacket can carry joy.

Tech Tip: Sorting Cases on Analysis Fields

Last month we announced seven new data fields in the Caselaw Access Project. Here are API calls to the cases endpoint that demonstrate how to sort on these fields. Note the query strings, especially the use of the minus sign (-) to reverse order.

All cases ordered by PageRank, a measure of significance, in reverse order, so the most significant come first:


All cases sorted by word count, from longest to shortest:
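As a rough illustration of what such query strings might look like, here is a small Python helper that builds them. The base URL, the `ordering` parameter name, and the two field names are assumptions made for this sketch and should be checked against the API documentation before use.

```python
from urllib.parse import urlencode

# Assumed base URL for the cases endpoint; adjust to the API you are calling.
CASES_ENDPOINT = "https://api.case.law/v1/cases/"


def cases_url(ordering):
    """Build a cases-endpoint URL sorted on `ordering`.

    A leading minus sign reverses the order, so the largest values come first.
    """
    return CASES_ENDPOINT + "?" + urlencode({"ordering": ordering})


# Field names are assumptions based on the analysis fields described in this post.
by_pagerank = cases_url("-analysis.pagerank.percentile")
by_word_count = cases_url("-analysis.word_count")
```

Dropping the minus sign from either value would sort in ascending order instead.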


Introducing CAP Case Analysis

We’re announcing a new layer of information in the Caselaw Access Project.

Among seven new data fields are PageRank, the all-time significance of a case based on our citation graph, and Cardinality, the number of unique words in a case. These and other fields derived from case text allow us to do things like identifying the longest court opinion ever published, or investigating how language in cases has changed over time. You can view analysis fields in the sidebar when browsing or via the API.
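Text-derived fields such as word count and Cardinality are straightforward to approximate on your own. Here is a rough sketch, assuming a simple lowercase word tokenizer. How CAP actually tokenizes case text is not stated here, so values computed this way may differ from the published fields.

```python
import re


def analysis_fields(text):
    """Approximate two text-derived fields for a single case."""
    # Assumed tokenizer: runs of letters, digits, and apostrophes.
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {
        "word_count": len(words),        # total words
        "cardinality": len(set(words)),  # unique words
    }


fields = analysis_fields("The court held that the court erred.")
```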

We want to hear about what you’ve learned and created using these fields. Let us know!

Summer 2020 CAP Systems Update

Today we’re sharing an update to Caselaw Access Project systems. This change shows one way libraries can support access to large datasets at low cost. Here’s how we did it.

Unlike many services that run in the cloud, CAP runs on bare-metal servers. Running on bare metal solves two problems for us as a nonprofit: it gets us faster servers for less money, and it means we can offer high-traffic or CPU-intensive services to our users without risking an unexpected bill at the end of the month.

In the last few weeks we moved our main server to a new 64-core CPU with all-SSD storage. As long as we were doing that, we took the opportunity to upgrade our stack from Debian 9 to Debian 10, Python 3.5 to 3.7, Postgres 9.6 to 11, and Elasticsearch 6 to 7, as well as updating our own software to be compatible with the new stack.

The upshot is that our most resource-intensive tasks, like citation extraction, bulk exports, and rebuilding our search index, now run about 20 times faster than they did a few weeks ago. This helps us move large amounts of data more quickly, for less money. We're looking forward to using that faster server for new features, like custom, on-demand bulk exports for researchers.

We like to talk about the systems behind CAP. Have questions about how CAP works? Let us know!

Caselaw Access Project Cite Grid

Today we’re sharing Cite Grid, a first visualization of our citation graph data. Citation graphs are a way to see relationships between cases, and to answer questions like “What’s the most cited jurisdiction?” and “What year was the most influential in U.S. case law?”

You can explore this visualization two ways. The map view allows you to select a jurisdiction, and view inbound and outbound citations. This shows states more likely to cite that jurisdiction in a darker color. For example, when viewing Texas, the states Missouri and California are shown as most likely to cite that state.

Map view showing inbound citations to Texas, with Missouri and California shown as most likely to cite that state.

The grid view allows you to view the percentage of citations by and to each state. Here’s an example! When we select one square, we can see that 1.4% of cases from Colorado cite California.
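If you want to build a similar grid yourself, the calculation can be sketched as follows. This assumes each case is reduced to a hypothetical pair of its own jurisdiction and the jurisdictions of the cases it cites, and that a cell means "percentage of the row state's cases that cite the column state at least once". The tool's exact definition may differ.

```python
from collections import Counter, defaultdict


def cite_grid(cases):
    """Given (jurisdiction, cited_jurisdictions) pairs, one per case, return
    grid[src][dst] = percentage of src's cases citing at least one dst case."""
    totals = Counter()
    citing = defaultdict(Counter)
    for src, cited in cases:
        totals[src] += 1
        for dst in set(cited):  # count each case once per cited jurisdiction
            citing[src][dst] += 1
    return {
        src: {dst: 100.0 * n / totals[src] for dst, n in row.items()}
        for src, row in citing.items()
    }


# Invented sample: four Colorado cases and one California case.
sample = [
    ("Colo.", ["Cal.", "Cal.", "Tex."]),
    ("Colo.", ["Colo."]),
    ("Colo.", []),
    ("Colo.", ["Cal."]),
    ("Cal.", ["Cal."]),
]
grid = cite_grid(sample)
```

In this toy sample, two of the four Colorado cases cite California, so that square of the grid would read 50%.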

Grid view showing 1.4% of cases from Colorado citing to California.

Do you want to create your own visualization with the data supporting this tool? We’re sharing the dataset here. If you’re using our citation graph data, we want to hear about it, and help you spread the word!

Guest Post: An Empirical Study of Statutory Interpretation in Tax Law

This guest post is part of the CAP Research Community Series. This series highlights research, applications, and projects created with Caselaw Access Project data.

Jonathan H. Choi is a Fellow at the New York University School of Law and will join the University of Minnesota Law School as an Associate Professor in August 2020. This post summarizes an article recently published in the May 2020 issue of the New York University Law Review, titled An Empirical Study of Statutory Interpretation in Tax Law, available here on SSRN.

Do agencies interpret statutes using the same methodologies as courts? Have agencies and courts changed their interpretive approaches over time? And do different interpretive tools apply in different areas of law?

Tax law provides a good case study for all these questions. It has ample data points for comparative analysis: the IRS is one of the biggest government agencies and has published a bulletin of administrative guidance on a weekly basis for more than a hundred years, while the Tax Court (which hears almost all federal tax cases) has been active since 1942. By comparing trends in interpretive methodology at the IRS and Tax Court, we can see how agency and court activity has evolved over time.

The dominant theoretical view among administrative law scholars is that agencies ought to take a more purposivist approach than courts—that is, agencies are more justified in examining indicia of statutory meaning like legislative history, rather than focusing more narrowly on the text of the statute (as textualists would). Moreover, most administrative law scholars believe that judicial deference (especially Chevron) allows agencies to select their preferred interpretation of the statute on normative grounds, when choosing between multiple competing interpretations of statutes that are “reasonable.”

On top of this, a huge amount of tax literature has discussed “tax exceptionalism,” the view that tax law is different and should be subject to customized methods of interpretation. This has a theoretical component (the tax code’s complexity, extensive legislative history, and specialized drafting process) as well as a cultural component (the tax bar, from which both the IRS and the Tax Court draw, is famously insular).

That’s the theory—but does it match empirical reality? To find out, I created a new database of Internal Revenue Bulletins and combined it with Tax Court decisions from the Caselaw Access Project. I used Python to measure the frequency of terms associated with different interpretive methods in documents produced by the IRS, the Tax Court, and other federal courts. For example, “statutory” terms discuss the interpretation of statutes, “normative” terms discuss normative values like fairness and efficiency, “purposivist” terms discuss legislative history, and “textualist” terms discuss the language canons and dictionaries favored by textualists.
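The measurement described above can be sketched in a few lines of Python. The term lists here are short hypothetical stand-ins (the article's actual dictionaries are not reproduced in this post), and the normalization to occurrences per million words is an assumption about how the frequencies were scaled.

```python
import re
from collections import defaultdict

# Hypothetical term lists, for illustration only.
TERMS = {
    "purposivist": ["legislative history", "committee report"],
    "textualist": ["plain meaning", "dictionary"],
}


def term_frequency_by_year(documents, terms=TERMS, per=1_000_000):
    """documents: iterable of (year, text) pairs.
    Returns {year: {category: occurrences per `per` words}}."""
    hits = defaultdict(lambda: defaultdict(int))
    words = defaultdict(int)
    for year, text in documents:
        lowered = text.lower()
        words[year] += len(re.findall(r"[a-z']+", lowered))
        for category, phrases in terms.items():
            for phrase in phrases:
                # Whole-phrase matches only, case-insensitive via lowering.
                pattern = r"\b" + re.escape(phrase) + r"\b"
                hits[year][category] += len(re.findall(pattern, lowered))
    return {
        year: {cat: per * hits[year][cat] / words[year] for cat in terms}
        for year in words
        if words[year]
    }


docs = [
    (1950, "The legislative history and the committee report control."),
    (1990, "The plain meaning in the dictionary controls here."),
]
freq = term_frequency_by_year(docs)
```

Plotting each category's value against the year gives the kind of trend line shown in the graphs that follow.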

It turns out that the IRS has indeed shifted toward considering normative issues rather than statutory ones:

Graph showing "Statutory and Normative Terms in IRS Publications" and the relationship between year and Normalized Term Frequency.

In contrast, the Tax Court has fluctuated over time but has been stable in the relative mix of normative and statutory terms:

Graph showing "Statutory and Normative Terms in Tax Court Decisions" and the relationship between year and Normalized Term Frequency.

On the choice between purposivism and textualism, we can compare the IRS and the Tax Court with the U.S. Supreme Court. The classic story at the Supreme Court is that purposivism rose up during the 1930s and 1940s, peaked around the 1970s, and then declined from the 1980s onward, as the new textualism of Justice Scalia and his conservative colleagues began to dominate jurisprudence at the Supreme Court:

Graph showing "Purposivist and Textualist Terms in Supreme Court Decisions" and the relationship between year and Normalized Term Frequency.

Has the IRS followed the new textualism? Not at all—it shifted toward purposivism in the 1930s and 1940s, but has basically ignored the new textualism:

Graph showing "Purposivist and Textualist Terms in IRS Publications" and the relationship between year and Normalized Term Frequency.

In contrast, the Tax Court has completely embraced the new textualism, albeit with a lag compared to the Supreme Court:

Graph showing "Purposivist and Textualist Terms in Tax Court Decisions" and the relationship between year and Normalized Term Frequency.

Overall, the IRS has shifted toward making decisions on normative grounds and has remained purposivist, as administrative law scholars have argued. The Tax Court has basically followed the path of other federal courts toward the new textualism, sticking with its fellow courts rather than its fellow tax specialists.

That said, even though the Tax Court has shifted toward textualism like other federal trial courts, it might still differ in the details: it could favor some specific interpretive tools (e.g., certain kinds of legislative history, certain language canons) over others. To test this, I used Python’s scikit-learn package to train an algorithm to distinguish between opinions written by the Tax Court, the Court of Federal Claims (a federal court specializing in money claims against the federal government), and federal District Courts. The algorithm used a simple logistic regression classifier, with tf-idf transformation, in a bag-of-words model that vectorized each opinion using a restricted dictionary of terms related to statutory interpretation.

The algorithm performed reasonably well—for example, here are bootstrapped confidence intervals reflecting the performance of the algorithm in classifying opinions between the Tax Court and the district courts, showing Matthews correlation coefficient, accuracy, and F1 score. The white dots represent median performance over the bootstrapped sample; the blue bars show the 95-percent confidence interval, the green bars show the 99-percent confidence interval, and the red line shows the null hypothesis (performance no better than random). The algorithm performed statistically significantly better than random, even at a 99-percent confidence level.
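For readers unfamiliar with bootstrapped confidence intervals, here is a minimal standard-library sketch for the Matthews correlation coefficient. It uses the percentile method, which is an assumption, since the post does not say which bootstrap variant the article used. The function names and sample data are illustrative.

```python
import math
import random


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0/1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0 when a row or column of the confusion matrix is empty.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def bootstrap_ci(y_true, y_pred, metric=mcc, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap: resample (label, prediction) pairs with
    replacement and return (lower, median, upper) of the metric."""
    rng = random.Random(seed)
    pairs = list(zip(y_true, y_pred))
    scores = []
    for _ in range(n_resamples):
        sample = [rng.choice(pairs) for _ in pairs]
        t, p = zip(*sample)
        scores.append(metric(t, p))
    scores.sort()
    tail = (1 - level) / 2
    lower = scores[int(tail * (n_resamples - 1))]
    upper = scores[int((1 - tail) * (n_resamples - 1))]
    return lower, scores[n_resamples // 2], upper


# Invented labels: 1 = Tax Court, 0 = district court; four predictions are wrong.
y_true = [1] * 20 + [0] * 20
y_pred = [1] * 18 + [0] * 2 + [0] * 18 + [1] * 2
lower, median, upper = bootstrap_ci(y_true, y_pred)
```

If the lower bound stays above the value expected from random guessing (0 for this coefficient), the classifier is doing better than chance at that confidence level.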

Confidence intervals showing "the performance of the algorithm in classifying opinions between the Tax Court and the district courts, showing Matthews correlation coefficient, accuracy, and F1 score. The white dots represent median performance over the bootstrapped sample; the blue bars show the 95-percent confidence interval, the green bars show the 99-percent confidence interval, and the red line shows the null hypothesis (performance no better than random)."

Because the classifier used logistic regression, we can also analyze individual coefficients to see which particular terms more strongly indicated a Tax Court decision or a District Court decision. The graph of these terms is below, with terms more strongly associated with the District Courts below the line in red, and the terms more strongly associated with the Tax Court above the line in green. These terms were all statistically significant using bootstrapped significance tests and correcting for multiple comparisons (using Šidák correction).
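For reference, the Šidák correction lowers the per-test significance threshold so that the chance of any false positive across m independent comparisons stays at the chosen level. A short sketch follows; the term names and p-values are invented for illustration and are not results from the article.

```python
def sidak_alpha(alpha, m):
    """Per-test threshold that keeps the family-wise error rate at `alpha`
    across `m` independent comparisons."""
    return 1 - (1 - alpha) ** (1 / m)


def significant_terms(p_values, alpha=0.05):
    """Keep the terms whose p-value clears the Šidák-corrected threshold."""
    threshold = sidak_alpha(alpha, len(p_values))
    return {term: p for term, p in p_values.items() if p < threshold}


# Hypothetical p-values for two terms.
kept = significant_terms({"legislative history": 0.001, "dictionary": 0.03})
```

With two comparisons at the 0.05 level, the corrected threshold is roughly 0.0253, so only the first term survives.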

Graph showing individual terms and the strength of their relationship to District Courts or Tax Court.

Finally, I used regression analysis (two-part regression to account for distributional issues in the data) to test whether the political party of the Tax Court judge and/or the case outcome could predict whether an opinion was written in more textualist or purposivist language. The party of the Tax Court judge was strongly predictive of methodology, but case outcome (whether the taxpayer won or the IRS won) was not.

Table showing "Regression Results for Party Affiliation in Tax Court Opinions, 1942 - 2015" including dependent variables for purposivist and textualist terms per million words, for "Democrat", "Year Judge Appointed", "Taxpayer Wins", "Opinion Year Fixed Effects", and "N".

The published paper contains much more detail about data, methods, and findings. I’m currently writing another paper using similar methodology to test the causal effect of Chevron deference on agency decisionmaking, so any comments on the methods in this paper are always appreciated!

Data Science for Case Law: A Course Collaboration

We just wrapped up a unique, semester-long collaboration between the Library and the data science program at SEAS.

This semester Jack Cushman and I joined the instructors of Advanced Topics in Data Science (CS109b) to lead a course module called Data Science for Case Law. Working closely with instructors, we challenged the students by asking them to apply data science methods to generate case summaries (aka "headnotes") with cases from CAP.

The course partnered with schools across campus to create six course modules, from predicting how disease spreads with machine learning, to understanding what galaxies look like using neural networks. We introduced our module by reviewing and discussing a case, and framed our goal around the need for freely available case summaries.

This challenge was a highlight of the semester. Students presented their work at the end of the term, which included multiple approaches to creating case summaries - like supervised and unsupervised models for machine learning and more.

We’re looking forward to new collaborations in the future, and want to hear from you. Have ideas? Let’s talk!

Caselaw Access Project Nominated for a Webby: Vote for Us!

The Caselaw Access Project has been nominated for one of the 24th Annual Webby Awards. We’re honored to be named alongside this year’s other nominees, including friends and leaders in the field like the Knight First Amendment Institute.

CAP makes 6.7 million cases freely available online from the collections of Harvard Law School Library. We’re creating new ways to access the law, such as our case browser, bulk data and downloads for research scholars, and graphs that show how words are used over time.

Brown v. Board of Education, 347 U.S. 483, 98 L. Ed. 873, 74 S. Ct. 686 (1954)

If you like what we're doing, we would greatly appreciate a minute of your time to vote for the Webby People’s Voice Award in the category Websites: Law.

Do you have ideas to share with us? Send them our way. We’re looking forward to hearing from you.