In early 2025, the Library Innovation Lab launched the Public Data Project, a major effort to collect and publish federal datasets with proof of authenticity and provenance. Our work began with copying 311,000 datasets from Data.gov between November 2024 and January 2025. More recently, we moved to capture all public-domain Smithsonian data.
The driving focus for this work is the “one copy problem.” Simply put, information that exists in a single location, is supported by a single funding stream, or is administered by a single entity is at considerable risk of disappearing or, worse, of being changed without notice. This problem, which has long been an area of focus and advocacy for us, threatens our cultural memory: our ability to access the data we need to know what has happened so that we can plan where we are going. The sweeping loss of public data on the federal web beginning in 2025 is only the latest, and largest, demonstration of the internet’s fragility and vulnerability to shocks.
The Public Data Project is equipping a nationwide network of libraries, archives, and nonprofits with the tools they need to safeguard the most vulnerable U.S. federal data and to build the technical, organizational, and human infrastructure required for long-term, low-cost stewardship of public information. The Public Data Project builds on our history with large-scale data and digital preservation projects, such as the Caselaw Access Project and Perma.cc.
The Public Data Project’s current work includes:
- Producing open-source data monitoring tools in collaboration with America’s Data Index;
- Enhancing federal data access and visualization in collaboration with Radiant Earth;
- Developing a graduate-level training curriculum for the next generation of librarians;
- Rethinking inter-governmental and inter-institutional frameworks for digital mutual aid to preserve cultural memory.
Check this page, as well as our blog, for updates to this project. We are grateful for the support of the John D. and Catherine T. MacArthur Foundation and the Rockefeller Brothers Fund (RBF). The opinions and views expressed here do not necessarily state or reflect those of the contributors.