Photograph of the Smithsonian Institution Building in Washington, D.C.
Smithsonian Institution building, from Wikimedia Commons

We are excited to announce today that the Library Innovation Lab has expanded our Public Data Project beyond datasets available through Data.gov to include 710 TB of data from the Smithsonian Institution, the complete open access portion of the Smithsonian’s collections. This marks an important step in our long-running mission to preserve large-scale public collections, both for our patrons and for posterity.

Scanned image of suffragette ribbon that reads, "Votes for Women — Brooklyn Woman Suffrage Association — 1869 — 'Failure Is Impossible' — S. B. A."
From the National Museum of American History. Creative Commons 0 License

The Smithsonian has an incredible 157.5 million items and specimens, of which 18.4 million are searchable and 5.1 million are released under a public domain license. Together they offer an extraordinary view of the American experience: everything from Thomas Jefferson’s own compilation of Bible verses to 3D images of the grand piano owned and used by Thelonious Monk, from Samuel Morse’s transcription of the first telegraph message sent in 1844 to the Women’s Suffragette Ribbon.

Since its founding in 1846, the Smithsonian’s mission has been to pursue “the increase and diffusion of knowledge.” In the past, the public could share in that knowledge only by visiting Smithsonian museums in person. Now that its collections are also digital, we are grateful to be able to do our part in preserving and sharing our nation’s cultural heritage.

Our initial collection includes some 5.1 million items and 710 TB of data. As is always our practice, we have cryptographically signed these items to ensure provenance. We are also exploring resilient techniques for sharing access to them, and we plan to launch that access in the future.

From the National Museum of African American History and Culture. Creative Commons 0 License