Dan Gillmor has a good post at Salon about archiving the Net, spurred by meetings at the Library of Congress. I'm especially interested in his comments—pointing to a post by Dave Winer—about the role of long-lived institutions, including universities.
Have we all concluded at this point that there is no hope of keeping a full and accurate archive? The Net is too vast, too ever-changing, too complexly linked. I can't even keep a full archive of my own computer; the Mac's Time Machine makes hourly backups, but not every minute or every second, and it only preserves daily backups over the long-ish haul. All records are broken to one degree or another, because records require choices about what's worth recording and energy to do the recording. "Full record" is an oxymoron.
So the question is, what is the right periodicity and scope of the Internet record we want? Usually, questions about archives and records are relative to some use case. But a general record of the Net is like a general record of life: there is no single use case to tell us what to keep. So, we'll just have to make some choices that inevitably will turn out to be wrong for some unanticipated uses. We'll have to deal with it.
Personally, I'm heartened to see this discussion occurring at an institution with the gravitas of the Library of Congress, and that it includes people like Dan and Dave.