The following post was authored by Digital Archivist Paul Kelly.
My previous blog entry covered the digitization of physical things. Well, paper at least. We’re pretty on top of that! You take a page, scan it, and kablammo – it’s online, so to speak. But how do we deal with records that are already digital, like, say, web pages? Do we print them? Stick them in folders? What would that even look like? Printed out, the internet would run to 136 billion pages, apparently. Granted, CUA’s site is tiny compared to the internet as a whole, but you get the general idea.
As it turns out, though, non-profit organization The Internet Archive has been doing more all these years than hosting bootlegs of old Smashing Pumpkins gigs. In fact, since 1996 they’ve also been saving snapshots of websites (47 billion and counting), all of which are accessible through the Wayback Machine. If you know the URL, chances are you can score some older version of the page. I dare you to find my Dead Journal from 2001.