The internet is constantly changing. It’s rare for a site to stay the same for years; far more often, sites change so frequently that you can hardly rely on older links at all. That’s a terrible problem for citing sites in research papers, and an even worse one for historians and geeks who’d like to look back at the beginning of the internet, since most of it has already disappeared.
That is, it would have all disappeared if the Internet Archive weren’t around. This ambitious non-profit project aims to build the definitive internet library by snapshotting much of the crawlable web and making it available for anyone to sort through in the Wayback Machine, among other projects. It’s an incredible tool for looking back to see, say, how Apple.com looked in ’98, and it just recently got a facelift and an API that make it even more useful.
Browsing the Web of Yesteryear
The new Wayback Machine
The Internet Archive’s Wayback Machine is an incredible resource, one so valuable that it could have dancing baby GIFs embedded in every page and it’d still be one of the best things about the internet. But a nice redesign is always welcome, and this year’s refresh of the Wayback Machine makes it suddenly feel like a modern website.
It’s still the same tool as before at the core. You type in the site or specific page you want to look up, then select one of the old versions of it. Seconds later, you’ll see the internet of the past, along with a link to the snapshot of that page on that specific date, which you can share or cite in your work. It’s fun, but more than that, it’s the perfect way to recover the pages behind dead links and content that’d otherwise have disappeared from the web.
The citation tool you should be using.
The nicest new feature, though, is the new Save Page Now tool. On the Wayback Machine page, just enter a page you want to save in the Save Page box in the bottom right corner, and it’ll generate a current snapshot of that page and give you a link to reference. That immediately makes it the best way to cite pages in documents, and perhaps on Wikipedia, since the archived link won’t rot even if the original page does. It’s something the Supreme Court should immediately love, since a staggering 49% of the links in its cases have been found to be dead.
Right along with that, the Wayback Machine got a new JSON API that makes it simple to check whether a page is cached in the Wayback Machine and get back a link to the archived copy. There’s also the original CDX API, plus support for the Memento API to get links to snapshots of sites from specific points in time. Combine that with their free Wayback Machine-powered 404 page and a WordPress plugin to check for dead links, and there’s everything you could need to make the cached history of the web easily accessible.
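To give a feel for how little code that JSON API needs, here’s a minimal Python sketch. It assumes the availability endpoint lives at https://archive.org/wayback/available and answers with an "archived_snapshots" object holding a "closest" entry, which is how the API is commonly documented; the function names are my own, made up for illustration, so treat this as a starting point rather than the official client.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

# Assumed endpoint for the Wayback Machine availability (JSON) API.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_availability_url(page_url: str, timestamp: Optional[str] = None) -> str:
    """Build the query URL; timestamp is an optional YYYYMMDD[hhmmss] string."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urllib.parse.urlencode(params)


def closest_snapshot(payload: dict) -> Optional[str]:
    """Return the archived URL from a decoded response, or None if not cached."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def lookup(page_url: str, timestamp: Optional[str] = None,
           timeout: float = 10.0) -> Optional[str]:
    """Ask the API (over the network) for the snapshot closest to timestamp."""
    query = build_availability_url(page_url, timestamp)
    with urllib.request.urlopen(query, timeout=timeout) as response:
        return closest_snapshot(json.load(response))


if __name__ == "__main__":
    # Needs network access; prints an archived link, or None if nothing is cached.
    print(lookup("apple.com", "19980101"))
```

Keeping the URL building and the response parsing separate from the network call means you can reuse them inside a 404 handler or a link checker without caring how the request itself gets made.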
It’s Not Just the Internet
You’ve never seen TV news like this
Webpages disappear at an alarming rate, but they’re not the only medium that’s fleeting. TV shows might hang around on DVD and Netflix for a while, but imagine trying to find what your local TV news said on a specific date. Archiving everything that’s ever broadcast on TV seems far too lofty a task, but the Internet Archive is currently archiving everything from 23 major American TV news networks at their TV News site. It was started during the last US election to make it easier to quote what was said on the news, and it has continued to index the full text of everything broadcast on the stations they’re caching. You can search what’s being said, see the top topics right now, watch short clips where the text was said, and borrow full DVDs from the library for research. It’s the most incredible thing to ever happen to TV news.
Yup, I’m running VisiCalc on my MacBook Air.
The Official Site of Record.
See Google before it was worth billions, courtesy of the Wayback Machine.
Put the API and the Save Page tool together, and the Wayback Machine should start being considered the best way to cite webpages in research and education. The Wayback Machine is fun just for seeing how sites used to look, just like Google Maps Street View is a fun way to go on a virtual trip from your browser. But both are far more important as tools, and these latest additions to the Wayback Machine should get it recognized as one. Sure, there’s the possibility that the project won’t be kept going forever, but I’d say there’s a far greater possibility that the sites you link to will go offline first.
So go have fun remembering what the web looked like back when you got AOL trial floppies and CDs in the mail all the time, and then remember to use the Wayback Machine the next time you’re linking to something you’d hate to see go away. It’s the best guarantee against link rot we have today.