404 is the HTTP “Not Found” error code. When a web application receives a request for a resource or page it can’t make sense of, it should return a 404 error and a page that corresponds to that error. You’ve almost certainly seen a 404 page in your life, and you’ll probably see tons more.
Why a 404 page happens can be down to about 1000 reasons. Maybe you changed CMSes and all your URLs broke. Maybe you intentionally shifted your website and market fit and decided that abandoning your old pages to history was a better move than setting up 301 “Moved Permanently” or 302 “Found” (temporary) redirects. Maybe a user is just randomly typing URLs into your site and you never had anything there in the first place. The absolute best result in each of those situations is different, and how your page can be most helpful is hard to know for certain.
Some General Thoughts on 404 Pages
Before we get to this article’s “one cool trick”, some basic things to consider about your 404 page — whether it’s served from a CMS, a complex web application, or anywhere along that continuum:
- Forwards are better than broken history — There is room to debate this, but I belong to the “Cool URLs Don’t Change” school of web application design. A corollary of that school is the idea that when a change must happen, redirection is better than death. Certainly for sprawling sites and a forced CMS change on a tight budget, this isn’t always feasible. But as much as possible I think you should avoid showing people a 404 page if you can forward them to the new location of what they were looking for. It’s good for both search engine rankings and user experience.
- How You Communicate the Error Matters — A few years ago I stumbled across a small site that MailChimp uses for internal writing guidelines, and its page about errors stuck with me. A highlight is its list of recommendations: “Be straightforward… Be calm… Be serious… Offer a next step.” I encourage you to read the linked page and site for more detail, but it’s important to realize that whether your 404 page’s audience is your technophobic grandfather or a frustrated developer trying to appease the boss currently breathing down her neck, something that’s too cute or too technical will probably grate rather than satisfy. Explain what a 404 means in plain English that everyone understands: “Sorry, the page you requested was not found.”
- Advice About What To Do Is Awesome — Maybe you’re making an API endpoint and your 404 pages should point to your (hopefully good and up to date) documentation. Maybe you’re creating a simple CMS 404 page, and you can easily throw a search box on there, or point to a simple visual sitemap. Whatever the case, if you can’t give the visitor the data they likely expect, pointing them to details about how they might find it is a good step.
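To make the first and last of those points a little more concrete, here is a rough Python sketch of how a site might try a redirect before falling back to a plain-spoken 404 with a next step. Everything named here is invented for illustration: the `REDIRECTS` table, the `LIVE_PATHS` set, the `respond` function, and the `/search` and `/sitemap` links are assumptions, not part of any particular CMS or framework.

```python
from http import HTTPStatus

# Hypothetical map of retired URLs to their new homes.
REDIRECTS = {
    "/old-blog/hello-world": "/blog/hello-world",
}

# Hypothetical set of paths the site currently serves.
LIVE_PATHS = {"/", "/blog/hello-world"}

# Plain wording plus a next step (search box or sitemap link).
NOT_FOUND_BODY = (
    "<h1>Sorry, the page you requested was not found.</h1>\n"
    '<p>Try <a href="/search">searching the site</a> or '
    'browse the <a href="/sitemap">sitemap</a>.</p>\n'
)


def respond(path):
    """Return a (status, headers, body) tuple for a request path."""
    if path in LIVE_PATHS:
        return HTTPStatus.OK, {}, "<h1>Placeholder page content</h1>\n"
    if path in REDIRECTS:
        # Forward the visitor instead of showing them a dead end.
        headers = {"Location": REDIRECTS[path]}
        return HTTPStatus.MOVED_PERMANENTLY, headers, ""
    # Nothing known about this path: a real 404 with advice.
    return HTTPStatus.NOT_FOUND, {}, NOT_FOUND_BODY
```

With this sketch, `respond("/old-blog/hello-world")` produces a 301 pointing at the new location, while an unknown path such as `respond("/nope")` gets the 404 status and the plain-English message. A real application would hang this logic off whatever routing its framework provides.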
History, Archives, and the Internet Archive
By now you may be wishing I’d just get to the one cool trick, but I think a diversion makes sense. And as a student of history I want to soapbox a bit. And I promise it’s relevant.
Basically, history relies on paper trails. Records that are accessible and kept available. We know about the distant past through veils and veils of data loss. Why do we know so much more about Europe than the native peoples of North America? There may be some “white bias” in there, but mostly it’s that Europe made and kept more records for longer and in better storage conditions than pre-Columbian American peoples. This isn’t a cultural distinction, it’s a technological one. More people who can read or write, and more people who value it, means more sources for future historians. Similarly, fewer records destroyed or rendered inaccessible for political, practical, or technological reasons means more sources for future historians.
The Internet Archive is a US not-for-profit group whose intention is to preserve that paper trail of the internet for future historians. They save as much of the internet as they can to make it available in the future. If it’s publicly accessible on the internet, the Internet Archive would like to save a copy of it, and in a large number of cases it does. If you’ve never tried it, their Wayback Machine is a great little way to play with your past, or the collective past of the internet. Type in a page’s URL and quickly see what it used to look like.
Finally, the One Cool Trick
So the one cool trick for 404 pages? It’s that the Internet Archive has offered a great enhancement to you, for free, using its public archive. This wouldn’t have a lot of value in the context of a greenfield project, at least not immediately, but it wouldn’t do any harm. The trick is to add to your 404 pages these two lines of HTML:
<div id="wb404"/>
<script src="//archive.org/web/wb404.js"></script>
Essentially, you’re creating an empty div, which their script fills in if their history has a result for the current URL. If it finds nothing, the div just stays empty. Their blog post from when they released it last year gives a few more details.
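If your 404 page is built by server-side code rather than a static template, the two lines can simply be appended to whatever markup you already produce. The following Python sketch shows one way that might look. The `render_404` function and the surrounding page markup are made up for illustration; only the two Internet Archive lines come from the text above, copied unchanged.

```python
from html import escape

# The two lines from the Internet Archive, copied verbatim.
WAYBACK_SNIPPET = (
    '<div id="wb404"/>\n'
    '<script src="//archive.org/web/wb404.js"></script>\n'
)


def render_404(requested_path):
    """Build a plain 404 page that ends with the Wayback snippet."""
    # Escape the path so a hostile URL cannot inject markup.
    safe_path = escape(requested_path)
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        "<head><title>Page not found</title></head>\n"
        "<body>\n"
        "<h1>Sorry, the page you requested was not found.</h1>\n"
        f"<p>Nothing lives at <code>{safe_path}</code>.</p>\n"
        # Placed last in the body so the rest of the page is unaffected
        # whether or not the archive has anything to show.
        f"{WAYBACK_SNIPPET}"
        "</body>\n"
        "</html>\n"
    )
```

The snippet goes at the end of the body here purely as a layout choice. Put it wherever you would want a link to an archived copy to appear.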
NOTE: I discovered this Internet Archive feature in a tweet from internet superhero Andy Baio (he of the Kind of Bloop IP debacle, the XOXO festival, the excellent Waxy.org, Supercut.org, etc). Thanks for spreading the good word, Andy!