Are We Breaking the Web With a Hash and a Bang?

The internet is a big and complicated place; it’s called “The Web” for good reason. It’s a giant ball of twisting and turning strings, a mass of connections between isolated blocks of information. This is its nature, its natural inclination — the formation of connections. The technology it’s built upon facilitates this, and through its relatively simple means, allows for an impressively wide array of websites to be created.

One of the newest practices popping up in web app development is known as “the hashbang”. It’s this little fellow: #!. You can find it in the URLs of some of the hottest social networks around. Why is it here? What does it do? Should we embrace it, or fear it? Is it here to stay?

The Why

Much has been written on the subject of why the hashbang exists. Mike Davies writes about the origins of the #! URL in his piece “Breaking the Web with hash-bangs”. He says:

The #!-baked URL (hash-bang) syntax first came into the general web developer spotlight when Google announced a method web developers could use to allow Google to crawl Ajax-dependent websites. … [It] was especially geared for sites that got the fundamental web development best practices horribly wrong, and gave them a lifeline to getting their content seen by Googlebot.

So Google codified this “standard”, throwing a bone to the site owners who improperly built their sites upon a brittle JavaScript foundation, hiding their content from plain sight. That was back in 2009. Two years later we see this technique appearing in big-name sites like Twitter and Facebook. These two in particular have a big reason for wanting to leverage the power of Ajax and client-side scripting as much as possible. More speed.

So developers in search of speed would get to use the latest and greatest, Ajax. And they could now appease the search engine gods thanks to Google’s spec. It looks like a win-win. But there’s more at play here.
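To make that spec a little more concrete: as I understand Google’s scheme, a crawler that meets a #! URL requests a rewritten address instead, with everything after the hashbang moved into an `_escaped_fragment_` query parameter, and the server is expected to answer with a static snapshot. Here is a rough TypeScript sketch of that rewrite. The function name `hashbangToCrawlerUrl` and the example.com addresses are my own inventions, and the escaping is deliberately simplified (it only covers a handful of characters that would otherwise break a query string), so treat it as an illustration rather than the spec itself.

```typescript
// Hypothetical helper: rewrites a hashbang URL into the form a crawler
// would request under Google's Ajax crawling scheme.
function hashbangToCrawlerUrl(url: string): string {
  const marker = "#!";
  const at = url.indexOf(marker);
  if (at === -1) {
    return url; // no hashbang, nothing to rewrite
  }

  const base = url.slice(0, at);
  const fragment = url.slice(at + marker.length);

  // Simplified escaping (an assumption of this sketch): percent-encode only
  // '%', '#', '&', '+' and the space character.
  const escaped = fragment.replace(/[%#&+ ]/g, (ch) =>
    "%" + ch.charCodeAt(0).toString(16).toUpperCase()
  );

  // Append to an existing query string if the URL already has one.
  const joiner = base.indexOf("?") === -1 ? "?" : "&";
  return base + joiner + "_escaped_fragment_=" + escaped;
}

// Example with a made-up address:
const crawlerUrl = hashbangToCrawlerUrl("http://example.com/ajax.html#!key=value");
// crawlerUrl is "http://example.com/ajax.html?_escaped_fragment_=key=value"
```

The point to notice is that the crawler gets its content through an ordinary query string, which the server can see. Everyone else is still depending on JavaScript to read the fragment.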

We’re Web Sites

Twitter was the first big-time site to adopt the hashbang URL scheme, but it wasn’t the site that drew the most heated criticism, the one that spawned a dozen blog posts and set off a Web-wide discussion on the blessings and maledictions of URL design. That honor goes to Gawker Media.

See, Gawker Media issued a redesign of their fleet of blogs on February 7. Not long after, they experienced a five-hour outage across their entire sphere of sites, specifically because of their hashbang URLs. The JavaScript on the page failed to load, and therefore, so did their content.
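Why would a script failure take the content down with it? Because a browser never sends the fragment (the `#` and everything after it) to the server. The server only ever sees the part before it, so with a hashbang URL every story looks like a request for the same root page, and only JavaScript running in the browser knows which story was meant. A small TypeScript sketch of that split follows. The `splitAtFragment` helper and the example address are hypothetical, not anything taken from Gawker’s code.

```typescript
// What travels over HTTP versus what stays behind in the browser.
interface SplitUrl {
  sentToServer: string;  // the part the server actually receives
  keptInBrowser: string; // the fragment, visible only to client-side script
}

// Hypothetical helper: splits a URL at its first '#'.
function splitAtFragment(url: string): SplitUrl {
  const at = url.indexOf("#");
  if (at === -1) {
    return { sentToServer: url, keptInBrowser: "" };
  }
  return { sentToServer: url.slice(0, at), keptInBrowser: url.slice(at) };
}

// A made-up hashbang address for a story:
const story = splitAtFragment("http://example.com/#!/5000/some-story");
// story.sentToServer  is "http://example.com/"
// story.keptInBrowser is "#!/5000/some-story"
```

If the script that interprets `keptInBrowser` never arrives, the server has no way of stepping in, because it was never told which story was wanted.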

And that was the catalyst for the ensuing discussion. The next day Mike Davies blogged extensively on the subject, in the above-quoted article. I strongly encourage anyone interested in the nitty-gritty specifics of this issue to read his article. A day after that, Jeremy Keith wrote “Going Postel” and elaborated on his feelings about the situation, pointing to Davies’ entry, as well as a few others.

What I think is interesting is that the ire of the web community was raised mostly by Gawker’s new usage of the hashbang scheme. It was when a content-based web site adopted the JavaScript-dependent pattern that red flags started going up. And I think I know why.

In the case of a web site, URLs are of obvious importance. On his blog Warpspire, Kyle Neath wrote about why that is. He said:

URLs are universal. They work in Firefox, Chrome, Safari, Internet Explorer, cURL, wget, your iPhone, Android and even written down on sticky notes. They are the one universal syntax of the web. Don’t take that for granted.

The prospective performance gain is far outweighed by the danger of divorcing your content from the means of accessing it. We’ve spent the last decade building sites upon stable web standards, so that at the very least, a site’s content can be viewed by anything that understands HTTP. Gawker’s redesign threw all of that away, and broke the fundamental means that people use to access their site.

We’re Web Apps

Now let’s take a look back at the other prominent site to utilize hashbangs: Twitter. Twitter is the personification of Web 2.0, and the app revolution. According to Ben Cherry, a front-end engineer at Twitter, its web app status lets it play by slightly different rules:

We get URLs in our Web applications as a legacy of Web sites, and this feature is unique to this class of applications. … Unfortunately, supporting a standard URL structure for this type of application is not possible in current browsers, without using the URL hash. So, we use the URL hash to take advantage of this sort of thing, but not because we believe our application to be a traditional Web site. We do lose indexability and some other benefits of HTTP we may otherwise have, but these are not losses to our application, since we did not have to use a URL at all.

Emphasis mine.

That’s an interesting thought: web applications don’t have to use a URL at all. URLs are markers of content, and all a web app may need to mark is the entry point to the app, the root level domain. From there, there are features and services, not necessarily content, right? Ben says later on in his post:

URLs are not necessary in this context, but we provide them (via the hash) because they bring tangible benefits for applications.

Wait a second. They’re not necessary, but they provide tangible benefits. How does that work?

It’s easy to say that a blog or news site is content-based. That URLs are important for them, and that preserving their cleanliness and legibility is imperative. But who wouldn’t say that Twitter is important because of its content? Sure, it’s a web app, but it still has a foundation of content, and access to that content through logical URL schemes is a must.

Where We’re Heading

Hashbangs are an unpleasant stop-gap measure. They’re an arguably necessary evil of today’s app-infested Web. Ben Cherry even says as much in his blog post:

It’s not perfect. It’s not pretty (despite what Google says). But the hashbang is useful. Users benefit from the trend towards full-fledged applications, with increased speed and functionality.

But the thing to understand is, hashbangs aren’t the future. As a part of the HTML5 spec there is a History API. When adoption of that spec reaches a tipping point, developers will be able to get the best of both worlds: clean URLs and slick Ajax.
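One practical consequence of that shift: a site that moves from hashbangs to the History API will still have old #! links floating around, and will want to translate them into clean paths. Here is a minimal TypeScript sketch of that translation. The `hashbangToPath` name is hypothetical, and the assumption that everything after the #! maps directly onto a path is mine; a real site’s routing may well differ.

```typescript
// Hypothetical helper: turns a legacy hashbang fragment into a clean path.
// Returns null when the fragment is not a hashbang (for example, an
// ordinary in-page anchor such as "#comments").
function hashbangToPath(hash: string): string | null {
  if (hash.indexOf("#!") !== 0) {
    return null;
  }
  const rest = hash.slice(2);
  // Ensure the resulting path starts with a single leading slash.
  return rest.charAt(0) === "/" ? rest : "/" + rest;
}

// Examples with made-up fragments:
const cleanPath = hashbangToPath("#!/zachlebar"); // "/zachlebar"
const notLegacy = hashbangToPath("#comments");    // null
```

In a browser that supports the History API, the result could then be handed to `history.replaceState(null, "", cleanPath)`, which swaps the address bar over to the clean URL without reloading the page.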

That’s the future. That’s what we should be looking forward to. We have to remember that the current hashbang syntax is an ugly hack that we should be looking to replace as soon as we’re able. The future for web apps is bright, and it doesn’t involve hashes or bangs.


  • http://hostingz.org desbest

    Any content website that uses a hashbang URL syntax and fails to keep its content indexable with JavaScript off is really questionable and needs to be sorted out.

    Twitter says they get individual tweets indexed, and they can use the old twitter for indexing.
    Facebook has a directory, which lists all public profiles and pages from A to Z.
    Gizmondo.com has Gizmondo.net which gives the fallback version.
    All other Gawker Media blogs don’t have a fallback. Gawker is stupid.

  • http://www.3dy.ro/ Eduard M.

    A hash and a bang? Are we talking about this?
    http://crunchbanglinux.org/

    :)

    • http://zachlebar.com Zach LeBar

      I wasn’t…but I’m curious now. :) Always interested in checking out another Linux distro.

      • http://www.3dy.ro/ Eduard M.

        :)
