Welcome to Web 2.0 - Please Wait While Your Page Loads, Part 1

This is the first in a two-part series exploring the current state of Web 2.0 and the need for better optimization in the age of Rich Internet Applications. Part 1 gives an extremely brief history of Internet development and takes stock of the current state of the World Wide Web, specifically in terms of various big-name and AJAX-enabled sites. It also discusses the large download sizes inherent in Rich Internet Applications and their impact on end-user perceptions. Part 2 introduces various client-side optimization methods that every Web 2.0 developer should perform and every Web 2.0 client should expect from their Internet solutions partner, and explores these optimization methods in detail.

In the Beginning...

In the good old days of the Internet - back in that other millennium - the majority of people surfed the World Wide Web over a 56 kbps modem (or worse!). Bandwidth was at a premium. Download times were gigantic. Most people were running Windows 95 or Windows 98 on 386, 486, or Pentium I boxes using Lynx, I mean Mosaic, I mean Mozilla, I mean Internet Explorer... [ visit http://www.livinginternet.com/w/wi_browse.htm for a quick and interesting look at Web Browser history ] In short, web browsers were springing up everywhere, there were very few standards, and it took forever to download a web page that was anything more complicated than some text and perhaps an interlaced GIF or two.

During that nascent age of the Internet, web developers were faced with the daunting task of keeping the size of their pages to a minimum. Cascading style sheets hadn't come onto the scene, so everything was based on tables, and images were few and far between. JavaScript, which started to show up in web pages in the late 1990s, never really worked right, and was too slow to be practical for anything but the most basic of tasks. The main concern of web developers, after getting the correct information on the page in an organized manner, was to keep those pages small enough that people could actually download them in less time than it took to smoke a cigarette.
 

Got Bandwidth - Will Travel

Fast forward from the 1990s to 2008, and more than 80% of all Internet users in the United States (and over 93% when you consider just the work force) are on broadband connections, according to a WebSiteOptimization.com report. A September 2007 survey by the Pew Internet and American Life Project found that over 50% of Americans have "high-speed Internet connections" at home.

At the same time, over the last 15 or so years, the W3C and other organizations have managed to standardize more-or-less all of the core components of the Web (XHTML, CSS, JavaScript, RSS). Web sites have evolved to contain full-color web-optimized images and interactive content, including streaming media and client-side Web 2.0 interfaces that push the bleeding edge of JavaScript and Cascading Style Sheets. It's not uncommon these days to encounter web pages that contain upwards of 200 KB of text and embedded data in your usual daily browsing experience [ see Current State of the Internet, below ].

Web 2.0 Makes Interfacing Fun / Welcome to the Dark Side

With the rise of broadband, the adoption of XML for information encapsulation on the Web, and the maturity of JavaScript, the so-called Web 2.0 was born. Asynchronous JavaScript and XML (AJAX) came into prime time, because browsers could finally handle the complex JavaScript code quickly enough and the boom in available bandwidth made it possible to send data back and forth with a reasonably tolerable latency.

Interfaces could be redesigned. Users could interact with web pages in an entirely new way. Data could be dynamically loaded up or updated in a page without having to reload the whole thing... Widgets could pop up, fly around, float, fade in and out, and do all of those other cool effects in real-time that users have become accustomed to thanks to the various personal computer operating system flavors. The whole world was suddenly wide-open, and anything seemed possible.

At a price.

Sprinkle a site with a few cool dynamic HTML and AJAX items, and you won't really notice the increased load when your browser downloads and executes the JavaScript code necessary to make it happen. Build a full-featured Rich Internet Application loaded with AJAX, however, and things could just come to a crawl (or even halt) when your browser has to pull down a couple hundred kilobytes of code files just to run a page... Welcome to the Dark Side.

Current State of the Internet

To see how some current leading web pages stack up, the team at Congruent Media ran a set of web page download speed reports using the very convenient Web Developer Toolbar plug-in for the Firefox web browser. "Total Size" includes all HTML, Cascading Style Sheets, JavaScript files, images, and any other embedded media that the browser is required to download in order to render and execute the intended user experience. "Broadband Speed" refers to the average download time required if accessing the page and related assets over a T1 1.44Mbps connection. "Modem Speed" refers to the average download time if the same page and assets were accessed via a 56 kbps modem with a 0.7 correction factor to simulate packet loss. Both sets of results include 0.2 second corrections per object for round-trip connection latency. These tests were performed on the afternoon of July 23rd, 2008. Copies of the complete reports from WebSiteOptimization.com can be found here.
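The arithmetic behind those two speed columns can be sketched in a few lines. This is only our reading of the description above, not the reporting tool's actual code: the function and constant names are made up for illustration, and we assume that one kilobit means 1,024 bits, which appears to reproduce the reported figures.

```python
# Rough model of the speed report: raw transfer time plus a fixed
# latency correction per object. All names here are illustrative only.

LATENCY_PER_OBJECT_S = 0.2   # round-trip correction per object
T1_KBPS = 1.44 * 1024        # "T1 1.44Mbps" line rate in kilobits per second
MODEM_KBPS = 56              # 56 kbps dial-up modem
MODEM_CORRECTION = 0.7       # correction factor simulating packet loss


def estimate_download_seconds(total_bytes, num_objects, rate_kbps, correction=1.0):
    """Return the estimated seconds needed to fetch a page and its objects."""
    # Assumption: 1 kilobit = 1024 bits.
    effective_bits_per_second = rate_kbps * 1024 * correction
    transfer_seconds = total_bytes * 8 / effective_bits_per_second
    return transfer_seconds + num_objects * LATENCY_PER_OBJECT_S


if __name__ == "__main__":
    # Apple home page from the report: 180,611 bytes in 22 objects.
    broadband = estimate_download_seconds(180_611, 22, T1_KBPS)
    modem = estimate_download_seconds(180_611, 22, MODEM_KBPS, MODEM_CORRECTION)
    print(f"broadband: {broadband:.2f} s, modem: {modem:.2f} s")
```

For the Apple figures this works out to roughly 5.36 seconds on broadband and 40.40 seconds on a modem. Note how much of the broadband number is the per-object latency (22 x 0.2 = 4.4 seconds) rather than the transfer itself.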

The reports provided some very enlightening results:

| Web Site | Total Size (B) | # Objects | Broadband Speed (s) | Modem Speed (s) | Calculated Latency (s) | Images (B) | External JavaScripts (B) | External CSS (B) | Other External Objects (B) |
|---|---|---|---|---|---|---|---|---|---|
| Adobe | 484,440 | 99 | 22.37 | 116.35 | 19.80 | 233,618 | 227,271 | 23,551 | 0 |
| Apple | 180,611 | 22 | 5.36 | 40.40 | 4.40 | 47,189 | 104,589 | 24,691 | 0 |
| BBC News | 515,274 | 151 | 32.93 | 132.89 | 30.20 | 294,490 | 77,101 | 60,742 | 0 |
| CNN | 670,327 | 264 | 56.35 | 186.40 | 52.80 | 232,935 | 273,020 | 144,300 | 0 |
| Congruent Media | 259,133 | 41 | 9.57 | 59.84 | 8.20 | 36,583 | 169,967 | 16,568 | 23,803 |
| Google.com | 11,462 | 2 | 0.46 | 2.68 | 0.40 | 8,558 | 0 | 0 | 0 |
| Microsoft | 192,514 | 36 | 8.22 | 45.57 | 7.20 | 75,333 | 101,629 | 1,385 | 0 |
| MSDN | 198,635 | 46 | 10.25 | 48.79 | 9.20 | 41,992 | 136,449 | 4,359 | 0 |
| The New York Times | 416,965 | 72 | 16.61 | 97.50 | 14.40 | 231,995 | 69,218 | 4,717 | 0 |
| YouTube | 303,303 | 32 | 8.01 | 66.85 | 6.40 | 58,837 | 124,918 | 106,986 | 0 |
| Median | 281,218 | 44 | 9.91 | 63.35 | 8.70 | 67,085 | 114,754 | 20,060 | 0 |
| Average | 323,266 | 77 | 17.01 | 79.73 | 15.30 | 126,153 | 128,416 | 38,730 | 2,380 |
| Standard Error | 62,074 | 25 | 5.27 | 17.01 | 4.97 | 34,120 | 25,033 | 15,790 | 2,380 |

A quick look at the data shows that, according to this sampling, the average site uses approximately 123 KB of images, 125 KB of JavaScript files, and 38 KB of external CSS in order to render and make itself functional to the end user. In all, this average indicates a total size of roughly 316 KB and a broadband download time of 17.0 seconds!
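For anyone who wants to check the summary rows, here is a minimal sketch that recomputes them for the Total Size column. It assumes that "Standard Error" means the sample standard deviation divided by the square root of the number of sites, a definition that appears to match the table; the variable and function names are ours.

```python
import math
import statistics

# Total page size in bytes for each sampled site, copied from the table.
total_bytes = {
    "Adobe": 484_440,
    "Apple": 180_611,
    "BBC News": 515_274,
    "CNN": 670_327,
    "Congruent Media": 259_133,
    "Google.com": 11_462,
    "Microsoft": 192_514,
    "MSDN": 198_635,
    "The New York Times": 416_965,
    "YouTube": 303_303,
}


def standard_error(values):
    """Sample standard deviation divided by the square root of the sample size."""
    return statistics.stdev(values) / math.sqrt(len(values))


sizes = list(total_bytes.values())
mean_size = statistics.mean(sizes)
median_size = statistics.median(sizes)
se_size = standard_error(sizes)

print(f"median:  {median_size:,.0f} B")
print(f"average: {mean_size:,.0f} B ({mean_size / 1024:.0f} KB)")
print(f"std err: {se_size:,.0f} B")
```

With only ten sites and a spread this wide (Google.com at one end, CNN at the other), the median is arguably the better guide to a "typical" page than the average.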

However, by far the worst site in this set is CNN, with over 227 KB of images, 267 KB of external JavaScript, and 141 KB of external CSS - do the math and you'll find that there are over 635 KB of data in those 263 external items needed to completely render the page and make it function correctly! To repeat, that's 635 KB of data, or over half a MEGABYTE! Not far behind are the other news sites on the list, the BBC and the New York Times, along with the Adobe website. Those four are between 92 KB and 339 KB larger than the average.

A 2006 performance research study by the Yahoo! User Interface group found that, in general, a web browser only spends between 5% and 38% of the total download time downloading the actual HTML document, and the remaining 62% to 95% of the time retrieving everything else (that is, images, scripts, style sheets, embedded objects such as Flash movies, etc.). In particular, their observations of CNN revealed a 15-85 ratio of the HTML document to all of its included assets. Our own speed report from July 23rd, 2008, indicates that in just over a year and a half, this ratio has grown to almost 3-97, which is to say that 97% of the download is in the included assets in the form of JavaScript files, Cascading Style Sheets, and images!
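The 3-97 split for CNN falls straight out of the speed report figures. The sketch below uses bytes as a stand-in for download time, which is our simplifying assumption (the Yahoo! study measured time, not size), and the function name is hypothetical.

```python
def html_vs_assets(total_bytes, images, scripts, css, other=0):
    """Return (html_pct, assets_pct) as shares of the total page weight."""
    asset_bytes = images + scripts + css + other
    html_bytes = total_bytes - asset_bytes  # whatever is left is the HTML document
    html_pct = 100 * html_bytes / total_bytes
    return html_pct, 100 - html_pct


# CNN figures from our July 23rd, 2008 report.
html_pct, assets_pct = html_vs_assets(
    total_bytes=670_327, images=232_935, scripts=273_020, css=144_300
)
print(f"HTML {html_pct:.0f}% / assets {assets_pct:.0f}%")  # HTML 3% / assets 97%
```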

Illegal Play on the Field - Extreme Download Size!

Even with web browser file caching and compression between the server and the client, transferring upwards of 655 KB of total data takes a long time (as far as web speeds are concerned). From the Congruent Media office in Baltimore, MD, which enjoys a hefty 10 Mbps downstream connection, CNN.com took a noticeable time to download and render: approximately 10 seconds on the first try, give or take, and between 3 and 4 seconds on each reload of the home page thereafter, even after several visits during which the browser had presumably cached images, style sheets, and JavaScript files.

Just out of curiosity, the Congruent Media team viewed some of CNN.com's JavaScript source code using the Firebug plug-in for Firefox. The site uses both 3rd-party code libraries, including prototype and scriptaculous, and a hefty chunk of custom JavaScript, and it turns out that NONE OF THE FILES ARE COMPRESSED!
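To get a feel for what compression can buy, here is a small sketch using gzip from the Python standard library. It assumes that "compressed" is read as gzip-style HTTP compression rather than minification, and the JavaScript snippet is a made-up stand-in, not CNN's code. Because the sample is deliberately repetitive, the savings it reports will be far better than a real script file would see.

```python
import gzip

# Stand-in for a real script file: verbose, whitespace-heavy JavaScript.
sample_js = (
    "function updateHeadline(id, text) {\n"
    "    var element = document.getElementById(id);\n"
    "    element.innerHTML = text;\n"
    "}\n"
) * 200

raw = sample_js.encode("utf-8")
compressed = gzip.compress(raw)

savings_pct = 100 * (1 - len(compressed) / len(raw))
print(f"raw: {len(raw)} bytes, gzipped: {len(compressed)} bytes, saved {savings_pct:.0f}%")
```

Running the same comparison against your own script files is a quick way to see how many bytes are being sent over the wire unnecessarily.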

There is a common assumption that there's plenty of bandwidth available, companies have fast servers, and end-users have fast enough computers to handle downloading, rendering, and executing web pages in less time than it took to write this sentence. In fact, that may actually be true for many web sites out there today, and certainly true of the sites from a few years ago. But, as AJAX techniques become more widely implemented and embedded objects like Flash video become more prevalent, more and more pages will start to hit that upper limit. Users will begin to notice a distinct latency between when they hit a URL and when the page finishes rendering. This will become more and more marked as pages begin to rely heavily on JavaScript that must be executed by the client machine.

First Impressions Count - Why You Shouldn't Rely on Browser Caching

Browsers were designed to take some of these client-side issues into account by implementing multi-threaded, asynchronous object download and caching techniques, and many developers work under the implicit (if mistaken) assumption that browser caching will solve most of their load time problems. Caching certainly helps - the browser will load included assets (images, Flash players, other embeds) while the document is being rendered, and it will store as many of those objects as possible on the client machine for future use and a faster rendering experience - but it is a fallacy to assume that a user is going to wait around for the page to finish loading if it appears to be taking anything longer than a few seconds.

First impressions count! This is the Internet age, where users have come to expect things to happen close to the speed of light. If it takes more than a handful of seconds for a page to load (or at least give the appearance of being loaded), a web surfer is going to move on. And if that's a consumer who found your site in a set of search engine results, this means that they're going to go back and try the next link down, heading straight to your competition.

First impressions notwithstanding, developers cannot rely on browser caching to solve the problem of download times if the pages that they are working with are highly dynamic. If the page keeps changing, whether through something as simple as a marquee or rotating image or something as complicated as a news page where breaking stories always appear on top, the browser has little to cache, and full versions of the changed page components must be pulled down from the web server on every load. If the HTML has changed, then a fresh copy is required. New images must be pulled down. Other non-cacheable items include dynamically generated scripts and style sheets, segments of content created on-the-fly, and entire web page sections that are produced after-the-fact using AJAX techniques.

In fact, a study performed at the end of 2006 by the Yahoo! User Interface team and posted on the YUI Blog revealed that approximately 20% of all page views (at least of Yahoo!'s home page) came from a browser that had an empty cache! This means that 1 out of every 5 web requests resulted in the full download of all page assets. This clearly indicates that a developer cannot rely on the browser cache alone to "speed things up" and further underscores the importance of client-side optimization techniques.
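A back-of-the-envelope sketch shows why that 20% matters. The full-page size is CNN's total from our speed report; the primed-cache size is a purely hypothetical number chosen for illustration, and the function name is ours.

```python
def expected_bytes_per_view(full_bytes, primed_bytes, empty_cache_rate):
    """Average bytes transferred per page view across empty- and primed-cache visits."""
    return empty_cache_rate * full_bytes + (1 - empty_cache_rate) * primed_bytes


FULL_PAGE_BYTES = 670_327     # CNN total from our speed report
PRIMED_PAGE_BYTES = 100_000   # hypothetical: bytes still fetched with a warm cache
EMPTY_CACHE_RATE = 0.20       # roughly 1 in 5 page views, per the YUI study

average = expected_bytes_per_view(FULL_PAGE_BYTES, PRIMED_PAGE_BYTES, EMPTY_CACHE_RATE)
print(f"average bytes per page view: {average:,.0f}")
```

Under these assumptions, the empty-cache visits account for well over half of all bytes transferred even though they are only a fifth of the page views, so shrinking the full download pays off no matter how well caching works.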

Web development has come full circle. The lessons that designers learned a decade ago - that images and HTML MUST be optimized for a speedy download and reasonable rendering time - have come back around again in 2008. As Rich Internet Application development matures, programmers must take stock of their code base and consider optimization techniques for both their server and client-side scripts, CSS, and multi-media.

Part 2 of this series explores various optimization techniques and tools available to facilitate that process.
