Are Long URLs Wasting Bandwidth? 379
Ryan McAdams writes "Popular websites, such as Facebook, are wasting as much as 75 Mbit/sec of bandwidth due to excessively long URLs. According to a recent article over at O3 Magazine, they took a typical Facebook home page, pulled traffic statistics from compete.com, and worked out the bandwidth savings if Facebook switched from URL paths that in some cases run over 150 characters to shorter ones. The article also looks at the impact on service providers from the wasted bandwidth consumed by the subsequent GET requests for these excessively long URLs. Facebook is just one example; many other sites have similar problems, as do CMS products such as WordPress. It's an interesting approach to web optimization for high-traffic sites."
Wordpress has the option (Score:5, Informative)
For SEO purposes it's always handy to switch to the more popular example: http://www.mysite.com/2009/03/my-title-of-my-post.html [mysite.com].
Suggesting that we cut URLs that help Google rank our pages higher is preposterous.
Re:Can they not use... (Score:0, Informative)
Sure they can
TinyURL [tinyurl.com]
Better way of doing it (Score:5, Informative)
Wordpress? (Score:4, Informative)
By default, WordPress produces short URLs.
Most likely insignificant (Score:4, Informative)
This is ridiculous. If I have a billion dollars, I'm not going to worry about saving 50 cents on a cup of coffee. The bandwidth used by these urls is probably completely insignificant.
Wow. Just wow. (Score:4, Informative)
75 whole freaking megabits? WOWSERS!!!!
They must be doing gigabits for images, then. Complaining about the URLs is like complaining about the 2 watts your wall-wart uses when idle, all the while running a 2 kW air conditioner.
Re:Mental Masturbation (Score:3, Informative)
it's actually not even 0.15 KB, it's 0.146 KB >;)
and 100 million hits, 1 KB saved each = 95.36 GB saved.
You mixed up marketing kilos and gigas with the in-use computer ones. 1 KB !== 1000 bytes, 1 KB === 1024 bytes :)
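The parent's arithmetic checks out if you use binary units throughout; a quick sketch:

```python
# 100 million hits, 1 KB (= 1024 bytes) saved per hit,
# reported in binary gigabytes (2**30 bytes)
hits = 100_000_000
saved_bytes = hits * 1024
saved_gb = saved_bytes / 2**30
print(round(saved_gb, 2))  # 95.37 -- the parent's 95.36 is the same figure, truncated
```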
Re:Depending on your viewpoint (Score:4, Informative)
I've always found stories along the lines of "$ENTITY wastes $X amount of $RESOURCE per year" dubious. Given enough users who each use a piece of $RESOURCE, the total amount of used resources will always be large no matter how little each individual user uses. There's no way to win.
Re:Can they not use... (Score:2, Informative)
That's not compression, that's hashing.
Re:Waste of effort (Score:5, Informative)
This very type of analysis is what YSlow [yahoo.com] is for :)
Re:Can they not use... (Score:5, Informative)
Most of the time, yes, but then there's a question of trade-off. Small URLs are generally hashes and are hard to type accurately and hard to remember. On the other hand, if you took ALL of the sources of wastage in bandwidth, what percentage would you save by compressing pages vs. compressing pages + URLs or just compressing URLs?
It might well be the case that these big web services are so inefficient with bandwidth that there are many things they could do to improve matters. In fact, I consider that quite likely. Those times I've done web admin stuff, I've rarely come across servers that have compression enabled.
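For instance, just turning on gzip makes long-URL overhead mostly moot, since repetitive URL paths compress extremely well. A quick sketch (the HTML and the link parameters are made up, roughly mimicking the article's description of home.php):

```python
import gzip

# Build a made-up page with 250 long, repetitive href references.
link = '<a href="/profile.php?id=123456789&ref=nf_home&view=stream_stories">x</a>\n'
page = "<html><body>" + link * 250 + "</body></html>"

raw = page.encode()
compressed = gzip.compress(raw)

# Repetitive URLs compress to a small fraction of their raw size.
print(len(raw), len(compressed))
```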
Re:Better way of doing it (Score:3, Informative)
Of course, it's a totally different paradigm that requires a database instead of XML for the page metadata. But the ability it gives you to relate the sections of the site to one another and to the pages is an increase in functionality and speed over conventional methodologies.
Re:Can they not use... (Score:5, Informative)
Using a cookie, TinyURL allows you to enable previews [tinyurl.com], i.e., view where a TinyURL points to before following the link.
Re:I can top that. Try the Globe and Mail! (Score:2, Informative)
Have you tried compiling the whitespace [dur.ac.uk] =)
Re:Irrelevant (Score:4, Informative)
You missed the previous paragraph of the article, where they explained where they got the 20k value; perhaps you should read the article first. :)
They rounded down the number of references, but on an average Facebook home.php file there are 250+ HREF or SRC references in excess of 120 characters, each of which they estimated could be shaved by 80 bytes. That's 80 bytes x 250 references = 20,000 bytes, or 20k.
Your math is wrong; it's taking into account just one URL, when there are 250 references on home.php alone! They did not even factor in more than one page view per visit. If they did it your way, you would be looking at far more bandwidth utilization than 75 Mbit/sec.
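As a sanity check on the article's figure (the page-view rate below is inferred for illustration, not taken from compete.com's actual data): at 20,000 bytes saved per page view, roughly 470 views per second gets you to about 75 Mbit/sec.

```python
# 250 references trimmed by 80 bytes each, per the article
bytes_saved_per_view = 250 * 80          # 20,000 bytes
views_per_second = 470                   # assumed rate, for illustration
savings_bits_per_sec = bytes_saved_per_view * 8 * views_per_second
print(savings_bits_per_sec / 1e6)        # ~75.2 Mbit/sec
```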
Re:Can they not use... (Score:5, Informative)
And even with the wink, this still got initially moderated "Interesting" instead of "Funny".... *sigh*
To clarify the joke for those who don't "GET" it: in HTTP, POST requests are either encoded the same way as GET requests (with some extra bytes) or using MIME encoding. If you use a GET request, the number of bytes sent should differ by... the extra byte in the word "POST" versus "GET", plus two extra CR/LF pairs and a CR/LF-terminated Content-Length header, IIRC.... And if you use MIME encoding for the POST content, the size of the data can balloon by orders of magnitude unless you are dealing with large binary data objects like a JPEG upload or something similar.
So basically, a POST request just hides the URL-encoded data from the user but sends almost exactly the same data over the wire.
Re:Can they not use... (Score:5, Informative)
You're missing the joke... GET requests look like this (parameters made up for illustration):

GET /home.php?id=123&view=friends HTTP/1.1
Host: www.example.com

POST requests look like this:

POST /home.php HTTP/1.1
Host: www.example.com
Content-Length: 19

id=123&view=friends
Same amount of content... URL looks shorter, but the exact same data as the querystring gets sent inside the request body. Thus, switching from GET to POST does not alter the bandwidth usage at all, even if it makes the URL seen in the browser look shorter.
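To put numbers on it, here is a small sketch that builds both raw requests by hand (the path, host, and parameters are hypothetical) and compares their on-the-wire sizes:

```python
# Build equivalent GET and POST requests and compare sizes.
query = "id=123&view=friends&ref=home"

get_request = (
    f"GET /home.php?{query} HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "\r\n"
)

post_request = (
    "POST /home.php HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    f"Content-Length: {len(query)}\r\n"
    "\r\n" +
    query
)

# The query string still travels in the POST body, plus extra headers,
# so the POST is actually slightly larger on the wire.
print(len(get_request), len(post_request))
```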
Re:Better way of doing it (Score:3, Informative)
This is a very, very simple method; you seem to want to make it out to be the best thing in the world. The problem is, the URL still needs some form of descriptive component.
In my own little personal CMS/framework I do it similarly, except with a 1-16 character string, so I can attach some kind of description.
It's really very easy to do. Basically you need a table with (id, parentid, page_title, page_content). parentid is the id of the parent section; leave it NULL for the top level. This way you can look the page up in the DB with a simple SELECT * FROM `pages` WHERE `id` = 'aboutus' LIMIT 1.
You can use parentid to form a breadcrumb trail back to the section a page relates to... this maintains hierarchy. An easier way is to do a select all and hold it in an array (or hash table, depending on your language) to make it really speedy. Hell, skip the DB and store it in a file as a serialized string. (For VERY low-traffic sites, anyway.)
This method also works VERY well with URL rewriting.
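A minimal sketch of the scheme described above, using SQLite; the table and column names follow the comment, while the sample rows and the breadcrumb helper are made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE pages (
    id TEXT PRIMARY KEY,
    parentid TEXT,          -- NULL for top-level sections
    page_title TEXT,
    page_content TEXT)""")
conn.executemany(
    "INSERT INTO pages VALUES (?, ?, ?, ?)",
    [("company", None,      "Company",  "..."),
     ("aboutus", "company", "About Us", "..."),
     ("team",    "aboutus", "The Team", "...")])

def breadcrumb(page_id):
    """Walk parentid links back up to the top level."""
    trail = []
    while page_id is not None:
        row = conn.execute(
            "SELECT parentid, page_title FROM pages WHERE id = ? LIMIT 1",
            (page_id,)).fetchone()
        if row is None:
            break
        page_id, title = row
        trail.append(title)
    return list(reversed(trail))

print(" > ".join(breadcrumb("team")))  # Company > About Us > The Team
```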