Are Long URLs Wasting Bandwidth? 379

Ryan McAdams writes "Popular websites, such as Facebook, are wasting as much as 75Mbit/sec of bandwidth due to excessively long URLs. According to a recent article over at O3 Magazine, they took a typical Facebook home page, pulled the traffic statistics from compete.com, and worked out the bandwidth savings if Facebook switched from URL paths that in some cases run over 150 characters to shorter ones. The article also looks at the impact on service providers from the wasted bandwidth of the GET requests for these excessively long URLs. Facebook is just one example; many other sites have similar problems, as do CMS products such as WordPress. It's an interesting approach to web optimization for high-traffic sites."
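For illustration, here is a back-of-envelope sketch of the kind of estimate the article describes. The link count, characters saved per URL, and pageview rate below are assumed values chosen to land near the quoted 75Mbit/sec figure; they are not numbers taken from O3 Magazine:

    # Rough estimate of site-wide bandwidth attributable to long URLs in page HTML.
    # All inputs are illustrative assumptions, not figures from the article.
    LINKS_PER_PAGE  = 150   # assumed links on a typical Facebook home page
    CHARS_SAVED     = 100   # assumed characters trimmed from each URL
    PAGEVIEWS_PER_S = 625   # assumed site-wide pageviews per second

    bytes_per_page = LINKS_PER_PAGE * CHARS_SAVED            # ~15 KB of HTML per view
    mbit_per_sec   = bytes_per_page * PAGEVIEWS_PER_S * 8 / 1_000_000

    print(f"~{bytes_per_page / 1000:.0f} KB saved per page, ~{mbit_per_sec:.0f} Mbit/s site-wide")
    # -> ~15 KB saved per page, ~75 Mbit/s site-wide

Whether that is worth chasing, relative to everything else on the page, is exactly what the comments below argue about.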
  • by teeloo ( 766817 ) on Friday March 27, 2009 @06:05PM (#27364267)
    compression to shorten the URLs?
  • by markov_chain ( 202465 ) on Friday March 27, 2009 @06:09PM (#27364331)

    The short Facebook URLs waste bandwidth too ;)

  • Irrelevant (Score:5, Insightful)

    by Skal Tura ( 595728 ) on Friday March 27, 2009 @06:09PM (#27364345) Homepage

    It's an irrelevantly small portion of the traffic. At Facebook's scale it could save some traffic, but it doesn't make enough impact on the bottom line to be worth the effort!

    A 150-character-long URL = 150 bytes VS 50 KILObytes of page text + the images for the rest of the pageview....

    That 50 kilobytes for the full page text is off the top of my head, but a pageview often runs at over 100KB.

    So it's totally irrelevant if they can shave a whopping 150 bytes off the 100KB.

  • by JWSmythe ( 446288 ) * <jwsmythe@@@jwsmythe...com> on Friday March 27, 2009 @06:12PM (#27364383) Homepage Journal

        This is a stupid exercise. Oh my gosh, there's an extra few characters wasted. They're talking about 150 characters, which would be 150 bytes, or (gasp) 0.150KB.

        10 times the bandwidth could be saved by removing a 1.5KB image from the destination page, or doing a little added compression to the rest of the images. The same can be said for sending out the page itself gzipped.

        We did this exercise at my old work. We had relatively small pages. 10 pictures per page, roughly 300x300, a logo, and a very few layout images. We saved a fortune in bandwidth by compressing the pictures just a very little bit more. Not a lot. Just enough to make a difference.

        Consider taking 100,000,000 hits in a day. Bringing a 15KB image to 14KB would be .... wait for it .... 100GB per day saved in transfers.

        The same can be said for conserving the size of the page itself. Badly written pages (and oh are there a lot of them out there) not only take up more bandwidth because they have a lot of crap code in them, but they also tend to take longer to render.

        I took one huge badly written page, stripped out the crap content (like, do you need a font tag on every word?), cleaned up the table structure (this was pre-CSS), and the page loaded much faster. That wasn't just the bandwidth savings, that was a lot of overhead on the browser where it didn't have to parse all the extra crap in it.

        I know they're talking about the inbound bandwidth (relative to the server), which is usually less than 10% of the traffic. Most of the bandwidth goes out the outbound side, and that's all anyone really cares about. Server farms only look at outbound bandwidth, because that's always the higher number and the driving factor of their 95th percentile. Home users all care about their download bandwidth, because that's what sucks up the most for them. Well, unless they're running P2P software. I know I was a rare (but not unique) exception, since I was frequently sending original graphics in huge formats, and ISOs, to and from work.
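The "just gzip it" point above is easy to demonstrate. A rough, self-contained sketch: build a mock page stuffed with long, repeated link URLs and compare what gzip saves against the ~150 bytes a shorter request URL would save (the page contents and URL shape are invented for illustration):

    # Compress a synthetic page full of long link URLs and compare the savings
    # with the ~150 bytes a shorter request URL would save.  Numbers are made up.
    import gzip

    long_url = "/profile.php?" + "&".join(f"param{i}=somelongvalue{i}" for i in range(8))
    page = "<html><body>" + "".join(
        f'<a href="{long_url}&id={n}">link {n}</a>\n' for n in range(150)
    ) + "</body></html>"

    raw = page.encode()
    packed = gzip.compress(raw)

    print(f"uncompressed page : {len(raw):6d} bytes")
    print(f"gzipped page      : {len(packed):6d} bytes")
    print(f"gzip saves        : {len(raw) - len(packed):6d} bytes")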

  • by RobertB-DC ( 622190 ) * on Friday March 27, 2009 @06:12PM (#27364407) Homepage Journal

    Seriously. Long URLs as wasters of bandwidth? There's a Flash animation ad running at the moment (unless you're an ad-blocking anti-capitalist), and I would expect it uses as much bandwidth when I move my mouse past it as a hundred long URLs.

    I'm not apologizing for bandwidth hogs... back in the dialup days (which are still in effect in many situations), I was a proud "member" of the Bandwidth Conservation Society [blackpearlcomputing.com], dutifully reducing my .jpgs instead of just changing the Height/Width tags. My "Wallpaper Heaven" website (RIP) pushed small tiling backgrounds over massive multi-megabyte images. But even then, I don't think a 150-character URL would have appeared on their threat radar.

    It's a drop in the bucket. There are plenty of things wrong with 150-character URLs, but bandwidth usage isn't one of them.

  • Re:Waste of effort (Score:5, Insightful)

    by JCY2K ( 852841 ) on Friday March 27, 2009 @06:13PM (#27364413)

    Of all the things that could be optimized, URLs shouldn't have a high priority (unless you want people to enter them manually). I'm pretty sure their HTML, CSS, and JavaScript could be optimized way more than just their URLs. But rather than simple sites, people often want them to be filled with crap (which nobody but themselves cares about).

    ps, that doesn't mean you should try to create "nice" URLs instead of incomprehensible URLs that contain things like article.pl?sid=09/03/27/2017250

    To your ps, most of that is easily comprehensible. It was an article that ran today; only the 2017250 is meaningless in itself. Perhaps article.pl?sid=09/03/27/Muerte/WasteOfEffort would be better, but we're trying to shorten things up.

  • Re:Who knows? (Score:5, Insightful)

    by phantomfive ( 622387 ) on Friday March 27, 2009 @06:20PM (#27364535) Journal
    Seriously. No one better tell him about the padding in the IP packet header. A whole four bits is wasted in every packet that gets sent. More if it's fragmented. Or what about the fact that http headers are in PLAIN TEXT? Talk about a waste of bandwidth.

    In reality, I think that by watching one YouTube movie you've used more bandwidth than you will on Facebook URLs in a year.
  • by Anonymous Coward on Friday March 27, 2009 @06:24PM (#27364575)
    Yeah, exactly.
    And since I've read somewhere that WordPress isn't the best CMS for a high-traffic site, it doesn't really matter too much.
  • by rbrome ( 175029 ) on Friday March 27, 2009 @06:43PM (#27364841) Homepage

    I hope this is obvious to most people here, but reading some comments, I'm not sure, so...

    The issue is that a typical Facebook page has 150 links on it. If you can shorten *each* of those URLs in the HTML by 100 characters, that's almost 15KB you knocked off the size of that one page. Not huge, but add that up over a visit, and for each visit, and it really does add up.

    I've been paying very close attention to URL length on all of my sites for years, for just this reason.

  • by thoglette ( 74419 ) on Friday March 27, 2009 @06:48PM (#27364921)

    Badly written pages (and oh are there a lot of them out there) not only take up more bandwidth because they have a lot of crap code in them, but they also tend to take longer to render.

    eBay has "upgraded" their local site http://my.ebay.com.au/ and "my ebay" is now a 1MB download. That's ONE MILLION BYTES to show about 7K of text and about 20 x 2KB thumbnails.

    The best bit is that the HTML file itself is over half a megabyte. Then there are two 150K+ JS files and a 150K+ CSS file.

    Web "designers" should be forced to develop on a 128MB P3 machine with a VGA screen and a dial-up modem.

  • No (Score:5, Insightful)

    by kpang ( 860416 ) on Friday March 27, 2009 @06:52PM (#27364969) Homepage
    Are Long URLs Wasting Bandwidth?

    No. But this article is.
  • by hrbrmstr ( 324215 ) * on Friday March 27, 2009 @06:53PM (#27364981) Homepage Journal

    Isn't Facebook itself the huge waste of bandwidth as opposed to just the verbose URLs it generates?

  • by XanC ( 644172 ) on Friday March 27, 2009 @06:54PM (#27364999)

    You can't ever rely on the HTTP-Referer header to be there. Much of the time, it isn't; either the user has disabled it in his browser, or some Internet security suite strips it, or something. I'm amazed at the number of sites that use it for _authentication_!

  • by coryking ( 104614 ) * on Friday March 27, 2009 @07:48PM (#27365647) Homepage Journal

    ...except they aren't using mod_gzip/deflate. At first I thought you browsed the web RMS style [lwn.net] and maybe wget* didn't support compression** and you were just getting what you deserved***, but then I checked in Firefox and lo and behold:


    Response Headers - http://www.theglobeandmail.com/blogs/wgtgameblog0301/ [theglobeandmail.com]

    Date: Fri, 27 Mar 2009 23:39:54 GMT
    Server: Apache
    P3P: policyref="http://www.theglobeandmail.com/w3c/p3p.xml", CP="CAO DSP COR CURa ADMa DEVa TAIa PSAa PSDa CONi OUR NOR IND PHY ONL UNI COM NAV INT DEM STA PRE"
    Keep-Alive: timeout=15, max=100
    Connection: Keep-Alive
    Transfer-Encoding: chunked
    Content-Type: text/html

    200 OK
    No compression!

    If they had been using compression, it would have made all that whitespace fairly negligible.

    Probably a result of how their template system stitches everything together. Still, that is pretty bad. There is no excuse to run a webserver and not turn on compression. It is the single biggest way to boost page-load and decrease bandwidth.

    * wget 4 lyfe!
    ** compression is probably evil and Anti-Freedom(tm) somehow, kinda like images are evil or fads like "graphical user interfaces" are evil. In other words, anything that makes things easier or faster for a user is basically evil and Anti-Freedom****.
    *** braindead comment spamming bots are the only thing not using compression (except RMS, probably)
    **** I'll leave it to you, dear reader, to deduce if I'm serious. Hint: no hint.
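Checking for compression the way this poster did in Firefox can also be scripted. A small sketch using only Python's standard library; the target URL is just the one from the comment, and any page works:

    # Ask for a gzipped response and report what Content-Encoding comes back.
    import urllib.request

    url = "http://www.theglobeandmail.com/"
    req = urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})
    with urllib.request.urlopen(req, timeout=10) as resp:
        encoding = resp.headers.get("Content-Encoding", "none")
        print(f"{url} -> HTTP {resp.status}, Content-Encoding: {encoding}")
    # "gzip" means compression is on; "none" means the server sent it raw.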

  • by Nutria ( 679911 ) on Friday March 27, 2009 @07:50PM (#27365675)

    Sure they can TinyURL

    No, because the long URL is still out there. For example: http://tinyurl.com/c9fjov [tinyurl.com] translates into http://www.nerve.com/CS/blogs/scanner/2008/11/16-22/pervert.jpg [nerve.com].

  • by Bill Dimm ( 463823 ) on Friday March 27, 2009 @07:57PM (#27365777) Homepage

    The problem isn't bandwidth, it is that long URLs are a pain from a usability standpoint. They cause problems in any context where they are spelled out in plain text (instead of being hidden behind a link). For example, they often get broken in two when sent in plain-text email. When posting a URL to a simple forum that only accepts text (no markup), a long URL can blow out the width of the page.

    Where does this problem come from? It comes from SEO. Website operators realized that Google and other search engines were taking URLs into account, so CMSs and websites switched from using simple URLs (like a numeric document ID) to stuffing whole article titles into the URL to try to boost search rankings. One of the results of this is that when someone finds a typo in an article title and fixes it, the CMS either creates a duplicate page with a slightly different URL, or the URL with the typo ends up giving a 404 error and breaks any links that point to it.

    What I don't understand is why search engines bother to look at anything beyond the domain name when determining how to rank search results. How often do you see anything useful in the URL that isn't also in the <title> tag or in a <h1> tag? If search engines would stop using URLs as a factor in ranking pages, people would use URLs that were efficient and useful instead of filling them with junk. The whole thing reminds me of <meta> keyword tags -- to the extent that users don't often look at URLs while search engines do, website operators have an opportunity to manipulate the search engines by stuffing them with junk.

  • Plus (Score:3, Insightful)

    by coryking ( 104614 ) * on Friday March 27, 2009 @07:58PM (#27365787) Homepage Journal

    The HTTP-Referer isn't designed for ?ref=somesource

    Your stat software wants to know if more people click to your page through the logo ?ref=mylogo or through a link in the story ?ref=story. The Referer can't give you that info.

    The HTTP-Referer is also no good for aggregation. It only gives you a URL. If you didn't append something like ?campaign=longurl, it would be almost impossible to track things like ad campaigns.

    HTTP-Referers *are* good for dealing with MySpace image leeches. If you haven't, I suggest you read through your log files right now--I bet you'll find 20% of your traffic is MySpace idiots leeching your images. Redirect those guys to something more... tasteful, and enjoy the bandwidth savings.
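That tracking need is a big reason URLs grow extra parameters in the first place. A minimal sketch of the aggregation described above, using only the standard library; the paths and ?ref= values are invented examples:

    # Count which on-page link (?ref=...) drove each click -- the kind of
    # breakdown the Referer header alone can't provide.  Log data is made up.
    from collections import Counter
    from urllib.parse import urlsplit, parse_qs

    requested_urls = [
        "/story/12345?ref=mylogo",
        "/story/12345?ref=story",
        "/story/12345?ref=story",
        "/story/12345",              # no tracking parameter: source unknown
    ]

    clicks = Counter()
    for url in requested_urls:
        params = parse_qs(urlsplit(url).query)
        clicks[params.get("ref", ["(none)"])[0]] += 1

    for source, count in clicks.most_common():
        print(f"{source:8s} {count}")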

  • by dgatwood ( 11270 ) on Friday March 27, 2009 @08:14PM (#27365951) Homepage Journal

    Depending on your network type, you may not get any benefit from shorter URLs at all. Many networking protocols use fixed-size frames, which get padded with zeroes up to the end of the frame. For example, ATM cells carry a fixed 48-byte payload, so depending on where that URL falls relative to a cell boundary, it could take shortening the URL by a full 48 bytes to avoid sending even one extra cell.

    Either way, this is like complaining about a $2 budget overrun on a $2 billion project. Compared with the benefits of compressing the text content, moving all your scripts into separate files so they can be cached (Facebook sends over 4k of inline JavaScript with every page load for certain pages), generating content dynamically in the browser based on high density XML without all the formatting (except for the front page, Facebook appears to be predominantly server-generated HTML), removing every trace of inline styles (Facebook has plenty), reducing the number of style sheet links to a handful (instead of twenty), etc., the length of URLs is a trivial drop in the bucket.
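A tiny sketch of that framing point: with a fixed 48-byte cell payload, trimming a request only helps once it crosses a cell boundary (the request sizes below are made-up examples):

    # Count how many 48-byte ATM cell payloads a request of a given size needs.
    from math import ceil

    CELL_PAYLOAD = 48  # bytes of payload per ATM cell

    def cells_needed(request_bytes: int) -> int:
        return ceil(request_bytes / CELL_PAYLOAD)

    for size in (400, 390, 385, 384):  # e.g. a GET request shrinking as its URL shortens
        print(f"{size}-byte request -> {cells_needed(size)} cells")
    # 400, 390, and 385 bytes all need 9 cells; only at 384 does it drop to 8.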

  • by Sparr0 ( 451780 ) <sparr0@gmail.com> on Friday March 27, 2009 @10:50PM (#27367191) Homepage Journal

    How do you bookmark a specific lower level page if no variables are stored in the URL?

  • by mattwarden ( 699984 ) on Friday March 27, 2009 @11:57PM (#27367583)

    You mean that ?area=51 crap? How is http://mysite.com/?area=51 [mysite.com] usable?

    (Unless the page is about government conspiracies, I guess.)

  • by cgenman ( 325138 ) on Saturday March 28, 2009 @01:42AM (#27368075) Homepage

    Do we know what 75Mbit/s is as a percentage of total site traffic? It seems like if that number is 1% or less, there would be more important areas to optimize. A little slack can be more valuable than bandwidth in a complex system.

  • absurd (Score:2, Insightful)

    by rbunker ( 1003580 ) on Saturday March 28, 2009 @01:57AM (#27368139)
    This is silly. The URLs, even "long" ones, are minuscule compared to the pictures, streaming video, music, JavaScript, etc. on these pages. To worry about them is like worrying about lint on a suit of clothes making it too hot. This is just absurd.
