Google Businesses The Internet Privacy

Security Fears Over Google Accelerator 355

Espectr0 writes "A software tool launched by Google on Wednesday that speeds up the process of downloading Web sites (covered recently on Slashdot) has caused some users to worry about their privacy. A ZDNet article discusses problems that users have been experiencing with the information that is cached by the software. On a Google Labs discussion group, one user said that 'I went to the Futuremark forums and noticed that I'm logged in as someone I don't know...'" Commentary also available on Signal vs. Noise and BlogNewsChannel.
This discussion has been archived. No new comments can be posted.

  • by xmas2003 ( 739875 ) * on Friday May 06, 2005 @01:33PM (#12453595) Homepage
  • by ShaniaTwain ( 197446 ) on Friday May 06, 2005 @01:34PM (#12453611) Homepage
    'I went to the Futuremark forums and noticed that I'm logged in as someone I don't know...'

    That's not a bug, it's a feature.
    • by Silverlancer ( 786390 ) on Friday May 06, 2005 @02:24PM (#12454436)
      That's a common proxy bug, actually, and the person who we all "appear" to be logged in as at Futuremark is St34lthW4rrior, a guy I actually know. No, we aren't actually logged in as him. It's simply how the page is cached, and as our school proxy causes this problem basically every day, I'm used to it. Just disable it for dynamic pages such as forums.
  • How does caching your cookies to the internet help speed up your local browsing?
    • by Enigma_Man ( 756516 ) on Friday May 06, 2005 @01:37PM (#12453658) Homepage
      It doesn't just cache your cookies, it acts as a proxy that compresses the data as you browse, much like the ISPs that offer "high speed" compressed modem surfing.
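
As a rough illustration of the compression half of that description, here is a minimal Python sketch. The page body is invented for the example, and nothing here reflects how Google's proxy actually encodes traffic; it only shows how well repetitive markup shrinks under gzip.

```python
import gzip


def compression_ratio(body: bytes) -> float:
    """Return compressed size divided by original size (smaller is better)."""
    return len(gzip.compress(body)) / len(body)


# Hypothetical page body: repetitive markup, as forum pages tend to be.
page = (
    b"<html><body>"
    + b"<div class='post'>Hello from the forum</div>" * 200
    + b"</body></html>"
)
ratio = compression_ratio(page)
print(f"compressed to {ratio:.1%} of the original size")
```

Real pages are less repetitive than this, so the savings would be smaller, but the principle is the same one the "high speed" dial-up services rely on.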

    • It helps because the site you are browsing will require your cookie to display correctly.

      What I *think* might have happened to the user in the above article is that the site used the IP address, not a cookie, to identify the user. Thus there was no cookie being misplaced; rather, the site assumed Google's IP belonged to the same user.
    • by SuperBanana ( 662181 ) on Friday May 06, 2005 @01:54PM (#12453981)
      How does caching your cookies to the internet help speed up your local browsing?

      Who said it was a cookie that was cached, and not the page content? Much of the discussion thus far seems to be based on an anonymous quote in a ZDNet article. As far as I can tell, the guy saw "Welcome back, Bob!" and freaked, when he wasn't -actually- logged in as Bob. Furthermore, who says it isn't Futuremark (or their forum software, because we all know how security-conscious PHP/MySQL forum software is) tagging their pages as cacheable when they shouldn't be? If Google is ignoring "don't cache this page", then yes, we have a problem, but the ZDNet story is of a technical level I'd expect of a community newspaper, so it's kind of hard to tell. It's like reading a story in your city newspaper that says "somebody killed by a cop!" and going off on a rant about police brutality...only to find out later the guy was a bank robber with an Uzi.

      Before you get all excited about bank sites etc- keep in mind those often use very unique URLs for each page and other tricks.
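
The "Welcome back, Bob!" scenario is easy to reproduce in miniature. The Python sketch below is a toy, not anyone's real proxy: `NaiveSharedCache` and `forum_origin` are hypothetical names, and the cache is deliberately broken in that it keys on the URL alone.

```python
class NaiveSharedCache:
    """A deliberately broken cache: keyed by URL only, ignoring cookies and headers."""

    def __init__(self, origin):
        self.origin = origin
        self.pages = {}

    def fetch(self, url, cookie):
        # The first visitor's personalized page is stored and replayed to everyone.
        if url not in self.pages:
            self.pages[url] = self.origin(url, cookie)
        return self.pages[url]


def forum_origin(url, cookie):
    """Hypothetical forum that personalizes the page from the session cookie."""
    return f"Welcome back, {cookie}!"


cache = NaiveSharedCache(forum_origin)
first = cache.fetch("/forum/index", cookie="Bob")
second = cache.fetch("/forum/index", cookie="Alice")
print(first)   # Bob's page
print(second)  # Alice is shown Bob's page, yet her cookie was never changed
```

Alice sees Bob's greeting but is not logged in as Bob: her own cookie is untouched, so any action she takes is still performed as Alice.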

  • Aaaaaaaah! (Score:4, Funny)

    by Anonymous Coward on Friday May 06, 2005 @01:34PM (#12453620)
    It's true, it's true! People are logging on to this account and acting like me on /., but it really isn't me! Impostors!
  • by Future Linux-Guru ( 34181 ) on Friday May 06, 2005 @01:36PM (#12453641)

    You'll get better results filing a report with Google as opposed to complaining on /.

    As for me, I used the 3.7 minutes I've saved so far to spend some quality time with my friends.
    • Are their names Louise and Rosey?
    • That's utter crap. Google continually uses the "beta" moniker for their projects in order to escape criticism, or so 'twould seem.

      At least no one is having to pay for the privilege of beta testing Google's software.
    • by Chicane-UK ( 455253 ) * <> on Friday May 06, 2005 @01:48PM (#12453866) Homepage
      As for me, I used the 3.7 minutes I've saved so far to spend some quality time with my friends.

      Rosie Palm and her 5 sisters? ;)
    • Beta or not, there are security concerns. Should such significant issues simply be ignored just because it's not production-level? It's something people should be aware of...
      • by arkanes ( 521690 ) <> on Friday May 06, 2005 @01:57PM (#12454015) Homepage
        I think a more obvious answer here is that GWA is exposing web security bugs on a wide variety of applications. It's worth noting that if GWA can compromise your security, then it can be done intentionally as well. Which is not to say that caching issues should be ignored, or that there may not be a real problem with users getting some other users cookies. But if GWA can seriously affect your website, then instead of bitching that GWA is breaking your website like SomethingAwful did, you need to realize that your security was already flawed and you need to fix it.
    • So anything with the beta tag is magically allowed to do anything it wants? Let's say I send my credit card number to someone on Gmail or type my SSN into the web accelerator, and Google accidentally starts returning them as search results? "Oh, it's just beta, dummy!" you can say, but Google is giving these things wide release, sometimes with press releases...the beta tag only seems to exempt them from liability. That sounds at least semi-evil to me. Maybe all companies can start calling all their products beta.
    • by Sialagogue ( 246874 ) <> on Friday May 06, 2005 @02:08PM (#12454182)

      How long has Google Groups been labelled Beta now, two years maybe? How many users does it have?

      If a wide number of even adventurous, risk-taking users could be exposed to a potentially significant security hole, then word should get out more widely than just Google's "thanks for the feedback" e-mail addresses.

      Beta is not the Greek word for "without responsibility." As much as we criticize Microsoft for making the idea of a "release date" (or "security") meaningless, I think Google's well on its way to making the idea of the "Beta Release" meaningless.

      They act like a small, groovy coding lab with Beta releases and all, but seemingly don't recognize that because of their prominence in consumers' minds, *anything* they do has widespread impact on ordinary Net consumers. So a true, uncontrolled Beta release? That's fine for me when I just coded a little midi tool and want to run it past my friends, but there's really no such thing when you're Google.

      I think that the number of users that adopt even their least publicized tools takes them out of the realm of the real intent of a Beta release, especially when security issues are involved.

      • by ajs ( 35943 ) <ajs.ajs@com> on Friday May 06, 2005 @03:55PM (#12455895) Homepage Journal
        "How long has Google Groups been labelled Beta now, two years maybe? How many users does it have?"

        So you would have them move it out of beta sooner? Not beta it? What's the solution you're proposing?

        Are you saying that software that Google issues in beta should be bug free, or are you suggesting that Google, being a search engine and all, should be scraping all of the Web's most popular forums as their bug reporting mechanism?

        I'm really not sure what you're proposing, here.
        • by nmk ( 781777 ) on Friday May 06, 2005 @05:11PM (#12456980)
          I think he's probably proposing that they should stop acting like pussies and start taking some responsibility for their software. Like he said Google has turned the very concept of the Beta into a joke. If MS was to keep a major piece of software in Beta for three or four years (as does Google), they would be accused of incompetence. I think the same should apply to Google.
  • Links.... (Score:5, Interesting)

    by Mz6 ( 741941 ) * on Friday May 06, 2005 @01:36PM (#12453642) Journal
    Perhaps this is just Google's way of finding more links to add to its search index? Imagine gathering millions of websites that it may not have indexed or found yet. All from links that users of the GWA have visited... possible?
  • Privacy eh? (Score:5, Interesting)

    by funny-jack ( 741994 ) on Friday May 06, 2005 @01:36PM (#12453646) Homepage
    I found it a bit amusing that when I clicked the story link, the destination site, as well as three other sites, each attempted to save a cookie on my computer. Four cookies. To read a news story. That's necessary.
    • Re:Privacy eh? (Score:5, Interesting)

      by baadger ( 764884 ) on Friday May 06, 2005 @01:59PM (#12454045)
      Cookies are horrendously abused. There should never be a need for cookies until you choose preferences or login to a website.

      It's about time the net at large woke up to P3P, or better yet, webmasters started thinking before they mindlessly implement cookies for tracking their visitors.
  • by Jailbrekr ( 73837 ) <> on Friday May 06, 2005 @01:36PM (#12453649) Homepage
    It's a caching proxy server, for crying out loud. It caches web pages and feeds you the cached version. This is not new, nor is it surprising, especially for a new service offering.
    • by 44BSD ( 701309 ) on Friday May 06, 2005 @01:46PM (#12453840)
      It is more than a caching proxy.

      The client-side portion of the architecture aggressively prefetches content. It's a two-stage proxy, really, and the issue some people have with it is that the content in the portion on the end-user's hard drive is not content that the user asked for, but content that the proxy predicts the user will soon ask for.
    • It's a caching proxy server, for crying out loud. It caches web pages and feeds you the cached version. This is not new, nor is it surprising, especially for a new service offering.

      Actually, it is a bit surprising. Proxies are nothing new. All of the issues are well defined and have long since been worked through. Google really has little excuse for caching pages that should not be cached.

  • Sooooo (Score:3, Interesting)

    by liquidpele ( 663430 ) on Friday May 06, 2005 @01:36PM (#12453657) Journal
    What's the difference between this and your ISP?
    Your ISP could do the same stuff people claim Google can do (as far as tracking). I would like to know how the hell someone got logged in as someone else. That's just weird... perhaps the person just doesn't know what they are talking about, or a friend logged them in without their knowledge?
    • Re:Sooooo (Score:3, Insightful)

      by bogie ( 31020 )
      Did you Read The Fine Article?

      "I went to the Futuremark forums and noticed that I'm logged in as someone I don't know. Great, I've used Google's Web Accelerator for a couple of hours, visited lots of sites where I'm logged in. Now I wonder how many people used my cache. I understand it's a beta, sure, but something like that is totally unacceptable."

      I frankly don't know a ton about it since it fucked up my firefox install but others are giving the example of user X who has mod status browses www.popularfo
  • Not only that, but Google will conceal real web statistics from websites.

    Remember acquisition of Urchin? Here is my concern about Google Webaccelerator [].

  • I had to remove it from my system. It hijacked my browser, and I was not able to browse my company's internal websites because it overrode our proxy. Bummer, it worked great.
  • One has to worry about so many google apps and features and products in general.

    Using a ton of apps from one source is a risk on its own. Google appears to be great now. But what if they stepped over to the 'dark side' and started doing crazy stupid stuff?
  • Since Google looks like it wants to become Big Brother instead of helping the masses, Microsoft can come to the rescue with their own products that do it better with no strings attached and no fishy EULAs. Yeah, right. Where's the idiot who sold me the Brooklyn Bridge?
  • I ran it for about an hour; turns out it's lumpy when one deals with multiple proxy servers (work vs. home) and it broke Rhapsody in a BIG way. I'm sure the good folks at Google will sort it out eventually.

    OTOH, one must consider whether or not one trusts Google with one's information that way. I wanted to check it out, but probably, in the long run, wouldn't have used it. But it's worth noting that millions of people use ISP proxy servers without even knowing it (think transparent proxies) or without unde
  • by alphakappa ( 687189 ) on Friday May 06, 2005 @01:41PM (#12453751) Homepage
    The accelerator prefetches the links on web pages, in effect clicking on all of them (except ads), which includes links that say 'delete this' or 'unsubscribe' etc. Many webpages use GET links to do these actions, and this is causing pages to disappear. Until web apps are rewritten to take note of the prefetch header, it's probably unsafe to use the accelerator. (Which seems to be offline at the moment: the page redirects you to the toolbar.)
    • That page takes me directly to a spot to download the Web Accelerator - and when I clicked download, sure enough it started to save a file.
    • by Anonymous Coward

      in effect clicking on all of them (except ads), which includes links that say 'delete this' or 'unsubscribe' etc. Many webpages use GET links to do these actions

      Then they were coded by morons. Section 9.1.1. of RFC 2616 (the HTTP 1.1 specification) explicitly states that GET should not be used for unsafe actions:

      In particular, the convention has been established that the GET and HEAD methods SHOULD NOT have the significance of taking an action other than retrieval. These methods ought to be

    • by Jeff DeMaagd ( 2015 ) on Friday May 06, 2005 @02:06PM (#12454157) Homepage Journal
      If it is prefetching everything, then I would have a problem with that, from a different perspective. That increases the amount of bandwidth used by fetching a lot of pages that might not be followed. That means increased bandwidth costs unless enough users use the system that most of the given files are already in Google's cache.
    • by poot_rootbeer ( 188613 ) on Friday May 06, 2005 @02:20PM (#12454373)
      links that say 'delete this' or 'unsubscribe' etc. Many webpages use GET links to do these actions

      In which case, many webpages are BROKEN AS HELL.

      Come on, "webmasters". I knew well enough to implement any irreversible actions as a form with method=POST to prevent spiders from triggering them back in 1998. There's no excuse for a professional web developer to make that mistake in 2005.

      Google being the global aggregator that it is, though, should have expected the worst and foreseen that this kind of thing would happen and planned for it. Disappointing.
      • Come on, "webmasters". I knew well enough to implement any irreversible actions as a form with method=POST to prevent spiders from triggering them back in 1998.

        So did these people. But this isn't a spider. This is a monkey piggy-backing on an AUTHENTICATED USER SESSION.

        And I, for one, say it is time to punch that monkey.
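
The server-side remedy argued over in this sub-thread can be sketched in a few lines of Python. This is a hypothetical handler, not code from any forum package: `handle_request`, the `/delete/` path scheme, and the lowercased header dict are all assumptions made for the example. It refuses requests that announce themselves with the prefetch header and only performs a destructive action on POST.

```python
def handle_request(method, path, headers, posts):
    """Return an HTTP status code for a tiny, hypothetical forum backend.

    headers: dict whose keys are lowercased header names.
    posts:   set of post ids that currently exist.
    """
    # Refuse anything that announces itself as a prefetch.
    if headers.get("x-moz", "").lower() == "prefetch":
        return 403
    if path.startswith("/delete/"):
        # Destructive action: allowed via POST only, never via GET.
        if method != "POST":
            return 405
        posts.discard(path[len("/delete/"):])
        return 200
    return 200


posts = {"41", "42"}
status_get = handle_request("GET", "/delete/42", {}, posts)
status_prefetch = handle_request("GET", "/index", {"x-moz": "prefetch"}, posts)
status_post = handle_request("POST", "/delete/42", {}, posts)
```

With this shape, a prefetcher riding on an authenticated session can follow every link on the page and still delete nothing, because following a link is always a GET.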
  • Does anyone have experience using a caching web proxy for their home use? If so, did you see any browsing acceleration?
  • by Sebby ( 238625 ) on Friday May 06, 2005 @01:42PM (#12453777)
    We encountered similar problems when we implemented aggressive caching on our site, mostly because we didn't set the headers properly.

    this site [] was pretty useful for information. So was AOL webmaster resources [] info.

  • by oneiros27 ( 46144 ) on Friday May 06, 2005 @01:43PM (#12453793) Homepage
    If Google is ignoring Cache-Control headers, then that's one thing to complain about. There's also a good chance that some of these sites are using improper systems for session control (e.g., using REMOTE_ADDR without checking X-Forwarded-For, and not setting Cache-Control on their response).

    For more info about these known issues with HTTP caching, see the following
  • by Palos ( 527071 ) on Friday May 06, 2005 @01:44PM (#12453808)
    I tried this for a little bit, and really am not impressed. Some basic issues:

    From a user's point of view:

    1 - Ignores hosts file, so I end up seeing ads I normally wouldn't see

    2 - Cookies work weirdly, if at all; a lot more sites that I visit frequently appear to use cookies, and I've noticed some definite weirdness

    3 - The time saved on a broadband connection really seems minimal, after an hr or two of surfing it takes a few seconds

    4 - The pre-fetching it supports is already in firefox and probably other browsers

    From a webmaster's point of view:

    1 - No way to limit caching of certain pages outside of moving them to SSL. Robots.txt isn't being followed (although probably rightly so, based on the application).

    2 - Because of the flawed cookie support (at least right now), a lot of affiliate and other advertising methods have to be modified to support this.

    I'm a big Google fan, and I use most of their applications daily, but this one definitely needs some work. :)

  • "The service is only available to broadband subscribers."

    I read the FAQ and it said it is doubtful that the Google Web Accelerator will have any effect on dialup connections, as it was designed for broadband.
    It doesn't say that it is not available for dialup users. Sounds like a hurried article to grab some headlines.
  • []

    Really insightful.
    • Really insightful.

      A blatantly exaggerating troll.

    • That's not insightful. His site is broken, and Google shows information it shouldn't as a result. There are probably other cases in which the site breaks that hadn't been noticed yet because not that many people used a caching proxy before now. The rest of the article said not to idolize a company, because in the end, it's a company. That's not insightful, it's common sense.
  • Adsense clicks (Score:5, Interesting)

    by broothal ( 186066 ) <> on Friday May 06, 2005 @01:44PM (#12453820) Homepage Journal
    Has anyone read how Google will deal with AdSense clicks? Since all users of the accelerator will come from the same IP, will that IP decrease in value? (It's well known that the same IP can't just click again and again and generate revenue.)
  • i've had it installed at home since i saw it on slashdot... i must not go to common sites or something, because i'm still sitting at 0.0 seconds saved.
  • NoCache directive (Score:4, Insightful)

    by Sir Pallas ( 696783 ) on Friday May 06, 2005 @01:46PM (#12453838) Homepage
    Shouldn't those sites be using the NoCache directive and shouldn't Google be honoring it? I wonder which side is at fault. At any rate, fears about information leakage are kind of silly because of the volume of traffic that Google services. The accelerator allows them to see link patterns, but no one could store, let alone process, an entire day's worth of data after the fact. The same is true for Google Mail: no person ever sees your email; an algorithm does, and tailors simple, pertinent advertising in exchange for an otherwise free service. The accelerator can only make the search engine better for everyone. Anyone that uses it is giving back, contributing to the synergistic knowledge of Google.
    • Yeah, the NoCache directive is almost *never* respected. Most of the time, it shouldn't be, actually. It doesn't really serve any purpose: browsers are smart enough to check for new versions, so there's no need to explicitly tell them to.
  • by Chris_Jefferson ( 581445 ) on Friday May 06, 2005 @01:48PM (#12453871) Homepage
    The business with appearing to be logged on isn't quite as serious as it sounds (although it is still bad).

    The problem appears to be that you will sometimes be given a page that was personalised for someone else. However, if you attempt to do anything from that page (for example, if you find yourself looking like the admin of a web board), you'll find that it doesn't work, any more than it would if someone emailed you a copy of a page where they were logged in as admin and you clicked on links (if you are on a website where doing that would work, you already have serious security problems). It also doesn't occur with SSL, as Google doesn't do anything with SSL pages (as you would hope).

    This is still a problem if that page shows something private of course, and should be fixed. (a password of course being the worst case, but how often do you see your actual passwords printed on a webpage?)
  • Read about all of the username, forum, and security risks?

    Since such activity could pose both a security risk to web surfers and site owners, there are some web sites which are interested in not having Web Accelerator pick up their material.

    A very fast and efficacious method of denying Google Web Accelerator (GWA) funneled traffic access to your web site is blocking the IPs it is calling your pages from: 6 []
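
For anyone who wants to try the IP-blocking approach, the check itself is short. This Python sketch uses the standard `ipaddress` module; the network below is a placeholder from the documentation range reserved by RFC 5737, NOT the accelerator's real addresses (which are not reproduced in this thread), and `is_blocked` is a hypothetical helper name.

```python
import ipaddress

# Placeholder only: 192.0.2.0/24 is reserved for documentation (RFC 5737).
# Substitute the ranges you actually want to deny.
BLOCKED_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def is_blocked(remote_addr: str) -> bool:
    """Return True if the client address falls inside any blocked network."""
    addr = ipaddress.ip_address(remote_addr)
    return any(addr in network for network in BLOCKED_NETWORKS)


print(is_blocked("192.0.2.17"))    # inside the placeholder range
print(is_blocked("198.51.100.5"))  # outside it
```

Keep in mind that this denies every user behind those addresses, not just prefetch requests.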
  • by august sun ( 799030 ) on Friday May 06, 2005 @01:51PM (#12453917)

    lowtax of SomethingAwful makes some interesting points amidst all his fuming but I'll have to defer to the /. tech wizards to vet his technical claims.

  • by Momoru ( 837801 ) on Friday May 06, 2005 @01:53PM (#12453948) Homepage Journal
    Don't use it! Google is a public corporation, everything they make is designed to somehow make a profit (which i see nothing wrong with, btw)...even if it doesn't cache your personal information like the article claims, there is some angle to it that will make money for them, maybe they will look at your web surfing habits and target ads to you. If you're one of those people who blindly trusts google because of their "don't be evil" mission statement, then use it and trust that Google is taking care of you. I personally don't trust them, so I won't use it. There is no free lunch.
  • When I uninstalled it, it broke firefox - it wouldn't startup any more, complaining about some x-asl binding. On reinstalling firefox, I get a huge status bar with the text <key id="key_openHelp" ------------^

    I don't know what it did, but my firefox is not happy now.
  • Does anyone know any free alternatives for the GWA? ones that work on windows? I'm aware of squid, any others?
  • by RebornData ( 25811 ) on Friday May 06, 2005 @02:06PM (#12454146)
    I just deleted the accelerator from my system after trying it for the last day, and I must say that it is much less mature than most of the "Beta" products google releases. It caused several significant issues with Firefox on my system, including:

    1. Links that open another window stopped working entirely (although they worked if I right-clicked and selected "open in new tab")

    2. Even after closing all Firefox windows, a firefox.exe process would remain running, and prevent any new firefox windows from being opened until it was manually killed

    3. "Proxy not available" errors when opening several pages at once, such as when using the Firefox "open in tabs" on a folder of bookmarks.

    And I haven't even checked into some of these cookie / privacy issues. Perhaps these issues are unique to my system, but my environment is pretty vanilla... I just run a few of the more popular Firefox plugins. Removing the GWA cleared up all of the problems cited above.

    Up to this point, I've always been very impressed with the level of testing that has gone into Google software products before they enter Beta. In this case, I'm not. Hope this isn't a sign of things to come.

  • this is really going to blow a hole in the marketing schemes of aol, earthlink, netzero, netscape, and others who depend on the accelerator feature. google has leveled those in one fell swoop. i expect the stocks of dialup-centric companies to drop significantly.
  • by Temporal ( 96070 ) on Friday May 06, 2005 @02:21PM (#12454388) Journal
    I assume Google has properly implemented the HTTP/1.1 caching mechanisms. Among these, it is possible for a server to mark a page as being "private", meaning that it should never be cached in a public cache like Google's. Another thing the server can do is set "Vary: Cookie", which indicates that the server will produce different pages for people who give it different cookies.

    Here are the headers that the Futuremark forums give me when I am logged in:
    HTTP/1.1 200 OK
    Date: Fri, 06 May 2005 18:10:16 GMT
    Server: Apache/1.3.29 (Unix) mod_perl/1.29
    Transfer-Encoding: chunked
    Content-Type: text/html
    As you can see, neither "Cache-Control: private" nor "Vary: Cookie" is given. In fact, the server doesn't even give an expiration date for the content. Under these conditions, the HTTP/1.1 protocol says that it is perfectly OK for a cache to keep this page for a while and serve it to other people.

    This problem is firmly the fault of the people who wrote Futuremark's forums. This constitutes a major security hole in the WWWThreads forum package, because this problem will occur when using any standards-compliant HTTP cache. I would strongly recommend against the use of these forums on any web site until they fix their security problems.

    (I do not know if other forum software has this problem, but frankly it would not surprise me. It seems lots of PHP developers and other high-level web programmers have no idea how HTTP/1.1 works, and assume that headers are completely unimportant. I have written a web server and forum software myself, though, and I made damned sure that mine produces the right headers.)
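
To make the header argument concrete, here is a simplified Python sketch of the decision a shared cache faces. It is an illustration under stated assumptions, not a complete RFC 2616 implementation: it ignores `Expires`, `s-maxage`, `Authorization`, and heuristic freshness, treats `no-cache` as "do not reuse", and expects header names in their canonical capitalization. The function names are made up for the example.

```python
def cache_control_directives(headers):
    """Return the set of lowercased Cache-Control directive names."""
    value = headers.get("Cache-Control", "")
    return {part.strip().split("=")[0].lower() for part in value.split(",") if part.strip()}


def may_share(response_headers):
    """True if nothing in Cache-Control forbids replaying the response to other users."""
    forbidden = {"private", "no-store", "no-cache"}
    return not (cache_control_directives(response_headers) & forbidden)


def cache_key(url, request_headers, response_headers):
    """Key on the URL plus every request header the response names in Vary."""
    vary = [name.strip() for name in response_headers.get("Vary", "").split(",") if name.strip()]
    return (url,) + tuple((name.lower(), request_headers.get(name, "")) for name in vary)


# The response headers quoted above for the Futuremark forums.
futuremark = {
    "Date": "Fri, 06 May 2005 18:10:16 GMT",
    "Server": "Apache/1.3.29 (Unix) mod_perl/1.29",
    "Transfer-Encoding": "chunked",
    "Content-Type": "text/html",
}
print(may_share(futuremark))                                  # nothing forbids sharing
print(may_share({**futuremark, "Cache-Control": "private"}))  # now it is forbidden
```

With `Vary: Cookie` on the response, `cache_key` yields a different key for each cookie value, so one user's page is never handed to another; without it, every user maps to the same entry.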
    • Interesting. Microsoft is doing the "right thing" with IIS6:
      Date: Fri, 06 May 2005 18:31:39 GMT
      Server: Microsoft-IIS/6.0
      X-Powered-By: ASP.NET
      Content-Length: 5905
      Content-Type: text/html
      Expires: Thu, 05 May 2005 18:31:38 GMT
      Cache-Control: private
      This is apparently the default.
      • Of course, you should only slap the "private" header on pages which are actually private. Otherwise you're just killing the ability of the cache to do its job. But, marking everything private is better than marking nothing private; the former just reduces performance while the latter is a security problem.

        It looks like, which simply redirects to, is marked "private". That's excessive, and indicates to me that Microsoft's web designers don't understand cache-friendliness o
        • by Godeke ( 32895 ) * on Friday May 06, 2005 @04:02PM (#12456002)
          No, my point was exactly that "marking everything private is better than marking nothing private": this was the header from a site I built. Now that I'm aware of the ramifications, I can remove that header from the appropriate pages (the few that are not data driven). But I far prefer the default this way to discovering "oh yay, all my data driven pages are stupidly cached". Right now the site is just rude and uninformed, not broken.

          As far as Microsoft's sites, I really couldn't care less how stupid their choices are; I'm just glad I can now implement it properly by adding the change where necessary instead of having egg on my face for not having a piece of information when I built the site. While building the site, the only cache I considered was the browser cache. Bad, but not as bad as what I'm finding on my personal PHP driven sites on this same issue. There I just look stupid:
          Date: Fri, 06 May 2005 20:00:49 GMT
          Server: Apache/1.3.33 (Unix) mod_jk2/2.0.0 mod_auth_passthrough/1.8 mod_log_bytes/1.2 mod_bwlimited/1.4 FrontPage/ mod_ssl/2.8.22 OpenSSL/0.9.6b PHP-CGI/0.1b
          Last-Modified: Fri, 30 Nov 2001 20:02:22 GMT
          Etag: "2bed0b-1b27-3c07e5ce"
          Accept-Ranges: bytes
          Content-Length: 6951
          Keep-Alive: timeout=10, max=100
          Connection: Keep-Alive
          Content-Type: text/htm
          (Um, yeah, haven't updated that ugly site in four years).
  • Here's some code to add to your web pages to block GWA. This will leave static media alone, which is fine.

    // Reject requests that announce themselves as prefetches.
    if (array_key_exists('HTTP_X_MOZ', $_SERVER)
        && strtolower($_SERVER['HTTP_X_MOZ']) == 'prefetch') {
        header("HTTP/1.1 403 Forbidden");
        header("Content-Type: text/html; charset=iso-8859-1");
        header("Expires: Mon, 26 Jul 1997 05:00:00 GMT");
        header("Cache-Control: no-store, no-cache");
        // The second argument (false) appends rather than replaces the previous Cache-Control header.
        header("Cache-Control: post-check=0, pre-check=0", false);
        header("Pragma: no-cache");
        exit;
    }


  • by Everyman ( 197621 ) on Friday May 06, 2005 @03:06PM (#12455073) Homepage
    Why is Google doing this?

    If the purpose is to speed up web access, then why couldn't all this gzip compression, prefetching, and so forth, be handled on your local drive without going through Google? Wouldn't that be faster? Not everyone lives next door to a Google data center (not yet, anyway), and there is latency when you hop around the web to get stuff from Google. The accelerator installation file isn't exactly lean (1.4 meg), so I don't understand why Google has to broker all of this stuff on their servers.

    Google claims that there's no more of a privacy issue with this thing than there is with your ISP. However, I think most ISPs are a bit different than Google.

    My ISP has no reason to store its logs indefinitely. Google has every intention of storing everything about me forever. My ISP rotates their logs regularly, while Google indexes and compresses their logs using globally-unique IDs, and stashes it away for future reference. My ISP is not the world's largest advertiser, but Google is determined to "know more about you" (Eric Schmidt's words) for profiling purposes. My ISP has a real privacy policy, and I believe that they would demand a subpoena before giving out information about my surfing behavior. Google has never suggested that they even require a subpoena from officials, so I have to assume that they have a very cozy relationship with various governments.

    All that is from the user's perspective. What about webmasters?

    The web accelerator ignores robots.txt. The web accelerator ignores the NOARCHIVE meta. I believe, but have yet to confirm, that it ignores any no-cache pragma headers. It avoids prefetching anything with a question mark in the URL, but what about all those PATH_INFO dynamic links we've been installing for the last four years so that our dynamic pages look like static URLs? Google prefetches many of these, and there are numerous reports that this prefetching, along with some cookie mishandling by Google, is breaking sites out there. Does Google care?

    Why isn't there a sitewide opt-out option for this monster? Heck, it's so bloody dangerous for both the user and the webmaster that it ought to be opt-in instead of opt-out.

    All webmasters should block this thing. If a user cannot get to your site because of this block, then at least you as a webmaster won't be complicit. We have to protect users from Google's megalomania, because they've been so dumbed-down by Google worship over the last few years that they can no longer think straight.
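
The prefetch rule described above (skip any URL containing a question mark, follow everything else) can be sketched in Python as follows. This is a reconstruction from the behaviour reported in this comment, not Google's code; `prefetch_candidates` and the sample links are hypothetical.

```python
def prefetch_candidates(links):
    """Keep only the links a question-mark rule would allow to be prefetched."""
    return [link for link in links if "?" not in link]


links = [
    "/forum/view.php?id=42",  # skipped: has a query string
    "/forum/view/42",         # kept: PATH_INFO-style dynamic page
    "/forum/delete/42",       # kept: looks static, but performs an action
    "/about.html",            # kept: genuinely static
]
print(prefetch_candidates(links))
```

The third link is the troublesome case: nothing in the URL distinguishes it from a static page, so a rule based only on the question mark will fetch it.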
    • so I don't understand why Google has to broker all of this stuff on their servers.

      Never heard of the slashdot effect? Well if everyone is using this, it will eliminate it. Google downloads the site's content, everyone downloads from google, site stays up.

    • Response (Score:5, Informative)

      by Otto ( 17870 ) on Friday May 06, 2005 @07:17PM (#12458339) Homepage Journal
      The web accelerator ignores robots.txt.

      The web accelerator is not a robot, so this is correct behavior.

      The web accelerator ignores the NOARCHIVE meta.

      NOARCHIVE is a Google-specific extension to the robots meta tag, and again, this is not a robot.

      I believe, but have yet to confirm, that it ignores any no-cache pragma headers.

      I'd be absolutely shocked if that were actually the case. I also believe it respects the Expires header as well as the Cache-Control header.

      It avoids prefetching anything with a question mark in the URL, but what about all those PATH_INFO dynamic links we've been installing for the last four years so that our dynamic pages look like static URLs? Google prefetches many of these, and there are numerous reports that this prefetching, along with some cookie mishandling by Google, is breaking sites out there. Does Google care?

      If they're following the proper standards, then it's not their place to care or not. If your website doesn't properly specify cache-control (many don't) then you get what you get.

      For any pages with user-specific content, add the "Cache-Control: private" header and voila, problem solved for you.

      If you want to opt out entirely, then a simple "Cache-Control: no-cache" header in your HTTP responses would do the trick, as would "Pragma: no-cache", I bet.

      Furthermore, there is no cookie mishandling that I've actually seen, and I've tested it. It passes cookies through just fine, without caching them, as near as I can tell.
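
The two header recipes suggested in this response can be captured in a small Python helper. This is only a sketch of the advice as given here: `cache_headers` is a hypothetical name, and whether the accelerator honours these headers is the commenter's expectation, not something this code verifies.

```python
def cache_headers(user_specific=False, opt_out=False):
    """Return the extra response headers for a page, following the advice above."""
    if opt_out:
        # Opt out of caching entirely; Pragma is included for HTTP/1.0 caches.
        return {"Cache-Control": "no-cache", "Pragma": "no-cache"}
    if user_specific:
        # Personalized page: browsers may keep it, shared caches must not.
        return {"Cache-Control": "private"}
    # Static, public content: leave caches free to do their job.
    return {}


print(cache_headers(user_specific=True))
print(cache_headers(opt_out=True))
print(cache_headers())
```

Marking only the personalized pages as private keeps the performance benefit of caching for everything else.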