
Google to Anonymize Users' Search Data 151

Google's official blog states that the company will start anonymizing its search data after 18-24 months. After previously fighting turning over search data to the feds, it looks like they are striking another blow against the "think of the children" crowd. Any bets on whether MSN or Yahoo! will follow suit?
This discussion has been archived. No new comments can be posted.

  • Re:right.... (Score:5, Insightful)

    by ag0ny ( 59629 ) <javi@lavand[ ]a.net ['eir' in gap]> on Thursday March 15, 2007 @07:46AM (#18360117) Homepage
    Why would Google have to comply with EU regulations? :?

    Maybe because they do business in Europe?
  • Re:0 months? (Score:5, Insightful)

    by cdrudge ( 68377 ) on Thursday March 15, 2007 @07:49AM (#18360137) Homepage
    My guess is that they don't do it immediately because there is internal business value in mining the data: user patterns, length of stay, etc. After 18 or 24 months, the internal value has dropped significantly, as things change quickly. I would have thought that the value would have dropped even quicker than that, say after 6 months or maybe a year.
  • by jacquesm ( 154384 ) <j@NoSpam.ww.com> on Thursday March 15, 2007 @07:49AM (#18360139) Homepage
    I never got why google needs to keep all that history without anonymizing it.

    There is - as far as I can see - no rational argument that has to do with improving search results because you have them tied to individuals.

    And yes, keeping tabs on half the globe is evil too...

  • Re:Uhm (Score:5, Insightful)

    by Rakishi ( 759894 ) on Thursday March 15, 2007 @07:50AM (#18360147)
    And anonymous proxies, unlike Google, do not need to make money or provide much of a service; logs are very useful for such things.
  • According to TFA (Score:5, Insightful)

    by ReallyEvilCanine ( 991886 ) on Thursday March 15, 2007 @07:50AM (#18360149) Homepage
    Google plan to make it "more anonymous". Like pregnancy, data either ARE anonymous or they ain't. You can't qualify an absolute, and "anonymous" is an absolute condition indicating lack of information.
  • Re:Uhm (Score:5, Insightful)

    by Whiney Mac Fanboy ( 963289 ) * <whineymacfanboy@gmail.com> on Thursday March 15, 2007 @07:51AM (#18360153) Homepage Journal
    All they have to do is erase the logs every day or just not keep them. It doesn't "take an effort". Anonymous proxies have been doing this for years.

    I know where you're coming from, but that would kinda fuck with their targeted advertising business model, dontcha think?
  • Re:Uhm (Score:3, Insightful)

    by jacquesm ( 154384 ) <j@NoSpam.ww.com> on Thursday March 15, 2007 @07:53AM (#18360171) Homepage
    It doesn't have to. After all, the targeted ads are supposedly targeted to the *content* of the pages and your search query. No need to keep that for two years in order to target it better unless you have other plans with my data (such as selling my 'profile').

  • by solevita ( 967690 ) on Thursday March 15, 2007 @08:08AM (#18360243)

    Stop googling for "jihad death to american president" if you're worried about getting caught.
    You're correct. The only people that demand privacy are those up to no good. How about I come over to your house later, sit in your bed for a bit, go through your drawers and your phone records, take some pictures of you and your friends, ask the neighbours some pressing questions?

    If you've got nothing to hide, you should have no problem with this.
  • by GweeDo ( 127172 ) on Thursday March 15, 2007 @08:26AM (#18360359) Homepage
    There is no need? What about the monetary need? Google doesn't really care who you are, but they do care about what you are looking for. The more they know about what you are looking for the better their AdSense program can do. The better it does, the more money they make.

    As for your whole "we have privacy" bit, sure you do. In your own home while using your stuff. The moment you sent your request out over the internet in plain text to a third party (that is a corporation out to make money, you know) you lost that.
  • by garcia ( 6573 ) on Thursday March 15, 2007 @08:32AM (#18360397)
    Stop googling for "jihad death to american president" if you're worried about getting caught.

    Excuse me?! I live in America and if I want to research the results of the search terms "jihad death to american president" I'm well within my fucking rights.

    Fuck you for saying otherwise.
  • Re:Uhm (Score:4, Insightful)

    by daeg ( 828071 ) on Thursday March 15, 2007 @08:43AM (#18360481)
    I'm between the two extremes of agreeing with you and agreeing that data needs to be retained. As any of us who have taken a statistics class (or four) can tell you, you don't need access to the whole sample to provide accurate data. So, say, for instance, the Google engineers were working on a specific niche of the web, say, dog lovers. If I were designing something to better suit dog lovers, my first step would be pulling a report on the common search patterns of people that search for dog-related topics.

    Historical data that identifies a unique user is extremely useful. I do the same thing with our Intranet search and report tools. If I want to improve something, oftentimes the logs will give a very telling tale. (This accounting department employee searched for "expense", then "expense excel", then "expense spreadsheet", then "expense log", finally getting his document. I can then add the keywords 'excel' 'spreadsheet' to the actual document entry.) That said, you don't actually need to know who the unique user is, for all intents and research purposes, User5486734067 is just as useful as an IP+Cookie.
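The point that an opaque ID is as useful for this kind of log analysis as an IP+Cookie pair can be sketched roughly as follows. This is only an illustration under assumptions: the log layout, the `pseudonymize` helper, and the sample entries are all hypothetical, not anything Google or the poster's Intranet tool is known to use.

```python
from collections import defaultdict
from itertools import count


def pseudonymize(log):
    """Replace each (ip, cookie) pair with an opaque ID such as 'User1'.

    The mapping from real identifiers to opaque IDs lives only inside this
    function and is discarded on return (hypothetical design choice).
    """
    ids = {}
    counter = count(1)
    out = []
    for ip, cookie, query in log:
        key = (ip, cookie)
        if key not in ids:
            ids[key] = f"User{next(counter)}"
        out.append((ids[key], query))
    return out


# Hypothetical raw log entries: (ip, cookie, query)
log = [
    ("10.0.0.5", "c1", "expense"),
    ("10.0.0.9", "c2", "vacation policy"),
    ("10.0.0.5", "c1", "expense excel"),
    ("10.0.0.5", "c1", "expense spreadsheet"),
]

# Rebuild each user's search trail from the pseudonymized records only.
trails = defaultdict(list)
for user, query in pseudonymize(log):
    trails[user].append(query)
```

The search trail for the accounting example survives intact under the opaque ID, which is all the keyword-tuning analysis needs.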
  • 18-24 months? (Score:2, Insightful)

    by JackMeyhoff ( 1070484 ) on Thursday March 15, 2007 @08:53AM (#18360543)
    Which is it? 18, 19, 20, 21, 22, 23 or 24?
  • by tomstdenis ( 446163 ) <tomstdenis@gma[ ]com ['il.' in gap]> on Thursday March 15, 2007 @09:08AM (#18360705) Homepage
    This is why it pays to have a modicum of computer knowledge.

    Assuming you're not trolling...

    When you send a query to google, it goes over the "internet" in the clear. That is, not encrypted. Anyone who can see it can read it. Well who can read it? Turns out a lot of people. Between me and google are probably 10 different boxes. 5 of which are just my ISPs routers. The other five are boxes on other networks, not even related to Google.

    There is no inherent requirement for privacy like there is with telephones (maybe there ought to be one). But that said, you're giving your data to Google, willingly no less. That gives them every right to record it. You gave them permission by using their service. I guess you never read their TOS [google.ca], which is your fault, not theirs. Think about the analogy in the real world. This is like you handing your driver's license to every stranger you meet, then getting upset when some of them write it down.

    If you don't want your assets [IP, location, name, platform, etc] leaked to Google you should use an anonymous proxy.

    Tom
  • by tomstdenis ( 446163 ) <tomstdenis@gma[ ]com ['il.' in gap]> on Thursday March 15, 2007 @09:10AM (#18360733) Homepage
    I'm not against google cleaning their logs. I'm against people claiming this is a privacy issue.

    Google logging all your queries: Not a privacy problem.

    Bank leaking your SSN via stolen laptop: Privacy problem.

    AOL knowing that you like midget porn: Not a privacy problem.

    Government using sub-standard contractor to manage passport data, later turns up on broken into computer: Privacy problem.

    By crying wolf every time "data" is mentioned you desensitize people to real privacy problems.
  • List of nifty little phrases that have bitten their speakers in the ass:

    • They will never bomb Berlin
    • Read my lips, no new taxes
    • I did not have sex with that woman
    • Mission accomplished
    • Don't be evil

    Now Google brings us:

    Let's just be less evil, now that we've been caught.

  • ADVERTIZING (Score:3, Insightful)

    by everphilski ( 877346 ) on Thursday March 15, 2007 @09:48AM (#18361121) Journal
    It's all about the advertising. Google's knowledge of you lets them advertise to you more effectively.
  • by sherriw ( 794536 ) on Thursday March 15, 2007 @10:42AM (#18361797)
    Personally I think it's all a load of BS. If they really cared about our privacy, and if all they really needed my IP addy for is to aggregate my searches to 'better serve me', then all they have to do is one-way hash my IP addy. Then they can still tie all my searches together, and my gmail and such, but they wouldn't be able to back track it. And the govn't could demand all they want... you want the IP of the user who searched this? Here it is Mr. Bush... go nuts: x867:%dsgfk435j>67&*g[fg

    So forgive me if I don't get all thankful for Google's big gesture. Heh.
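The one-way hashing idea above might look something like this. It is a minimal sketch, assuming SHA-256 as the hash and made-up sample addresses and queries; the `pseudonymize_ip` name is hypothetical, and nothing here reflects how Google actually stores its logs.

```python
import hashlib


def pseudonymize_ip(ip: str) -> str:
    """Return the hex SHA-256 digest of the dotted-quad IP string."""
    return hashlib.sha256(ip.encode("ascii")).hexdigest()


# Hypothetical (ip, query) records using documentation-range addresses.
searches = [
    ("203.0.113.7", "dog food"),
    ("198.51.100.2", "cat toys"),
    ("203.0.113.7", "dog leash"),
]

# Aggregate by hashed IP: the same address always yields the same token,
# so one user's searches can still be tied together.
by_user = {}
for ip, query in searches:
    by_user.setdefault(pseudonymize_ip(ip), []).append(query)
```

Because the hash is deterministic, aggregation keeps working while the stored token no longer reads as an IP address.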
  • by Impy the Impiuos Imp ( 442658 ) on Thursday March 15, 2007 @11:08AM (#18362161) Journal
    People searching for their social security numbers just for the hell of it, or their CC numbers, and presto! Now real numbers exist in some "Google history list" for ever and ever.

    There's a goldmine of data there. "Anonymizing" it doesn't affect this, unless they have filters to try to recognize such and get rid of it.

    Still, if it's in the form of "User X" searched for these 132 terms last month, some terms might identify them and hence link them to other things like their unfortunate search for "donkey love".

    E.g.

    1234 Fake Street (suppose it's your real address)

    +britney +bald +"bald down there"

    What does "bedonk-i-donk" mean?

    fat asses with tiny waists

  • by santiago ( 42242 ) on Thursday March 15, 2007 @11:39AM (#18362785)
    There are 2^32 IP addresses under IPv4. If Google is doing the hashing, then they know the hash function. How long do you think it would take them to brute-force break the hash by hashing every possible IP address and creating a map from the hashed values back to the originals? Express your answer in microseconds.

    (If your solution is to increase the space of inputs by adding a variable salt value, please explain how this allows them to use the resulting hashes for aggregation.)
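Both halves of that objection can be sketched in a few lines. This is an illustration under assumptions: SHA-256 stands in for whatever hash might be used, the `hash_ip` and `build_reverse_table` helpers are hypothetical, and the enumeration covers a single /24 (256 addresses) so it runs instantly; covering all of IPv4 would be the same loop over 2^32 inputs.

```python
import hashlib
import ipaddress
import os


def hash_ip(ip: str, salt: bytes = b"") -> str:
    """Hex SHA-256 of an optional salt followed by the IP string."""
    return hashlib.sha256(salt + ip.encode("ascii")).hexdigest()


def build_reverse_table(network: str) -> dict:
    """Map hash -> IP for every address in the given network."""
    return {hash_ip(str(addr)): str(addr)
            for addr in ipaddress.ip_network(network)}


# Brute force: whoever knows the hash function can enumerate the inputs.
table = build_reverse_table("203.0.113.0/24")
stored_token = hash_ip("203.0.113.7")
recovered = table[stored_token]

# Variable salt: a fresh random salt per record defeats the lookup table,
# but the same IP no longer maps to the same token, so grouping breaks.
token_a = hash_ip("203.0.113.7", os.urandom(16))
token_b = hash_ip("203.0.113.7", os.urandom(16))
same_token = token_a == token_b
```

The unsalted token is recovered by a plain dictionary lookup, and the salted tokens for one address differ from each other, which is exactly the aggregation problem the comment asks about.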
