Google Security

Scroogle Has Been Blocked 281

An anonymous reader writes "Scroogle, the secure third-party Google search interface, has been blocked by Google. Scroogle was an SSL-based search proxy that enabled one to search for and receive Google results over an SSL connection in a pseudo-anonymous manner."
This discussion has been archived. No new comments can be posted.

  • Scroogle (Score:4, Informative)

    by sopssa ( 1498795 ) * <sopssa@email.com> on Tuesday May 11, 2010 @11:11AM (#32169568) Journal

    While I would love to see a good rant towards Google and while I also myself use Scroogle, the summary isn't really being truthful. Google hasn't blocked anything, they just changed the page that Scroogle scrapes and they're throwing a hissy fit about it.

    From the Scroogle announcement:

    We regret to announce that our Google scraper may have to be permanently retired, thanks to a change at Google.

    That interface was at www.google.com/ie but on May 10, 2010 they took it down and inserted a redirect to /toolbar/ie8/sidebar.html. It used to have a search box, and the results it showed were generic during that entire time.

    Now that interface is gone. It is not possible to continue Scroogle unless we have a simple interface that is stable. Google's main consumer-oriented interface that they want everyone to use is too complex, and changes too frequently, to make our scraping operation possible.

    Google changing something isn't exactly "blocking" a third-party service. What's more, it takes just a few lines of code to get the results from the main Google search too. All the search results and links have appropriate HTML IDs associated with them, and it's been that way for years already.

    I have no idea why Scroogle is bitching about this.

    Oh well. I switched to ixquick [ixquick.com], which has the added benefit of being located in Germany rather than the US, plus a much better and more useful interface.

    -sopssa

    • Re: (Score:3, Interesting)

      by longacre ( 1090157 )
      What's the benefit of being in Germany?
      • Re:Scroogle (Score:4, Funny)

        by Registered Coward v2 ( 447531 ) on Tuesday May 11, 2010 @11:27AM (#32169810)

        What's the benefit of being in Germany?

        Maybe:

        Ve hav vays of making you benefit

        or,

        Ve no nutting, nutting,

        • Re:Scroogle (Score:4, Insightful)

          by negRo_slim ( 636783 ) <mils_orgen@hotmail.com> on Tuesday May 11, 2010 @12:43PM (#32171080) Homepage

          What's the benefit of being in Germany?

          I may be mistaken but I believe they have stronger privacy laws.

          • by Myopic ( 18616 )

            That could be, and if so it would be significant, but the benefit might simply come from the server being in pretty much *any* other country. I imagine it's a lot more footwork for a police department to get information from foreign servers versus domestic ones. It's one more barrier in front of your privacy. No barrier is perfect, but any barrier is better than none (if privacy is your goal).

      • Re:Scroogle (Score:5, Informative)

        by sopssa ( 1498795 ) * <sopssa@email.com> on Tuesday May 11, 2010 @11:28AM (#32169836) Journal

        From the FAQ [ixquick.com]:

        European Privacy Seal
        On July 14th 2008 Ixquick received the first European Privacy Seal from European Data Protection Supervisor Mr. Peter Hustinx. The Seal officially confirms the privacy promises we make to our users. It makes Ixquick the first and only EU-approved search engine. Both EU Commissioner Viviane Reding and Dr. Thilo Weichert, German Privacy Commissioner, complimented Ixquick on its privacy achievements.
        You can find the press release here.

        Since I am in EU, it also means US can't just randomly get data that doesn't belong to them, ie. for people from other countries. Frankly, EU and European countries take privacy a lot more seriously, for historical reasons too.

        • Re:Scroogle (Score:4, Interesting)

          by geekoid ( 135745 ) <dadinportlandNO@SPAMyahoo.com> on Tuesday May 11, 2010 @12:17PM (#32170610) Homepage Journal

          The US can't do that in the US either. Just an FYI.

          " Frankly, EU and European countries take privacy a lot more seriously,"
          Care to back that up? I mean when you can take time away from being on public video, told what you can and can not say, carrying papers,

          It would be more correct to say it treats privacy differently, which makes sense because what is considered 'privacy' is different. For example, what you do in public can be considered 'private' in some countries.

          Of course, it's such a patchwork in the EU that it's almost nonsense to use the EU as a general statement concerning privacy.

          • Re:Scroogle (Score:4, Insightful)

            by logjon ( 1411219 ) on Tuesday May 11, 2010 @12:28PM (#32170848)
            Not sure which US you live in, but here in the US I'm a citizen of, the government has unfettered access to communications, digital and otherwise. The patriot act took the last of American privacy, and with a hearty chuckle, wiped its ass with the remainder of the fourth amendment.
            • Re: (Score:2, Insightful)

              by Myopic ( 18616 )

              Nice. How's the weather there? The rest of us are stuck here in the United States of Reality.

          • by dAzED1 ( 33635 )

            ...carrying papers...

            Haven't been listening to the news about AZ lately, eh?

            • Re: (Score:3, Insightful)

              by dAzED1 ( 33635 )

              Let's look at what I was replying to:

              I mean when you can take time away from being on public video, told what you can and can not say, carrying papers

              Now, how am I taking the current AZ situation out of context? Is racial profiling not occurring, with people being told to show papers? In fact, that's exactly what is occurring. Which means, it is no longer valid to use Europe's habit of asking for papers as an indication that we have more liberties here - since that is now occurring here.

              Note also that I

          • Re:Scroogle (Score:4, Informative)

            by Smauler ( 915644 ) on Tuesday May 11, 2010 @12:52PM (#32171212)

            The Data Protection Act [wikipedia.org] in the UK was a result of the Data Protection Directive [wikipedia.org] from the EU. This severely limits what people and/or companies and/or governmental agencies (with some exceptions to the last) can do with information about you. I'm not aware of legislation as strong as this in the US, but I may be wrong.

            I do agree that different countries treat privacy differently - I personally believe that anything I do in public is basically that - public. I won't ever carry papers in my own country, so if somehow the ID card in the UK goes through (looking very very unlikely at the moment), I'll just lose mine every time I get a new one, and reapply. Some people don't have a problem with such things, but I do. The EU is a very diverse place, but that data protection directive means that all EU countries have similar laws with regards to data protection AFAIK.

        • Re:Scroogle (Score:4, Funny)

          by operagost ( 62405 ) on Tuesday May 11, 2010 @03:05PM (#32173090) Homepage Journal
          I'm glad that Eurasia has certified that we are safe from the prying eyes of Oceania.
      • Re: (Score:2, Informative)

        by Anonymous Coward
        real beer, excellent food, beautiful landscape
    • I love Startpage. As a metasearch engine, it's pretty darn good.

    • Re:Scroogle (Score:5, Interesting)

      by Jer ( 18391 ) on Tuesday May 11, 2010 @11:28AM (#32169846) Homepage

      What's more, the link they were scraping off of [www.google.com/ie] seems to be related to Google's support of Internet Explorer. Since it's been replaced with a "go get IE 8" page, it's probably been dumped to encourage people to dump their older versions of IE and get something newer.

    • Re:Scroogle (Score:5, Insightful)

      by cmiller173 ( 641510 ) on Tuesday May 11, 2010 @11:31AM (#32169882)
      You sir are absolutely correct. Anyone who writes an application based on screen scraping should expect changes to happen and not act surprised when they do. Besides doesn't Google have a freaking search API? http://code.google.com/apis/ajaxsearch/ [google.com]
      • Re:Scroogle (Score:5, Informative)

        by _xeno_ ( 155264 ) on Tuesday May 11, 2010 @12:42PM (#32171068) Homepage Journal

        Because if they did that, they'd be forced to abide by the search Terms of Service [google.com]. And they appear to be violating Section 1.4.

        By using the generic web robot approach, they're allowed to scrape Google based on the same concepts that allow Google to scrape third party web pages in the first place.

        From Google's robots.txt [google.com]:
        User-agent: *
        [snip]
        Disallow: /ie?

        Well, OK, so they're not obeying robots.txt in the first place. But ignoring that one pesky fact, uh...
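        [Editor's sketch] The robots.txt rule quoted above can be checked mechanically. This is a minimal illustration using Python's standard urllib.robotparser; it feeds in only the two lines quoted in the comment (the "[snip]" part is left out, not reconstructed), and the agent name "ExampleScraper" and the test URLs are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Only the two lines quoted in the comment above; the elided "[snip]"
# rules are deliberately not guessed at.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /ie?",
]


def is_allowed(url, agent="ExampleScraper"):
    """Return True if `agent` may fetch `url` under the quoted rules.

    "ExampleScraper" is a hypothetical user-agent name; it falls under
    the wildcard "User-agent: *" entry.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_LINES)  # parse in-memory lines, no network access
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # A query against the old /ie interface versus an unrelated path.
    for url in ("http://www.google.com/ie?q=scroogle",
                "http://www.google.com/about"):
        print(url, "->", "allowed" if is_allowed(url) else "disallowed")
```

        Under these two lines alone, a query against /ie is disallowed for every robot, which is the point the comment is making.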

    • Re:Scroogle (Score:5, Insightful)

      by JustinOpinion ( 1246824 ) on Tuesday May 11, 2010 @11:38AM (#32170002)
      Indeed. Scroogle sounds like a good idea... but it's a service that exists parasitically to Google proper. I'm not trying to imply anything unethical by using the word "parasite", but this really is a situation where Scroogle uses Google's capabilities/services without contributing anything back to Google. This is fine to the extent that Google tolerates it. But they are under no obligation to make accommodations to keep these third-party services running smoothly. TFA says "It's not as if Google needs the money" which seems rather uncharitable given that Google has put up with Scroogle's operations for many years now without any complaints or blocking attempts (that I'm aware of). And Google does need some money (they would have to shut down if everyone used their services through Scroogle...).

      Scroogle needs to either adjust their service to keep up with Google's changes, or make a business case to Google for why it is in their best interest to provide a stable interface/API for third-party redistributors like them. The implication in TFA that they are somehow entitled to this interface/API/access is really silly.
      • by cynyr ( 703126 )

        Doesn't http://code.google.com/apis/ajaxsearch/ [google.com] count? Also, I'm fairly sure that it shouldn't be that hard to get the links out of the normal Google search (but it's been a while since I looked at the HTML of the results). Yep:

        a href=foo class=1> name /a> (sry i had to mangle it a bit, but the idea should be clear)

        The results are the only links to have a class of 1. so a simple matter of parsing it as XML should work.

      • Re: (Score:2, Informative)

        by Anonymous Coward

        but it's a service that exists parasitically to Google proper. I'm not trying to imply anything unethical by using the word "parasite", but this really is a situation where Scroogle uses Google's capabilities/services without contributing anything back to Google.

        The word you're looking for is commensalism [wikipedia.org]. Although I think in this case it is closer to parasitic since it does use some of Google's resources without giving back much or any value to Google itself.

    • Re:Scroogle (Score:4, Interesting)

      by Anonymous Coward on Tuesday May 11, 2010 @11:43AM (#32170062)

      A couple fun facts about Scroogle:
      1. Since Scroogle hit multiple Google IPs, it used to be possible to search the same keywords 5 or 6 times in a row and see the variation in page rank. Great for web site owners to see how they ranked.
      2. Scroogle dot COM is NSFW at all, so when telling people about Scroogle it was usually CRUCIAL to emphasize the dot ORG part of the domain. At a previous job I made the mistake of telling my boss about it without emphasizing the dot ORG part and, well... he got an eyeful of the wrong type of "org"...

    • by bpechter ( 2885 ) on Tuesday May 11, 2010 @11:44AM (#32170080) Homepage

      The wife used the www.google.com/ie interface for accessibility reasons. It worked much better for her with her screen reader. She's totally blind. She'll miss the interface and I know there were others using it for the same reason.

      • by mea37 ( 1201159 ) on Tuesday May 11, 2010 @11:57AM (#32170268)

        I'm legally blind (but not to the extent that I require a screen-reader) and certainly I advocate for accessibility features. But, just like the /ie interface wasn't intended to be a stable screen-scraping interface for Scroogle, it wasn't intended to be an accessibility feature. That's the problem with using things in unsupported ways. Sure, they may work now - but you have no assurances going forward.

        I'd suggest your wife, and anyone else who finds Google's support for low-vision users lacking, contact them and start lobbying for a proper solution that they will then have proper knowledge of and reason to support.

    • Re: (Score:3, Informative)

      by al0ha ( 1262684 )
      Actually, IxQuick is located in The Netherlands, which is certainly not Germany, and it's a darn good thing that it is not: Holland does actually concern itself with privacy protection. Germany, on the other hand, not so much.
  • by stagg ( 1606187 ) on Tuesday May 11, 2010 @11:14AM (#32169606)
    If you RTFA you'll notice that Google didn't block Scroogle, they just upgraded without consideration to its functionality. As soon as someone can explain why Google WOULD have Scroogle on a dependency chart we can all put our conspiracy hats back on.
  • The Summary Lies! (Score:4, Insightful)

    by dancingmilk ( 1005461 ) on Tuesday May 11, 2010 @11:15AM (#32169628) Homepage Journal

    What a horrible summary. Google didn't block anything, they just changed the page that Scroogle scrapes off of. Scroogle claims that they need a "simple" interface to scrape off of. Sounds to me like they are too lazy to adjust their service.

    • Sounds to me like they are too lazy to adjust their service.

      ...at the whim of an update schedule that is irregular, unannounced, and liable to massive changes that would break scraping.

      • Re: (Score:3, Insightful)

        by idontgno ( 624372 )

        break scraping.

        Scraping is inherently unreliable. Particularly if you're scraping without the data source's permission or cooperation. It's what you do with the bottom of the barrel.

        If you want reliable, you won't be doing any scraping. If you're doing scraping, don't get bent out of shape when it suddenly stops working. By choosing a scraping solution, you've committed yourself to intermittent service and a continual race to keep up with target interface changes.

        Or you can use the provided API [blogspot.com]? Yes, it h

        • I’m not defending their choice to scrape... just pointing out that whether or not scraping was a good idea, their justification for not scraping (any longer) is perfectly reasonable.

          If you’re asking “why are they scraping in the first place”... well, they found an interface to scrape from that they thought would never change... and surprisingly enough, it did.

          • by schon ( 31600 )

            just pointing out that whether or not scraping was a good idea, their justification for not scraping (any longer) is perfectly reasonable.

            The decision to not scrape is reasonable. What is not reasonable is the outrage and blame placed on Google.

            they found an interface to scrape from that they thought would never change

            So in other words, they're stupid.

            • I really can’t say I disagree.

              I’m just saying it’s not based on quite as much laziness as you made it seem. Yes, they’re lazy, but scraping would’ve been the wrong solution anyway: it’s too much work.

      • by natehoy ( 1608657 ) on Tuesday May 11, 2010 @12:10PM (#32170482) Journal

        Scroogle has the absolute right to a refund for any and all money that they have paid Google because Google isn't living up to the contract where Scroogle pays Google for a stable connec... wait, what was that? Oh, I see. Never mind.

        Scroogle may be providing a service that people value, but they are still using Google to do it, and not paying Google for that access. Google is tolerating this, which is all well and good, but they are under absolutely no obligation to make sure the connection is unchanged. Sites change all the time, and anyone who employs scraping technology as part of their technological solution should not be surprised when they do.

    • The default search there now is easy to scrape, very easy in fact, despite the attempts to obfuscate the page.

      It's basically this:

      div id: ires -- contains the results
      li id: imagebox class g --- image results
      li class g -- normal result

      within li class: g there is: h3 class r - subject/title, and div class s -- result meta.

      It's very easy to scrape. In fact, anyone who has done a serious scraper with proper methods in the past could probably scrape the results properly with just 30-60 minutes of work, and fro
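      [Editor's sketch] The structure listed above (div id "ires", li class "g", li id "imagebox", h3 class "r", div class "s") can be walked with Python's standard html.parser. This is only an illustration under the assumption that the markup looks as the poster describes it in 2010; the sample HTML, the parse_results name and the output field names are invented for the example, and it simplifies by never tracking where the "ires" div closes.

```python
from html.parser import HTMLParser

# Hypothetical sample shaped like the structure described in the comment.
SAMPLE_HTML = """
<div id="ires"><ol>
<li class="g" id="imagebox"><h3 class="r">Images for example</h3></li>
<li class="g"><h3 class="r"><a href="http://example.com/">Example <em>Domain</em></a></h3><div class="s">An illustrative snippet.</div></li>
<li class="g"><h3 class="r"><a href="http://example.org/">Second result</a></h3><div class="s">Another snippet.</div></li>
</ol></div>
"""


class _ResultParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.results = []
        self._in_ires = False   # set once <div id="ires"> has been seen
        self._current = None    # result dict for the open <li class="g">
        self._field = None      # "title" inside h3.r, "meta" inside div.s
        self._field_tag = None  # tag that opened the current field
        self._depth = 0         # nesting depth of that tag

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "div" and attrs.get("id") == "ires":
            self._in_ires = True
            return
        if not self._in_ires:
            return
        if self._field is not None:
            if tag == self._field_tag:
                self._depth += 1
            # The first link inside the title is taken as the result URL.
            if self._field == "title" and tag == "a" and not self._current["url"]:
                self._current["url"] = attrs.get("href") or ""
            return
        if tag == "li" and "g" in classes:
            # Image results share class "g" but carry id "imagebox": skip them.
            if attrs.get("id") == "imagebox":
                self._current = None
            else:
                self._current = {"title": "", "url": "", "meta": ""}
        elif self._current is not None:
            if tag == "h3" and "r" in classes:
                self._field, self._field_tag, self._depth = "title", "h3", 1
            elif tag == "div" and "s" in classes:
                self._field, self._field_tag, self._depth = "meta", "div", 1

    def handle_endtag(self, tag):
        if self._field is not None and tag == self._field_tag:
            self._depth -= 1
            if self._depth == 0:
                self._field = None
        elif tag == "li" and self._current is not None:
            self._current["title"] = self._current["title"].strip()
            self._current["meta"] = self._current["meta"].strip()
            self.results.append(self._current)
            self._current = None
            self._field = None

    def handle_data(self, data):
        if self._field is not None and self._current is not None:
            self._current[self._field] += data


def parse_results(html):
    """Return a list of {"title", "url", "meta"} dicts for normal results."""
    parser = _ResultParser()
    parser.feed(html)
    return parser.results


if __name__ == "__main__":
    for result in parse_results(SAMPLE_HTML):
        print(result)
```

      As others in the thread point out, a parser like this breaks as soon as the class names or nesting change.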

  • Duck Duck Go (Score:4, Interesting)

    by Anonymous Coward on Tuesday May 11, 2010 @11:17AM (#32169660)

    I used to use Scroogle for privacy reasons, but switched to Duck Duck Go [duckduckgo.com] a few weeks ago. It is quickly becoming a great privacy-respecting alternative to Google and often gives more relevant results than Google.

    • Re: (Score:3, Insightful)

      by compumike ( 454538 )

      I too have been trying Duck Duck Go [duckduckgo.com] (link to encrypted version) for the last several weeks and have been impressed.

      Furthermore, check out their privacy policy [duckduckgo.com], as well as a recent blog post about search privacy [gabrielweinberg.com] that explains why it "might be the most private place to search the Internet". No IPs logged, no cookies, no contractors.

      There are also a large set of convenient "bang commands" [duckduckgo.com] such as searching "!slashdot foo".

      And finally, searching over (encrypted) HTTPS just works "out of the box".

      Give it a try

  • Umm (Score:5, Interesting)

    by mindstrm ( 20013 ) on Tuesday May 11, 2010 @11:18AM (#32169684)

    When google wants them to stop, they'll be hearing from lawyers........ not just finding that google changed their page layout.

  • by hey ( 83763 )

    Doesn't Google have a real search API they can use?
    Rather than using a kludge like google.com/ie (yuck)

    • Re: (Score:3, Informative)

      by jandrese ( 485 )
      In fact they do [google.com]. It's not clear why Scroogle has such a hard on for screen scraping.
      • Really, I hate this stuff, because I am sometimes asked to write a scraper and it is bloody stupid. Because the moment something changes, anything changes, you have to check it and check it again. And you know, many sites change their layout all the time, if for no other reason than to fix bugs. To say nothing of seasonal changes.

        And then there is the legal side, and the ease with which to block you.

        So don't fucking scrape, especially with a well developed and documented API around. Really, scrapi

      • It's not clear why Scroogle has such a hard on for screen scraping.

        Perhaps because the API will probably block them after the first 10,000 hits and Google will ask them to contribute something back ?

  • Ah, Don't be evil? (Score:5, Insightful)

    by GNUALMAFUERTE ( 697061 ) <almafuerte@gmai[ ]om ['l.c' in gap]> on Tuesday May 11, 2010 @11:29AM (#32169854)

    They are being Evil. They have a perpetual obligation to keep every single feature in a time-freeze so that third parties can use them as they see fit!

    Ah, wait, no they don't.

    There is an assload of meta-search engines out there. Scroogle seems to be the only one that has been affected. That's because they were saving bandwidth, processor usage, and programmer's time by using the same fucking simple interface for the last 5 years. So, they've been using an old interface that existed for the SOLE PURPOSE of being compatible with shitty old IE versions .... now that google pulls it out, they bitch about it? Come on ...

    Here is what I hate: Everyone is complaining about the privacy concerns with many services, but nobody stops using them! Everyone feels they have the right for every service to work they way they want it to. Guess what, you don't. You don't like google? Stop using it!. I don't like microsoft. I Don't like anything from them. So, I don't use ANYTHING FROM THEM. Not their software, nor their services, nothing. On the other hand, we have people cracking their software and complaining when they are evil. They ARE evil? stay the hell away from it.

    I'm really tired of this privacy-concerns constant circle-jerking. Stop using the shit you don't like. Simple, huh?

    • They are being Evil. They have a perpetual obligation to keep every single feature in a time-freeze so that third parties can use them as they see fit! Ah, wait, no they don't.

      Wait a minute. This is Slashdot, and the collective hive-mind knows that Microsoft does "have a perpetual obligation to keep every single feature in a time-freeze so that third parties can use them as they see fit". So does Oracle, Apple, HP, or any $big_succesful_company. Why doesn't Google?

    • Stop using the shit you don't like. Simple, huh?

      I do that sometimes, but if I followed your advice to the letter I wouldn't be able to do much at all. It's hard to find big corporations that are truly benign.

    • by sorak ( 246725 )

      Can you write that in an XML format so I can autopost it to my facebook, myspace, and twitter pages? </ducks>

  • Is this the same company that started anonymizing search logs sooner and refused to hand over search data to the US?

    Is there a reason why you NEED a more anonymous search engine? And can you trust the other party you're going through isn't logging your search inquiries?

    Ultimately it comes down to who you trust more. I just don't understand why no one trusts Google when they have the cleanest track record out there.

    • I just don't understand why no one trusts Google when they have the cleanest track record out there.

      This has nothing to do with trust and also has nothing to do with paranoia or wanting to hide something, as is sometimes suggested (not by you, to be fair). Sure, Google has a clean track record but the mere potential for abuse also counts, not just how likely you consider it or whether it has happened in the past. The more data a company has the less pleasant are the consequences when that data is abused. Google is one of the largest or perhaps even the largest data collectors in the world. (I guess the NS

  • Others have already stated this is not actually a block, but if it had been ...

    I wonder if a distributed proxy would work. Run a client on your computer that puts you into a pool. Point your browser to localhost web server where it provides a search interface and submit your query. The client randomly picks another host in the network where your request is carried out and returns your results.

    After a couple hundred thousand users go online the amount of mixed requests muddles the data so much that it's all

    • Re: (Score:3, Informative)

      by gorzek ( 647352 )

      That's pretty much what Tor does, only it can be used for any kind of traffic, not just searching.

  • by valadaar ( 1667093 ) on Tuesday May 11, 2010 @11:34AM (#32169934)
    No other comment - this is simply factually wrong. Let me know when Scroogle can't even resolve Google servers, then they are truly blocked.
  • by NotQuiteReal ( 608241 ) on Tuesday May 11, 2010 @11:34AM (#32169938) Journal
    I don't get it - why "scrape" at all? Google has a real search API, do they not?
  • by jtcampbell ( 199660 ) on Tuesday May 11, 2010 @11:44AM (#32170082) Homepage

    ......by using a different search engine.

    Oh wait - you weren't generating any revenue for them and were actually costing them bandwidth.

    That will really show them!

  • Why are they not using the actual REST API provided here [google.com]?

    Sounds like laziness to me, and that they are blaming google for their own shortcomings.

  • The AJAX in the name didn't tip you off? If you put a Google search box on your own webpage then searches done from that box still go directly to Google's machines and there is NO anonymizer.

    • Re: (Score:3, Insightful)

      by clone53421 ( 1310749 )

      If the connection was performed by sockets between Scroogle’s own servers and Google’s (which is what they were doing with their SSL searches to screen-scrape the results from the old /ie interface previously) it would be the same level of anonymity as before. AJAX is just a Javascript interface to open sockets and make HTTP/HTTPS requests.

      It’s just a matter of server side vs. client side. The primary reason that an AJAX search is done by your browser rather than your own webpage is becaus

    • Scroogle could use the api on their servers. Really, do you know so little about the web that you think AJAX can't be run server to server? Hell, you don't even need javascript for it, any scripting language can do it if it can do html requests.

    • Comment removed based on user account deletion
    • Well, except for the fact that the user submits the query to the anonymizing proxy, and the proxy (using the API) submits the query and returns the response. Which is all the webscraper approach ever did.

      Maybe you're getting confused by the use of AJAX? Think anonymizing server running server-side Javascript, not client browser directly executing Javascript.

  • Google made some changes to the results that broke Scroogle.

    And no, I don't think they intentionally did it.
    And yes, I like the new search returns,
    and no I don't work for Google,
    and yes there are other ways to do exactly what Scroogle was doing.
    and no I don't like these kinds of lists,
    and yes I'll stop now.

  • by Animats ( 122034 ) on Tuesday May 11, 2010 @12:17PM (#32170628) Homepage

    Google once had a real search API. [google.com] It was SOAP-based. But they discontinued it years ago.

    Google's AJAX search API [google.com] is, by design, very limited. All you can really do is create a little search widget, and perhaps add some fields of your own. The terms [google.com] prohibit doing much beyond that. "You are allowed to use the API only to display, and to make such uses as are necessary for You to display, Google Search Results on your Property. The API does not provide You with the ability to access, and You are not allowed to access, other underlying Google Services or data. Subject to the limitations and conditions described below, " ... "You agree that You will not, and You will not permit your users or other third parties to: (a) modify or replace the text, images, or other content of the Google Search Results, including by (i) changing the order in which the Google Search Results appear, (ii) intermixing Search Results from sources other than Google, or (iii) intermixing other content such that it appears to be part of the Google Search Results; or (b) modify, replace, obscure, or otherwise hinder the functioning of links to Google or third party websites provided in the Google Search Results. " Given those restrictions, you can't write Scroogle using that API.

    We have a SiteTruth search page which uses the Google AJAX API. [slashdot.org] We're prohibited from re-ordering the entries or removing any of them. Since the whole point of SiteTruth is to re-order search results by business legitimacy, and we don't do that for the Google results, the Google results are inferior to the ones from other search engines. So our primary search page uses Yahoo/Bing. [sitetruth.com]

    • So... if somebody performed a search that would return illegal content (child porn, say), the Google search API terms would prohibit you from removing those listings even if you wanted to try to detect and remove such things from the results...

      Sounds almost like Google is endorsing the content they provide, to me... like they just guaranteed that none of their results would contain illegal images. Because if they did, and if you knew this, then you would be lawfully required to filter them out (based on you

  • There are only lazy coders.

    Google is under no obligation to spend effort making it easy to use their site in a way not intended by them - particularly since Google provides an actual API that does not need any scraping.

    It's like reading the newspaper over someone's shoulder in the train, and then complaining that they turn the page too fast to keep up.

  • I use Google but I don't like the new sidebar.
    There doesn't seem to be a way to remove it.
    It doesn't appear on a browser from 2001 but does from 2005.
    There aren't even any ads.
    Can anyone help me remove the sidebar and get it back to the "cleaner" appearance, even with ads?
    Without updating to a new browser, or using Google toolbar or something.

    There may be privacy issues but as now I am not worrying about them.
    I probably should but I don't.
    I do like the preference keeping like the 100 items per page though.
    I
