How Hackers Listened Their Way Around Google's Recaptcha
An anonymous reader writes with this story at Ars Technica: "Three self-taught hackers from the DC949 hacker collective managed to use a combination of techniques to beat ReCaptcha with 99.1% accuracy (better than most humans!)" In short, the hackers skipped the visual part of the Recaptcha system entirely, focusing on the audio alternative, which gave them a few convenient angles of attack. Google responded with changes to the system, but that doesn't minimize their accomplishment.
First! (Score:1, Funny)
Oh yeah! Not even a recaptcha to worry about!
How far behind were the criminals/spammers? (Score:2)
Re:How far behind were the criminals/spammers? (Score:5, Interesting)
Quote summary:
Google responded with changes to the system, but that doesn't minimize their accomplishment.
On the contrary, yes it does minimize their accomplishment. It makes it all for nothing, a technical exercise, with no near term or long term payback.
Recaptcha is a huge con, no more secure than the original captcha. The second (or first) portion is there only to serve some other purpose, and any answer will do.
Adding the audio option (probably forced by ADA) did nothing for security. At best this demonstrates that adding multiple different keys to the same lock makes things worse, not better.
Captcha's original intent was to slow down bots, by making the user prove they were human. They are seldom used to protect anything of value, simply to keep the nuisance bots to a dull roar.
Now it appears that machines can beat captcha and recaptcha very easily. So WHY do we still see these schemes in use?
Re:How far behind were the criminals/spammers? (Score:5, Insightful)
Because even a very "high" accuracy machine system is still going to add a significant barrier to automatically cracking the results, especially if Google continues altering reCAPTCHA like they do. While you won't eliminate 100% of attackers, you can eliminate the vast majority, and slow down the attackers that do get through. The alternative is to use nothing, and believe me: you absolutely do not want that. The Internet would be 99.99999999% spam almost overnight if that happened.
Re: (Score:3)
intelligence on /. bravo dear sir...
Re:How far behind were the criminals/spammers? (Score:5, Informative)
Re:How far behind were the criminals/spammers?
At about 75%, from what I read on the black hat forums.
There's a whole social spam ecosystem out there now, with tools and services for spamming Facebook, Twitter, Instagram, Google+, Yelp, Tumblr, Youtube, random blogs, and for retro types, Myspace. It's not just a few people doing this. It's an industry with a supply chain. Read my "Social is bad for search, and search is bad for social" [sitetruth.com] paper for an overview. If it feeds into Google search rankings, it's being spammed.
Re:How far behind were the criminals/spammers? (Score:5, Insightful)
Now it appears that machines can beat captcha and recaptcha very easily. So WHY do we still see these schemes in use?
Could you give me your address, and let me know when you won't be home? (I presume you no longer lock your house.)
Re:How far behind were the criminals/spammers? (Score:5, Interesting)
On the contrary, yes it does minimize their accomplishment. It makes it all for nothing, a technical exercise, with no near term or long term payback. Recaptcha is a huge con, no more secure than the original captcha. The second (or first) portion is there only to serve some other purpose, and any answer will do.
It's funny that you'd complain about a waste of effort and then bemoan Recaptcha, which was developed to prevent all those man-years of solving CAPTCHAs from going to waste.
BTW, the founder of Recaptcha has expressed that he will be happy when it can be defeated trivially because at that point the other job it's trying to do can be completely automated, which is still a win.
Re: (Score:1)
Not if the trivial defeat simply consists of solving the "easy" word and filling in junk for the hard one. Which is what a fair number of humans do.
Re: (Score:2)
Why do you assume that it comes just as Google makes changes to the system? Are you positive the change to the system did not stem from them reporting this to Google and then, following safe disclosure practices, giving Google time to fix it before going public? Are you sure they didn't do all this, then report it to Google and collect a "reward" for what they found?
Re: (Score:1)
Moderation
50% redundant
50% funny
I wonder how the first post can be redundant.
Weakest Link (Score:3)
Re:Weakest Link (Score:5, Funny)
If they can solve captchas at 99% accuracy, I hope they develop a browser toolbar or plugin I can use.
Re: (Score:1)
Too late, Google took steps to fix it before the exploit was widely announced, according to TFA.
Re: (Score:2)
Too late, Google took steps to fix it before the exploit was widely announced, according to TFA.
The real spammers who most likely had this figured out 6 months ago are probably slightly annoyed.
Re: (Score:1)
For a handful of sites... but it doesn't have a decrypter plugin for recaptcha.
Re: (Score:2)
Yes... very wise...
Re: (Score:3)
Audio ReCaptcha is the Weakest Link! Goodbye!
Singularity (Score:4, Insightful)
Since they beat the Turing Test, this means we've reached the AI singularity... right?
Re: (Score:3)
"More human than human." It just means the Tyrell Corporation was working on it.
Re:Singularity (Score:4, Interesting)
You bring to mind something I read long ago, too long ago for a citation. A researcher was running a Turing test, with one subject seeing if he could decide which terminal had a computer on the other end and which had a human.
The tester just sat there without inputting anything. Pretty soon a message came up on one screen: "Is there anybody there?"
"That's the human," the tester said.
Re: (Score:2)
Weird thing is, I actually work on a product called 'Skynet'. It's a website used to keep track of vehicle fleets.
It's not self-aware yet, but I'll be the first to warn you when it is :)
Snake meet tail (Score:5, Insightful)
I realized there's an interesting aspect to this, in that gVoice transcription is actively trying to do basically the same thing these guys did* (albeit in a far more general way). Wonder how gVoice would do transcribing google's own recaptcha audio. Someone go try that. Either way though, it's an interesting dilemma if they ever got automatic transcription good enough to defeat these audio recaptchas.
* Well, after RTFA, I realize that a fair bit of what they did was actually more related to hashing (and the pseudo-random generator) vs actually trying to parse the audio, but still.
Re: (Score:2)
Having seen lots of Google Voice transcriptions, I'm pretty sure it couldn't transcribe its way through the most articulate of all audio captchas. Years of training and it's only gotten worse.
Re: (Score:2)
I use it on Android to send about 200 texts a month. Once you learn to speak naturally instead of over-enunciating everything, it does quite well. I suspect a big part of the issue with voicemail transcriptions has to do with audio compression on cell phones.
Re: (Score:2)
I'm sorry, you can register a Google Voice number as well if you like. I send about half of those texts from my non-cellular Android tablet at home via wifi. Get with the times and liberate your phone number from your cell carrier.
Thanks.
Re: (Score:1)
I watched the video (hilarious, btw). Someone in the audience asked if they had tried Google's own speech recognition. They had, and it couldn't solve the audio captcha.
Re: (Score:2)
I did that three years ago. All my posts are by bots.
2
3
5
Re: (Score:2)
Wonder how gVoice would do transcribing google's own recaptcha audio. Someone go try that. Either way though, it's an interesting dilemma if they ever got automatic transcription good enough to defeat these audio recaptchas.
* Well, after RTFA, I realize that a fair bit of what they did was actually more related to hashing (and the pseudo-random generator) vs actually trying to parse the audio, but still.
In their presentation that question was raised, and they said that using gVoice was the first thing they tried, with no luck.
Another solution.. (Score:5, Informative)
Most of the spammers who circumvent captchas use real people to fill in their captchas for them. How they do it:
1) A pay-per-filled-in-captcha site (where members solve captchas, not really getting paid even though they think they will be) OR a high traffic site (false/scam sites, hacked sites, etc)
2) Mirror the image from the site you want to spam to your own site
3) A person visits your own site with the mirrored image and solves the captcha
4) Mirror the answer back to the site you want to spam
5) ???
6) Profit! (literally)
Re:Another solution.. (Score:5, Insightful)
Reminds me of the story of the guy who would play 8 games of chess simultaneously in an octagon and absolutely guarantee he'd win 50% of the games at least.
He then proceeded to play the moves of the players opposite each other against each other.
Re: (Score:2)
55% of professional chess matches end in draws. 45% to the power of 4 is 0.17%.
If he had claimed "he would lose less than 50% of the games" then he would be correct, but that sounds a lot less impressive.
Re: (Score:1)
Does this guy take bets and where can I find him?
55% of professional chess matches end in draws. 45% to the power of 4 is 0.17%.
If he had claimed "he would lose less than 50% of the games" then he would be correct, but that sounds a lot less impressive.
Sorry, I misspoke. I'm certain the wager was that he would not lose more than half the games, or perhaps that a draw would result in a rematch.
Re: (Score:2)
Also, what you calculated was the probability of not drawing in 4 consecutive games, not 4 out of 8. There are the same number of ways to lose 4 out of 8 as there are to win 4 out of 8. Thus, there is a 50-50 chance of winning or losing. Therefore, it doesn't matter if we're talking about 4 out of 8, or just 1 game. Based on your statistic, the probability of winning or losing 4 out of 8 is 45%, not 0.17%.
Re: (Score:2)
Second of all, there are only 4 chess games going on. I don't know where you got the number "8" from.
Ostensibly, the con-artist claims "I'll play 8 chess games against 8 players simultaneously."
What's actually happening is that he's using the moves of A against B, C against D, E against F, and G against H. Thus there are only 4 chess games going on.
Out of 4 chess games, there are precisely 5 possible outcomes:
4 winners: 45%^4 * 55%^0 * 4 choose 4 = 4.1% (I accident
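For what it's worth, the arithmetic in this subthread can be checked with a short sketch. It takes the 55% draw rate quoted above as given, treats the four relayed games as independent, and assumes the con artist only relays moves, so every decisive game hands him exactly one win and one loss across his eight boards. The name `decisive_game_distribution` is made up for illustration.

```python
from math import comb

DRAW_RATE = 0.55               # figure quoted in the thread, taken as given
DECISIVE_RATE = 1 - DRAW_RATE  # chance that a single real game has a winner
REAL_GAMES = 4                 # 8 boards relayed in pairs = 4 actual games


def decisive_game_distribution(n=REAL_GAMES, p=DECISIVE_RATE):
    """Binomial probability that exactly k of the n real games are decisive."""
    return {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}


dist = decisive_game_distribution()
for k, prob in dist.items():
    # Each decisive game is one win and one loss for the relayer;
    # each drawn game is a draw on both of its boards.
    print(f"{k} wins, {k} losses, {2 * (REAL_GAMES - k)} draws: {prob:.1%}")

# Winning at least half of the 8 boards needs all 4 real games to be decisive.
p_win_half = dist[REAL_GAMES]
# Losing more than half would need more than 4 decisive games, which cannot happen.
p_lose_more_than_half = sum(prob for k, prob in dist.items() if k > REAL_GAMES)
```

Under those assumptions the "win at least 50%" bet pays off only about 4.1% of the time, while the "won't lose more than half" bet cannot fail.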
Re: (Score:2)
Actually, you're totally right about the 4 games being all that matters. (assuming he doesn't alter the moves, and just plays them off each other).
For some reason I automatically assumed that the question was only talking about games that were either won or lost.
But, let's assume you're right. Then the con-artist only won his bet 4.1% of the time. This is not a very good con if the con-artist loses
Re: (Score:2)
absolutely guarantee he'd win 50% of the games at least
"he wouldn't lose at least 50% of the games" would be more accurate (draws)
Sounds like bull (Score:2)
That would work for an opening move but the whole point of chess is that there are many opening moves and with each additional move the possible moves explode until you need a very special sort of mind or a big computer (IBM big, not your pitiful 6 core big) to sort it all out.
How would your guy make sure the moves of the opposite player have any bearing on the moves on the other board? It would be like playing blackjack by copying what the guy next to you does. SMART, if by some miracle you had the same ca
Re: (Score:2)
You should read some of your sibling comments (hell, there was a video clearly explaining this). What GP would do is play off each other player. To be more specific, he would play black for 4 games and white for 4 (this is the usual setup for playing multiple games simultaneously, in case you did not know). He would see the move the white player makes, not respond to him. Move on to the next board, make the same move on this board. Observe the response, and remember it, so that he can play it in the previous
Collective? (Score:1)
"Better than most humans" (Score:5, Funny)
That's it! Make all users do a SERIES of incredibly hard recaptchas. Those who get too many correct are machines! Brilliant!
Re:"Better than most humans" (Score:5, Interesting)
...especially if they solve them in less time than the duration of the audio. (Only half kidding: They solved millions of eight second long captchas in a second and a half each and Recaptcha didn't even blink.)
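The half-joke above does suggest a cheap server-side check: refuse any answer that arrives sooner than a human could have listened to the clip. A minimal sketch, assuming the server records when it served each challenge; `AudioChallenge`, `is_plausibly_human`, and `min_fraction` are hypothetical names, and nothing here reflects how reCAPTCHA actually works.

```python
from dataclasses import dataclass


@dataclass
class AudioChallenge:
    issued_at: float      # server timestamp (seconds) when the clip was served
    audio_seconds: float  # length of the audio clip


def is_plausibly_human(challenge: AudioChallenge, answered_at: float,
                       min_fraction: float = 1.0) -> bool:
    """Return False when the answer arrived before the clip could be heard.

    min_fraction is an assumed tuning knob: 1.0 demands the full clip length.
    """
    elapsed = answered_at - challenge.issued_at
    return elapsed >= challenge.audio_seconds * min_fraction


challenge = AudioChallenge(issued_at=100.0, audio_seconds=8.0)
bot_like = is_plausibly_human(challenge, answered_at=101.5)    # 1.5 s after issue
human_like = is_plausibly_human(challenge, answered_at=112.0)  # 12 s after issue
```

A bot can of course just wait out the clip, so a check like this would only cap throughput per challenge, not stop the attack.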
Re: (Score:2)
...especially if they solve them in less time than the duration of the audio. (Only half kidding: They solved millions of eight second long captchas in a second and a half each and Recaptcha didn't even blink.)
or maybe it did blink and that's what tipped off Google to change the system?
Re: (Score:2)
I think the captcha on Coding Horror used to always be "orange". I don't know how much time Atwood spent deleting spam, but I certainly never saw any (besides his own).
Gone too far... (Score:4, Interesting)
Re: (Score:1)
I apologize that I'm an Anonymous Coward here - too lazy to log in (copb.phoenix) - but there is a better solution.
Machines are not too good at following natural language, so rather than a captcha, a problem written in natural language would - in theory - work best.
Something clear enough to a human eye, but not too obvious mechanically. One of the best ones I ever saw was not labelled at all, other than "signincheck" on the form and said "tob0rAtONm@i in the reversed proper English, please?"
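As a rough illustration of how thin that particular puzzle is once someone has read the instruction, the mechanical part takes a few lines. This sketch assumes the only obfuscation is reversal plus two character swaps; the `LEET` table and `decode` helper are hypothetical, and splitting the result back into words is left out.

```python
# Assumed substitutions; a real puzzle could use others.
LEET = str.maketrans({"@": "a", "0": "o"})


def decode(puzzle: str) -> str:
    """Reverse the string, undo the assumed swaps, and lowercase it."""
    return puzzle[::-1].translate(LEET).lower()


decoded = decode("tob0rAtONm@i")
print(decoded)
```

The hard part for a machine is understanding the free-form instruction, not carrying it out.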
Re: (Score:2)
Even that might not work in the long run. IBM Watson gets better every day. It's already good enough to be a chatbot, and it wasn't even designed to do that. I think Watson might be nearing AI-complete for natural language. Just give it a couple of years and see what else comes up.
yawn (Score:3)
It EXACTLY minimizes their accomplishment. Everyone knew that the day that was easily exploited, Google would get a little less accessible to the disabled. Everyone knew it was the weakest attack point. (jerks!)
Better rate than me (Score:2)
They get harder, and these days I'm four for five at best.
Maybe I'm just a machine dreaming I'm human?
I'd like to find out how to break it too (Score:3)
Google's captchas are the worst I've ever seen. They're almost always unreadable and need to be refreshed all the time. I like Recaptcha (which isn't what Google uses on their sites despite owning it), they're generally pretty clear and in addition provide a free service to anyone that wants to use it. I have no clue why Google sticks with their awful in-house captchas for Gmail, Youtube, etc.
I gave up on Recaptcha and now use AreYouAHuman (Score:1)
Someone recently brought "AreYouAHuman" and its "PlayThru" security test to my attention.
http://areyouahuman.com/
I've been using Recaptcha on a niche website I operate for a couple years now, and people have been increasingly complaining about how hard it's getting. While it's English-only right now, PlayThru is very easy to complete, sorta fun, and best of all it tells you whether you got it right before you submit the form, so there's no hoping or guessing. So after a few quick tests, and users raving abo
Re: (Score:3)
Ah but click on the "accessible" option and lookie lookie, an mp3 audio file with gibberish and a background voice. "enter the words you hear".
So this exploit would at least prevent using that option.
The game concept is pretty good though, they just need to make an accessible version.
Re: (Score:2)
Funny you should mention areyouhuman.com. It actually relies on recaptcha for accessibility. You would have been vulnerable to the attack TFA talks about too.
I bet Siri could solve it. (Score:4, Insightful)
I bet Siri could solve it.
All the voice tools out there could be harnessed to this sad end.
Don't know why it never occurred to me... (Score:1)
...to use the audio version instead of the text version for those damn things. I bet the audio version doesn't have words that show up with weird non-alphanumeric characters or completely inked-out text in them, like a nontrivial percentage of the recaptchas I see seem to have.
Envy (Score:1)
Rather a neat way to make an employment application.
they managed to correctly answer audio captcha? (Score:4, Funny)
I did once get an audio captcha that was almost solvable -- AFAICT, it was a conversation between Cthulhu in his native tongue and Tom Waits responding in Aramaic, recorded in a crowded airport terminal that had lots of loudspeaker announcements.
Only 58 words to crack (Score:2)
reCAPTCHA was also undermined by its use of just 58 unique words
I'm really surprised the corpus was so small. I would have expected it to be on the order of thousands.
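To put the corpus size in perspective, here is a back-of-the-envelope sketch of how vocabulary size changes the odds of a blind guess. The six-word challenge length is an assumption for illustration (the quote does not say how many words a challenge contains), and `blind_guess_odds` is a made-up helper. Blind guessing is not what the attack did; the point is only how much a small vocabulary shrinks the space a recognizer has to tell apart.

```python
def blind_guess_odds(vocab_size: int, words_per_challenge: int) -> float:
    """Chance that uniformly random words match every word in a challenge."""
    return 1 / vocab_size ** words_per_challenge


WORDS = 6  # assumed challenge length, not taken from the article

for vocab in (58, 1_000, 10_000):
    odds = blind_guess_odds(vocab, WORDS)
    print(f"{vocab:>6} words: 1 in {vocab ** WORDS:,} ({odds:.2e})")
```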
Attention whores claim 99.1% accuracy (Score:2)
New idea! (Score:2)
CAPTCHA alternative (Score:2)
The basic idea behind CFFormProtect is that spam protection shouldn't involve annoying hurdles that users have to jump over, and should be as invisible as possible to the user. It takes what I would say is a similar approach to SpamAssassin, in that it uses multiple heuristic methods to rank form postings for potential spamminess. I've used it extens
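The multi-heuristic idea can be sketched roughly as follows. This is not CFFormProtect's code or its actual test list; the field names, weights, and threshold are assumptions chosen only to show how several individually weak signals combine into one score.

```python
import re
from dataclasses import dataclass


@dataclass
class FormPost:
    body: str
    honeypot: str             # hidden field a human never sees or fills in
    seconds_to_submit: float  # time between serving the form and the post


def spam_score(post: FormPost) -> int:
    """Add up points from independent, individually weak heuristics."""
    score = 0
    if post.honeypot:                                 # bots tend to fill every field
        score += 3
    if post.seconds_to_submit < 3:                    # submitted implausibly fast
        score += 2
    if len(re.findall(r"https?://", post.body)) > 2:  # link-heavy body
        score += 2
    return score


SPAM_THRESHOLD = 3  # assumed cutoff

human = FormPost("Nice write-up, thanks!", honeypot="", seconds_to_submit=40)
bot = FormPost("http://a.example http://b.example http://c.example",
               honeypot="x", seconds_to_submit=0.4)

for name, post in (("human", human), ("bot", bot)):
    verdict = "spam" if spam_score(post) >= SPAM_THRESHOLD else "ok"
    print(name, spam_score(post), verdict)
```

Because no single test decides the outcome, a legitimate user who trips one heuristic (say, a fast typist) still gets through without ever seeing a challenge.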
These hackers should be awarded. (Score:1)
Well done, sirs.