Researchers Break Video CAPTCHAs
Orome1 writes "After creating the 'Decaptcha' software to solve audio CAPTCHAs, Stanford University's researchers modified it and turned it against text and, quite recently, video CAPTCHAs with considerable success. Video CAPTCHAs have been touted by their developer, NuCaptcha, as the best and most secure method of spotting bots trying to pass themselves off as human users. Unfortunately for the company, researchers have managed to prove that over 90 percent of the company's video CAPTCHAs can be decoded by using their Decaptcha software in conjunction with optical flow algorithms created by researchers in the computer vision field of study."
the technology race (Score:3, Insightful)
Commies vs West
MPAA vs sharers
coders vs decoders (that includes captcha vs decaptcha)
It's fun to observe it when government does not interfere.
Re: (Score:2)
https://en.wikipedia.org/wiki/Cold_war [wikipedia.org]
https://en.wikipedia.org/wiki/SOPA [wikipedia.org]
http://www.wired.com/threatlevel/2010/07/ticketmaster/ [wired.com]
Re: (Score:2)
Cold war did not have a common sovereign on both sides, so the technology developed unhindered.
Before SOPA (that is now) government does not interfere in technology itself - torrents are still legal.
Government always interferes, catching what it thinks is illegal use of technology. What I meant is an outright ban on technology.
Re: (Score:1)
Aren't all CAPTCHAs doomed to fail eventually? (Score:5, Informative)
"Anything a computer can generate it can understand."
This is why chat bots still suck. Computers cannot generate context.
Re: (Score:2)
"Anything a computer can generate it can understand."
Well, that's beside the point, isn't it? A computer can generate and understand hashes, but that does not mean they are easily breakable.
You just need to make the decoding much harder than the encoding. There must still be computational areas in the visual domain where we humans are way more efficient.
Re: (Score:3)
There must still be computational areas in the visual domain where we humans are way more efficient.
Even if that is the case, there is still a relatively straightforward attack on captchas: the mafia porn site. It is generally easier to use a mechanical turk to decode captchas than to attack captchas algorithmically.
Re: (Score:1)
Link please?
Re: (Score:1)
Re: (Score:1)
I've always thought that going with a higher level thinking would be harder to break. Instead of copying letters from an image you have to identify a set of images that is easy for a person but more difficult for a computer. Think children's picture book type deal. Can a computer reliably tell a dog from a cat from a cow?
Re: (Score:3)
I've always thought that going with a higher level thinking would be harder to break. Instead of copying letters from an image you have to identify a set of images that is easy for a person but more difficult for a computer. Think children's picture book type deal. Can a computer reliably tell a dog from a cat from a cow?
I think that's a pretty good thought. I'd extend it with perhaps one of those, "which of these things doesn't belong" type of setups (which may have been what you meant). It could then show pictures of a banana, an apple, an orange, some grapes, and a baseball hat. I don't know, perhaps there is a way to solve these easily by computer. But I know the stupid text CAPTCHAs that I had to go through yesterday to sign up for one site were so "obfuscated" that I couldn't read them either and I had to click the
Re: (Score:3)
I know I've seen this idea before. I wonder why I've never actually seen it implemented anywhere. It seems pretty easy to do, too. Collect images (either drawings or pictures), and assign tags. For example, an apple might have the tags 'apple', 'fruit', 'food', and 'red'. Then when the system generates a captcha, it picks a random tag in its database, and finds 4 images with that tag, and 1 without. The user should be able to pick out which image isn't a 'fruit' or 'red'.
Users could even be used for ass
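A minimal sketch of the tag-based scheme described above, in Python. The image names, tags, and function names are all hypothetical, made up for illustration. A real deployment would need a far larger image pool so that answers cannot simply be recorded and replayed.

```python
import random

# Hypothetical image database: file name -> set of tags (illustrative only).
IMAGES = {
    "apple.png": {"apple", "fruit", "food", "red"},
    "banana.png": {"banana", "fruit", "food", "yellow"},
    "orange.png": {"orange", "fruit", "food"},
    "grapes.png": {"grapes", "fruit", "food", "purple"},
    "cherry.png": {"cherry", "fruit", "food", "red"},
    "firetruck.png": {"vehicle", "red"},
    "baseball_hat.png": {"clothing", "hat"},
    "dog.png": {"animal", "dog"},
}


def make_challenge(images, rng, n_matching=4):
    """Pick a random tag, then n_matching images with it and one image without it."""
    all_tags = sorted({tag for tags in images.values() for tag in tags})
    # A tag is usable only if enough images carry it and at least one does not.
    usable = [
        tag for tag in all_tags
        if sum(tag in tags for tags in images.values()) >= n_matching
        and any(tag not in tags for tags in images.values())
    ]
    tag = rng.choice(usable)
    with_tag = sorted(name for name, tags in images.items() if tag in tags)
    without_tag = sorted(name for name, tags in images.items() if tag not in tags)
    odd_one = rng.choice(without_tag)
    shown = rng.sample(with_tag, n_matching) + [odd_one]
    rng.shuffle(shown)  # hide the odd image at a random position
    return tag, shown, odd_one


def check_answer(odd_one, clicked):
    """The user passes if they clicked the image that lacks the chosen tag."""
    return clicked == odd_one
```

The server would keep `odd_one` in the session and show only the five shuffled images; whether to reveal the tag to the user is a design choice this sketch leaves open.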
Re: (Score:2)
Re: (Score:1)
Maybe not but Apu sure can and he'll do it for peanuts.
Re: (Score:3)
There must still be computational areas in the visual domain where we humans are way more efficient.
On your left, you will see 21st century purely organic brains. Their limited capacity neural networks had not yet been mechano-electrically enhanced with additional storage, high speed neuronal interconnects, broad EM spectrum sight, or even simple wireless intercourse, or "telepathy" as the luddites of the past initially called it.
On your right, you will see the first machine intelligence construct which exceeded human levels of complexity. Not to worry, the intelligence that once inhabited this form
A Comic Strip Proposal (Score:1)
You could have panels of comic strips that combine visual information to provide a context, and a partially filled word bubble with one very obvious missing word.
By combining visual cues to provide context for the missing word, you could at least make it harder for algorithms, although the Underpaid Indian People attack still works.
Re: (Score:2)
Re: (Score:1)
mod parent up! great points..
Re: (Score:2)
Computers cannot generate context.
They're getting better at it.
http://www.wolframalpha.com/ [wolframalpha.com]
Re: (Score:2)
"Anything a computer can generate it can understand."
Thus explaining the prevalence of these:
https://en.wikipedia.org/wiki/Ciphertext_only_attack [wikipedia.org]
The problem is not creating things which are hard for computers to decode, it is creating things which are hard for computers to decode but easy for humans. That is why captchas will ultimately fail: they rely on the idea that there is something that human brains can understand which computers cannot decode, but which computers can still generate.
Re: (Score:1)
As soon as computers are as capable as people, captchas are no longer necessary. Then computers can directly detect and block unwanted behaviour. With the added advantage that it can block that behaviour even if real humans do it.
Re: (Score:2)
Re: (Score:1)
Re: (Score:3)
What does breaking CAPTCHAs really do that's so bad to society? Comment quality goes down due to spam? a ticket scalper buys up a bunch of tickets to an event o
Re: (Score:1)
They were joking... *facepalm*
...the classic problem... (Score:2)
..if your user can interact with it, they can screw with it. The nature of HTTP and the web is a stateless environment, one has to impress state onto it for things like secure transactions and sessions. Basically, you need to come up with a test that randomly checks to see if the input is coming from a person; all without breaking the experience of the web browser, or the web in general. It's an arms race, and things are even again; another advantage bites the dust.
Why bother (Score:5, Insightful)
CAPTCHAs are worthless against an army of Indians being paid just pennies a pop to break them. The only thing they do is annoy the script kiddies. Far better success would be had in doing pattern recognition on sign-ups instead.
Re:Why bother (Score:5, Insightful)
CAPTCHAs are worthless against an army of Indians being paid just pennies a pop to break them. The only thing they do is annoy the script kiddies.
No, they also annoy your actual, real human users. I often have to try three or four times to get the bloody thing right.
Re: (Score:2)
Words cannot express the rage I felt when I needed to register an XBox Live account to play a game I purchased because of the stupid G4WL DRM nonsense. I spent around 10 minutes on the bloody captcha because it differentiated capitals, lowercase, numbers, and symbols. It was the most absurd captcha system I've seen to date. Was it an O or a 0? Lowercase L or uppercase I? Was that a dollar sign or just some lines thrown in to distort the word further? An M or a W flipped on its side (was a 90 degree squiggle t
Re: (Score:2)
Secure sign-in with google or Facebook for a single player game and now we are tracked everywhere with all of our personal info attached.
Not even pennies (Score:2)
The going rate is around $1 per 1000 solved CAPTCHAs.
Good... (Score:4, Funny)
Re: (Score:2)
dompoli sprain?
That doesn't seem impossible, especially since only the first one will matter.
All CAPTCHAs eventually doomed (Score:2)
Re: (Score:2)
Multilevel marketing scam, anyone?
Breaking Captcha? (Score:2)
Re: (Score:2)
Don't these researchers ... (Score:3)
have anything else to do?
Sorry, had to say it.
Re: (Score:2)
Constructive (Score:4, Funny)
http://xkcd.com/810/ [xkcd.com]
At least something good could come out of captchas.
Re: (Score:1)
And after all that effort, the spammers will hire some Indians for pennies to figure them out. You'll be out lots of time and effort, they'll be out a couple dozen bucks.
Riddlers for niche sites (Score:2)
If you have a small-ish site that caters to a niche community where your target audience will share some knowledge that non-target folks don't have, a riddler where you can set the questions can work great. Just structure your questions in such a way that the answer is non-obvious in an automated way to all but the best AI engines.
For example, Phoronix could use a question like this --
Which of these is superfluous? Intel, ATI, NVIDIA, AMD
Re: (Score:2)
For example, Phoronix could use a question like this --
Which of these is superfluous? Intel, ATI, NVIDIA, AMD
And even that isn't as clear-cut as you might think. Most people probably think that ATI is superfluous, but if so, they're wrong.
If you say "ATI, nVidia and Intel", you don't need to mention AMD 'cause it's implied, thus AMD is superfluous.
If you make a question unambiguous enough, computers can answer it too. You can overwhelm a computer system by the sheer amount of ways to ask things, but then you need a human, who in the long run can't produce captchas as quickly as a computer can fail them.
Re: (Score:2)
It's just an illustration, but just as it can be hard for humans to decipher a captcha, it could be hard to understand the logic: Intel, AMD and NVIDIA are all companies, whereas ATI was actually purchased by AMD, which would make it superfluous.
If it were easy to answer, it would be easy for automation to crack it.
Re: (Score:1)
You need a human to generate these types of questions. That limits the number of them you can cheaply create.
Then a spammer gets a human to solve each of them once, records the answers and plays them back as needed.
You might as well pay for a live operator to verify each person. Of course, if you set that up, computers will pretend to be hearing impaired and demand access through TTY interfaces.
Turing test, anyone?
ReCAPTCHA needs to be retired (Score:5, Informative)
The CAPTCHA industry is not doing well.
ReCAPTCHA needs to be retired. OCR is getting too good. ReCAPTCHA, remember, is using images from book scanning, ones that the OCR system couldn't recognize. When ReCAPTCHA started, the text presented was usually an English word. Now, if the book scanning OCR system can't figure out something, it's probably not an English word. You're lucky if it's a sequence of characters found on an A-Z keyboard. People have reported ink blots, mathematical formulas, and Cyrillic.
Worse, ReCAPTCHA's idea of the "right" answer is crowdsourced. It's possible for bots to pollute the ReCAPTCHA database, by providing the same wrong answer more than once. You only have to get one of the words right, so if you can read one, a junk response for the other works. This goes into the database as a vote for the "right answer", to be presented to someone else later. I sometimes type "whatever" when one of the images is unreadable.
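To illustrate the pollution risk, here is a toy model in Python of a crowdsourced "right answer" decided by vote count. This is only a sketch under assumptions: the actual ReCAPTCHA scoring rules are not described in this thread, and the `min_votes` threshold, the function name, and the sample words are made up.

```python
from collections import Counter


def accepted_answer(votes, min_votes=3):
    """Return the most common transcription once it has min_votes, else None."""
    if not votes:
        return None
    answer, count = Counter(votes).most_common(1)[0]
    return answer if count >= min_votes else None


# Two honest users agree on the unknown word, which is not yet enough to accept it.
honest = ["morning", "morning"]

# A bot (or several lazy humans) submits the same junk answer three times.
polluted = honest + ["whatever"] * 3

print(accepted_answer(honest))
print(accepted_answer(polluted))
```

In this toy model, whoever repeats the same answer most often wins, which is exactly why a repeated junk response is more damaging than a random one.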
Re: (Score:1)
I sometimes type "whatever" when one of the images is unreadable.
You're missing an opportunity to add words to past texts. I always type "bunga-bunga". My hope is that someday in the far future, a scholar of historic literature will be scratching his head wondering why all these old books have the phrase bunga-bunga thrown in at random places.
Re: (Score:2)
I sometimes type "whatever" when one of the images is unreadable.
You're missing an opportunity to add words to past texts. I always type "bunga-bunga". My hope is that someday in the far future, a scholar of historic literature will be scratching his head wondering why all these old books have the phrase bunga-bunga thrown in at random places.
To screw things up?
Re: (Score:1)
Another reason I recently realized that recaptchas are useless: The whole idea is that one of the words could be read by a robot [spoiler]from the start[/spoiler] to be included in the rotation. Now, granted, they've modified the word to try and anti-robot it, but the fact remains that at some point it was readable; the other "word" never was. Thus it had a limited lifespan until the spambots caught up in OCR to Google's bots.
Re: (Score:2)
ReCAPTCHA needs to be retired.
Perhaps.
OCR is getting too good.
I've spoken with the founder of ReCAPTCHA about this when he came to campus for a talk several years ago. It's both the expected end game and seen as a victory ("we forced OCR to become usable with market pressures").
Don't worry, they have other puzzles in the queue that need machine comprehension models.
Re: (Score:2)
Worse, ReCAPTCHA's idea of the "right" answer is crowdsourced. It's possible for bots to pollute the ReCAPTCHA database, by providing the same wrong answer more than once. You only have to get one of the words right, so if you can read one, a junk response for the other works. This goes into the database as a vote for the "right answer", to be presented to someone else later. I sometimes type "whatever" when one of the images is unreadable.
Not just bots - humans can (unintentionally) do it as well. Sparkfun (an electronics hobbyist site) recently had a giveaway in order to stress test their servers. Several thousand people were solving CAPTCHAs as quickly as possible. There was a noticeable drop in the accuracy of the answers required, since a lot of people were taking shortcuts in entering them.
Re: (Score:1)
Because image recognition research is beneficial in many areas. Also, Captchas are mostly snake oil as there are tons of Indians willing to be paid next to nothing to break thousands and thousands of Captchas for the spammers anyway.
Re: (Score:1)
Because "focusing on Captchas" is dealing with image recognition directly? Besides, improving OCR to break Captchas directly helps improve the ability to OCR old and badly scanned works. Also, do you think that if these people did stop working on it that no one else will? Isn't it better for the good guys to be showing us the weaknesses rather than the bad guys exploiting it due to everyone being ignorant of the flaws? You can't fix flaws if you stop people from researching into them.
Multi-group CAPTCHAS (Score:2)
Still, it would only be a matter of time before the bots figured out how to track all the CAPTCHAs and thereby defeat it yet again.
Re: (Score:2)
Actually, the entire reason we have captcha is because the techniques you just listed don't work any more. Bots learned to run Javascript and ignore hidden fields years ago. Even if the bots could not do those things, it still wouldn't matter because whoever codes the routine to submit the form will pick up on those things. The best you can do is make it inconvenient enough that they will pick another target instead. But if you are Yahoo or Google or Wordpress, that won't deter them.
Re: (Score:2)
IIRC, eBay does all sorts of javascript loads and changes their HTML layouts commonly to reduce screen scrapers from crawling auctions. This cuts down on the problem, but people are still able to find a way to do it if they want it enough.
WHERE do I purchase (Score:1)
I NEED one of these captcha solver programs. When I try to register for a website or forum, many of them are so unreadable it takes me 20 minutes of trying to get it right, and there is NO PHONE NUMBER to call their technical support to register me by phone.
Charge CPU Time instead (Score:3, Interesting)
Re: (Score:3)
https://en.wikipedia.org/wiki/Hashcash [wikipedia.org]
So far this has not been widely successful, although perhaps it is because it targets the email system rather than the web (where things tend to change faster).
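For anyone who hasn't seen the idea, here is a hashcash-style sketch in Python. It is not the actual Hashcash stamp format: it assumes SHA-256 and a made-up "challenge:nonce" string, and the function names are hypothetical. The point is only that finding the nonce is expensive while checking it takes one hash.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force a nonce (about 2**difficulty hashes on average)."""
    for nonce in count():
        digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
        if leading_zero_bits(digest) >= difficulty:
            return nonce


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: a single hash is enough to check the work."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty
```

The server would issue a fresh random `challenge` per sign-up so that solved puzzles cannot be reused; each extra bit of `difficulty` roughly doubles the client's expected work.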
Easier said than done (Score:3)
What about charging 10-15 seconds of CPU time with some arbitrarily hard code?
A major obstacle to this is that you have to make the puzzle easy enough that your users on lower-end or mobile devices still have the necessary computation power to complete the puzzle in a reasonable time. Malicious organizations behind the spam will just put more hardware into their attack, typically by using the compromised machines in botnets. They'll also optimize the code, and parallelize the attack by performing the computation for multiple attempts on multiple CPU cores, while your code has to wo
Re: (Score:2)
They'll also optimize the code, and parallelize the attack by performing the computation for multiple attempts on multiple CPU cores
Then perhaps you should base the challenge on something from this class of problems:
https://en.wikipedia.org/wiki/P-complete [wikipedia.org]
Let's now imagine a perfect world in which you create a check that actually takes 15 seconds to complete. They can still do that 5,760 times per day.
The point of this proposal is not to stop spam entirely, but to keep the rate at which spam can be sent down to manageable levels. If a spammer can only send 5760 spam messages per day, that is a big improvement -- right now spammers are limited only by bandwidth, and can send tens of thousands of messages per day.
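The 5,760 figure is just the number of seconds in a day divided by 15. A quick Python check, with a hypothetical `machines` parameter added to show how the cap scales if the spammer throws a botnet at it (the 1,000-machine figure is an assumption, not a number from this thread):

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def attempts_per_day(seconds_per_puzzle, machines=1):
    """Upper bound on puzzles solved per day, assuming one puzzle at a time per machine."""
    return machines * SECONDS_PER_DAY // seconds_per_puzzle


print(attempts_per_day(15))                 # one machine: 5760
print(attempts_per_day(15, machines=1000))  # hypothetical 1,000-machine botnet: 5760000
```

So the limit is per machine, not per spammer, which is the objection raised in the parent comment.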
Great, now make this into an OCR program. (Score:1)
Low-tech is the new Hi-tech (Score:1)
Diversity and biological analogues (Score:3)
The key with CAPTCHAs is diversification, just like the key to avoiding disease in biological specimens is avoiding a monoculture. If there were 15000 different CAPTCHA methods, it wouldn't be profitable to create CAPTCHA tools that would each only work on some small subset. There are a lot of low population sites I use that check whether I'm a human with some unique set of hoops through which I must jump. The effectiveness of those hoops comes from the fact that they're often unique to that site, not a lump of code used by thousands of different sites. Diverse CAPTCHA breaking might require something like Watson, which isn't going to be available to spammy types in the near future.
Simple Solution: Porn (Score:3)
Have the captcha page display some really good porn video footage - drawn from a huge repository of suitable images (say, the rest of the internet). The clips are fairly long (say 3-5 mins or so). To pass the captcha the user merely has to click on a button at the right time. :P
So, if the user clicks right away, it's a bot. If there is a suitable pause (say 3-5 mins), then it's more likely human.
Just Wondering? (Score:1)
nowadays I refresh at least twice (Score:1)
finding a captcha is on the verge of proving that you ARE indeed a robot...
Whatever happened to cat captcha? (Score:2)
I always thought that was pretty secure because the machine couldn't tell which picture was a cat? What about combining video and cat captcha. 4 videos, one of which is a cat. But it could be a close video, or a zoomed out one where the cat is running around. A computer really shouldn't be able to decode that. Use a large enough database and they'll never solve it.
Re: (Score:2)
You'd need a lot of pictures and videos of cats. Good luck finding that on the internet!
Re: (Score:2)
If I have to start watching videos just to sign up for some forum or so, then the sign-up is probably just not going to happen. Your idea sounds several orders of magnitude more annoying than the already highly annoying captchas in use (with ReCAPTCHA at the top of the annoyances - most of them are simply unreadable).
Re: (Score:2)
Cat captcha has been around for a while, but it doesn't even have to be a video. It could be 4 animated gifs. A computer would have a hard time deciphering an animated gif of a cat running across a room. But for a human they're very easy. Easier than recaptcha.
Chat with reCaptcha Creator (Score:1)
Stanford University (Score:1)
deCaptcha as browser plugin (Score:2)
The best captcha on the planet: (Score:1)
Made to look like a captcha, with the text, "What is 2+3?" Spammers read the captcha and submit that back. Normal people type 5. "What site is this?" is another good one. Heck, you don't even need to make it look like a captcha; it's just funnier that way. One site I mod used to get dozens of spam threads a day, until a couple years ago they added a box to the end of their registration with one of a handful of questions like "what do seal clubbers club" (answer: seals), or "what is the first letter of the a
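A rough Python sketch of the question-box check described above. The question list and function name are hypothetical, and accepting "five" as well as "5" is an assumption rather than something the site in question is known to do.

```python
# Hypothetical site-specific questions mapped to the answers accepted for each.
QUESTIONS = {
    "What is 2+3?": {"5", "five"},
    "What do seal clubbers club?": {"seals"},
}


def is_human(question, submitted):
    """Accept only a known answer; a bot that echoes the displayed text fails."""
    answer = submitted.strip().lower()
    return answer in QUESTIONS.get(question, set())


print(is_human("What is 2+3?", "5"))             # a person answering the question
print(is_human("What is 2+3?", "What is 2+3?"))  # a bot OCRing the "captcha" text
```

This only holds up while the question set is unique to the site; a spammer who targets the site specifically can have a human answer each question once and replay the answers.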
Re: (Score:1)
Oblig XKCD.
http://xkcd.com/1019/ [xkcd.com]
Re:Can't stop... (Score:4, Insightful)
Re: (Score:2)
lol wut?