Breakthrough In Face Recognition Software 142
An anonymous reader writes: Face recognition software underwent a revolution in 2001 with the creation of the Viola-Jones algorithm. Now, the field looks set to dramatically improve once again: computer scientists from Stanford and Yahoo Labs have published a new, simple approach that can find faces turned at an angle and those that are partially blocked by something else. The researchers "capitalize on the advances made in recent years on a type of machine learning known as a deep convolutional neural network. The idea is to train a many-layered neural network using a vast database of annotated examples, in this case pictures of faces from many angles. To that end, Farfade and co created a database of 200,000 images that included faces at various angles and orientations and a further 20 million images without faces. They then trained their neural net in batches of 128 images over 50,000 iterations. ... What's more, their algorithm is significantly better at spotting faces when upside down, something other approaches haven't perfected."
Upside Down? (Score:5, Insightful)
"What's more, their algorithm is significantly better at spotting faces when upside down, something other approaches haven't perfected."
Add this step: Rotate the image and run the algorithm every x degrees. What am I missing?
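A minimal sketch of that brute-force idea in Python (note: detect_faces here is a hypothetical stand-in for any upright-only detector, not the paper's method):

```python
# Brute-force orientation handling: rotate the frame and re-run an
# upright-only detector until something fires. detect_faces is a
# hypothetical callable supplied by the user.

def rotate_90(img):
    """Rotate a 2D pixel grid (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def detect_any_orientation(img, detect_faces):
    """Try the detector at 0/90/180/270 degrees; return hits and the angle."""
    for quarter_turns in range(4):
        faces = detect_faces(img)
        if faces:
            return faces, quarter_turns * 90
        img = rotate_90(img)
    return [], None
```

The obvious cost: up to four full detector passes per frame, which is exactly the performance objection the replies raise.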
Re:Upside Down? (Score:5, Insightful)
Re: (Score:2)
I looked at the paper and did not see where "performance" was defined or any hint as to what units it has. How is performance calculated? I see they compare "precision" to "recall".
From the article: “We evaluated the proposed method with other deep learning based methods and showed that our method results in faster and more accurate results,”
Therefore, I'd say that it's doing more than just rotating the image and re-running the algorithm, but if you want to stick to doing that, that's cool.
Re: (Score:2)
a worst-case scenario of 4x worse performance...and/or battery life if working on a mobile.
Re:Upside Down? (Score:5, Interesting)
As someone who literally works on face detection/tracking software on low power ARMv7/8 CPUs, I can safely say you are dead wrong.
Assuming width==height (not likely given any current video formats or cameras), and assuming width%8 == 0 - it's a simple transposition of the rows and/or columns to do +/- 90/180 degrees, yes - and assuming you can fit your ENTIRE image in L1 cache you're going to incur minimal stalls (especially with an SoC that has a decent prefetch engine).
In reality:
* width != height
* width is however typically divisible by 8 so you can do pure NEON (not hybrid NEON + ALU/VFP) transpositions
* an 8bit grayscale VGA (640x480) image doesn't even fit in L1 cache, let alone a 720/1080p format (though most CV applications scale things down significantly, you tend to work at 320x180 - but that still doesn't fit in most L1 caches, although it does fit in 'some')
* L2 cache hits are dozens of cycles, L2 cache misses are HUNDREDS of cycles
* A real-world case of rotating a 320x180 image takes ~2ms on a 700MHz Cortex A9. That is not 'practically zero': it's 12% of your processing time at 60Hz, and 36% of your processing time if you're going to rotate 3 times.
(Note: using a 700MHz Cortex A9 as an example as that's typical of the automotive hardware systems we deal with, although the last 2 years have brought ~1-1.5GHz A15s into the mix - though most of those cars aren't even on the market yet)
Re: (Score:3, Informative)
'Performance' is indeed an ambiguous term: it can refer to accuracy (RMS error of the detection results and false positive/negative rates in most cases), and it can also refer to speed (which, as a programmer, I'm biased toward thinking of first).
I've never seen both meanings used in some combined metric. From an algorithmic perspective you tend to care only about accuracy as the 'performance' metric, and from a production perspective you (typically) care about speed.
On most ARM systems, you
Re: (Score:1)
Divide your image into blocks of 16x16 and all of sudden (local) transposition is much faster.
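For instance (a numpy sketch, assuming a square image whose side is a multiple of the block size):

```python
import numpy as np

def blocked_transpose(img, block=16):
    """Transpose by 16x16 tiles so each tile stays cache-resident while it's
    being read and written. Assumes a square image with side divisible by
    the block size (an assumption for this sketch)."""
    n = img.shape[0]
    out = np.empty_like(img)
    for i in range(0, n, block):
        for j in range(0, n, block):
            # Each tile is transposed locally and dropped into its mirrored slot.
            out[j:j + block, i:i + block] = img[i:i + block, j:j + block].T
    return out
```

The result is identical to a whole-image transpose; only the memory access order changes.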
Re: (Score:2)
It takes practically zero performance to rotate an image.
You must be a web programmer...
Re: Upside Down? (Score:1)
Uh... you might not know it's upside down, smart arse. So you'd have to try it in both rotations. These people are probably smarter than even you; why not sit down, shut up and learn something?
Re: (Score:1)
do it in parallel in hardware. There are FPGA and ASIC solutions that can do hardware rotation; just send one copy through the rotate matrix into identical hardware. It costs twice as much, but in today's terms that still shouldn't be too bad.
Re: (Score:1)
I'm not sure I see why you couldn't do a rotate of an image in one clock cycle, since it's a precomputed 1:1 mapping of source address to destination address with no math involved (e.g. not like doing keystone correction or other manipulations). I can't imagine that taking tens of milliseconds?
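In software the same idea is a precomputed lookup table; a Python sketch for the fixed 180-degree case on a flat row-major buffer:

```python
def make_rotate_180_lut(width, height):
    """For a 180-degree turn, pixel (r, c) maps to (h-1-r, w-1-c), which in a
    flat row-major buffer is simply index n-1-i. Precompute once, reuse per
    frame."""
    n = width * height
    return [n - 1 - i for i in range(n)]

def rotate_180(pixels, lut):
    """Pure gather through the precomputed map: no per-pixel arithmetic."""
    return [pixels[src] for src in lut]
```

Arbitrary fixed angles work the same way, as long as the mapping is 1:1 and known in advance.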
Re: (Score:1)
following up to myself- I was thinking of 180 degree or 90 degree fixed rotations (summary talked about being upside down), but it looks like these types of systems use varying rotations which makes sense. eg: http://citeseerx.ist.psu.edu/v... [psu.edu]
Re: (Score:2)
Re: (Score:3)
We are finally going to catch this guy!! - http://img.izismile.com//img/i... [izismile.com]
(The problem is the background: your brain is very good at understanding what upside-down means, but an algorithm trained only on right-side-up images only understands that a silo is rounded on top and straight on the bottom. The question I have is: what are the practical implications that justify all the extra processing power this might take? Finally figuring out who that gymnast was from that Cirque du Soleil screenshot?)
Re: (Score:3)
There's lots that you are missing.
The issue isn't the input data, it's the processing method. The processing method mentioned here as "revolutionary" is just about exactly the method that Raymond Kurzweil [wikipedia.org] posited: a hierarchy of "nodules" that pattern match on a cascading network of pattern matches....
We're living with a modern-day Turing. Do we give him ample credit?
Re: (Score:1)
Finally! We can bring the benefits of surveillance to southern hemisphere countries...
Upside Down? (Score:1)
Not even the faintest grasp of machine vision. That goes for your 5 moderators, too.
Re: (Score:1)
"What's more, their algorithm is significantly better at spotting faces when upside down, something other approaches haven't perfected."
Add this step: Rotate the image and run the algorithm every x degrees. What am I missing?
That this is also true for all of us humans when we exit the womb?
so breakthrough (Score:2)
"The idea is to train a many-layered neural network using a vast database of annotated examples"
How novel.
Re: (Score:1)
It raises the question: why were they using few layers and skipping annotation in the past? The hardware couldn't handle it? They were too lazy to implement such? They needed a Flux Capacitor to make them work together? The boss didn't like the "look and feel" of the diagrams? It crashed Windows XP?
Re: (Score:1)
Re: (Score:3)
There wasn't a good algorithm for training general deep ANNs until 2006, although convolutional neural networks were an exception to that. It's likely nobody tried it before because computers weren't fast enough and the discovery of layer-wise unsupervised training hadn't made deep networks popular yet.
Re:so breakthrough (Score:5, Informative)
Re: (Score:1)
Attention technologists: quit inventing stuff that enables corporations and governments to spy on people!!!
Re:so breakthrough (Score:5, Interesting)
They're using a standard technique. Convolutional networks started to become big with LeCun's 1998 paper on learning to recognize hand-written digits http://yann.lecun.com/exdb/pub... [lecun.com] . His lenet-5 network could identify the digit accurately 99% of the time.
Convolutional networks are also starting to be used to play Go, e.g. 'Move Evaluation in Go Using Deep Convolutional Neural Networks', by Maddison, Huang, Sutskever and Silver, http://arxiv.org/pdf/1412.6564... [arxiv.org] Maddison et al used a 12-layer convolutional network to predict where an expert would move next with 50% accuracy :-)
Progress on convolutional networks moves forward all the time, in an incremental way. If we had one article per day about one increment it would quickly lose mass appeal though :-) The article is about one increment along the way, but does symbolize the massive progress that is being made.
Convolutional networks work well partly because they can take advantage of the massive computational capacity made available by GPU hardware.
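The core operation those networks stack layer on layer is just a small learned filter slid over the image. A minimal numpy version of one 'valid'-mode convolutional step (illustrative only; real implementations are batched and run on GPUs):

```python
import numpy as np

def conv2d(image, kernel):
    """Valid-mode 2D cross-correlation: the building block of a conv layer.
    Slides the kernel over every position and sums the elementwise products."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

# A horizontal-gradient kernel responds strongly at vertical edges:
edges = conv2d(np.array([[0., 0., 1., 1.],
                         [0., 0., 1., 1.]]),
               np.array([[-1., 1.]]))
```

In a trained network the kernel values are learned from data rather than hand-picked like the gradient filter above.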
This is supposed to be a good thing? (Score:5, Insightful)
Re:This is supposed to be a good thing? (Score:4, Insightful)
I think it's pretty well understood that there *are* terrorists...
Yes.
... and a lot of them ...
By almost every measure: No.
...and they're walking among us.
For virtually every useful North American or Western European definition of 'us': No.
Re:This is supposed to be a good thing? (Score:5, Insightful)
I think it's pretty well understood that there *are* terrorists and a lot of them and they're walking among us.
I disagree with this statement. If there were even a handful of real terrorists amongst us, there'd be blood in the streets. Seriously, if you really are hell bent on murdering infidels, it's not hard to drive a bus into a pack of school children, or carry a tin of petrol and a lighter into your nearest train station. That's the nature of terrorism, it is so trivial to execute that the threat is equally trivial to measure. See the history of the IRA for real world examples.
Re: (Score:2)
Re: (Score:2)
Well, putting 'terrorist' in quotes isn't helpful.
Considering what qualifies for the label "terrorist" these days, putting it in quotes isn't unreasonable.
I think it's pretty well understood that there *are* terrorists and a lot of them and they're walking among us.
While terrorists certainly exist, I don't believe for a moment that there are a lot of them walking among us. I think there's a very tiny number of them.
It's the storing and processing that bothers me.
I agree.
If the government is just watching the crowd and identifying people because they're searching for them then I'm okay with that.
This entirely depends on how they do it and what the false positive rate is.
If they start building a database that tracks me over a lifetime then I have a problem.
Then you have a problem, because that database exists and is tracking you for your entire life. It has existed for years now.
Re: (Score:1)
I think it's pretty well understood that there *are* terrorists and a lot of them and they're walking among us.
Indeed. In fact, in America, about half of all politicians thrive on terrorizing the public with threats of bodily harm, albeit indirectly because of drugs/criminals/pedophiles/illegal aliens/evil muslims/death panel obamacare/"they"/etc, arguing that you will only be safe once you give up your freedoms and start a new war. The other half of politicians thrive on terrorizing the public in a slightly different manner, arguing that you will only be safe once you give up your money. And the entire news media t
Re: (Score:1)
Re: (Score:3)
Yeah, I was surprised there was no mention of the huge privacy implications this has. But hey, maybe this'll reduce the number of IDs and RFID cards you have to carry around since it'll be so easy to identify and track you when you're just walking around.
Re:This is supposed to be a good thing? (Score:5, Insightful)
All of them.
Re:This is supposed to be a good thing? (Score:4, Informative)
Indeed, all of them.
Have you noticed you can go into Best Buy or Staples, pick up a camera or look at a printer you never searched for online, and you find ads for the device on Facebook? Didn't notice? Give it a try. It's far beyond this 2013 (minority) report http://www.businessinsider.com... [businessinsider.com]
Re: (Score:2)
No, for that I'd have to go into a Best Buy or Staples, and then use Facebook.
I kid, a bit, because I have actually been into a Staples recently, but since they were nowhere near having what I wanted or prices I would pay, I don't think I'll repeat that. I just needed one final reminder that it's a waste of my time.
Re: (Score:2)
Why, all of them, of course.
And should you ever commit a crime we will be able to retroactively find the evidence for your trial. If you really piss us off we'll edit the video record and call it parallel reconstruction.
In a few years, the pre-cog program will come online, but the surveillance is here to stay.
Now stop picking your nose, citizen.
Re: (Score:3)
It's recognition, not identification.
As in a yes/no whether an image contains a face, not who is in the image.
Re: (Score:2)
It's kind of beside the point whether it's a good thing or a bad thing. No doubt it will have some combination of good and bad effects, but regardless of what the effects are, the cat is out of the bag -- the algorithm is invented and it's not going to go away. And if these guys hadn't invented it, somebody else would have. The only question that remains is how society ought to react to its existence.
Re: (Score:2)
The question is whether we should allow government to scale to be big enough for it to be a powerful tool.
We can clip some wings by not allowing ubiquitous cameras, or by limiting how big powerful global organizations can use the tech.
It isn't inevitable due to the existence of the technology. The technology exists for mass low cost execution of people. We don't allow large overreaching organizations to execute people freely. It remains a rarely used technique. Restrictions on the scaling of face recogni
Re: (Score:2)
Indeed. I am so torn over this. On the one hand, the technology is very cool. On the other hand, the inevitability of abuse seems to outweigh the benefits.
Spike boots (Score:5, Funny)
Rats, there goes my ceiling-walking bank-robbery plans.
Re: (Score:3)
Re: (Score:3)
Re:Spike boots (Score:4, Informative)
Yes, check out 'High Confidence Predictions for Unrecognizable Images', by Nguyen, Yosinski and Clune, http://arxiv.org/abs/1412.1897 [arxiv.org]. It's a paper that shows an image that the net is 99.99% sure is an electric guitar, but that looks nothing like one :-)
For the technically minded, the paper's authors propose that the reason is that the network is using a discriminative model, rather than a generative model. That means that the network learns a mathematical boundary that separates the images that it sees, in some kind of high-dimensional transformed space. It doesn't learn how to generate such new images, i.e., you can't ask it 'draw me an electric guitar' :-) Maybe in a few years :-)
The authors don't compare the network too much with the human brain though, ie, are they saying that the human brain is using a generative model? Is that why the human brain doesn't see a white noise picture, and claim it's a horse?
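One way to see the discriminative failure mode with a toy model: a logistic classifier is most confident far from its decision boundary, including in regions where it has never seen any data at all. Everything below is invented for illustration (the weights, the `p_guitar` name); it's a sketch of the idea, not the paper's network.

```python
import numpy as np

# Toy 'trained' logistic classifier. Confidence grows monotonically with
# distance from the boundary, so a point far outside the training
# distribution -- the analogue of a noise image -- gets a near-certain label.
w, b = np.array([1.0, 0.0]), 0.0  # made-up weights for illustration

def p_guitar(x):
    """Sigmoid of a linear score: the probability the model assigns."""
    return 1.0 / (1.0 + np.exp(-(np.dot(x, w) + b)))

near_boundary = p_guitar(np.array([0.1, 0.0]))    # genuinely ambiguous input
far_from_data = p_guitar(np.array([100.0, 0.0]))  # nothing like training data
```

A generative model would instead notice that the far point is wildly improbable under every class it knows.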
Re: (Score:3)
There are two popular types of deep ANN at the moment: restricted Boltzmann machines and auto-encoders. RBMs are generative. Autoencoders can also be generative if you train them in a particular way, which works much better so most people train them that way anyway. So you can take an ANN and ask it to draw you a picture of a guitar.
I disagree with the authors of that paper. It seems more likely to me that they've cherry picked particular examples that fool their particular ANN. That's pretty easy to d
Re: (Score:2)
The authors don't compare the network too much with the human brain though, ie, are they saying that the human brain is using a generative model?
I don't think so, because saying something like that is not supported by evidence. The human brain doesn't actually work like neural networks do. Neural networks are only loosely inspired by one very, very narrow and specific aspect of the mechanics of the brain.
Is that why the human brain doesn't see a white noise picture, and claim it's a horse?
The human brain does this sort of thing all the time. You can see shapes in static, of course, but white noise doesn't elicit the strongest rate of this sort of error. People are constantly misidentifying things that are seen in a natural noisy env
Re: (Score:2)
Crime is about to become completely impossible without the assistance of a specially trained AI assistant.
Re: (Score:2)
The next one will recognize your gait.
Crime is about to become completely impossible without the assistance of a specially trained AI assistant.
So put a small stone in one of your shoes and watch your gait change without you trying to "walk differently." Problem solved.
Re: (Score:2)
The problem is that in the not too distant future, it will start anticipating such ideas, and train itself to prevent confounding. Heaven is terrifying.
Re: (Score:1)
Actually, in this well-documented [wikipedia.org] case of ceiling-walking bank robbery, wearing a rubber glove on one's head to look like a chicken was a very effective disguise.
Re: (Score:1)
You try working spike-boots upside-down in a mask, bub
Re: (Score:2)
Re: (Score:1)
But I've spent a lot of money on the spike-boots. In fact, I have to rob a bank to pay for them.
Re: (Score:2)
Comment removed (Score:3)
Re: (Score:2)
The disguises and cash wouldn't be worth much in the way of anonymity if you were still carrying your cellphone.
Re: (Score:2)
Re: (Score:2)
Walmart already tracks your purchases and can figure out if a woman is pregnant based on her buying patterns.
Do you think they would say no to installing high-res cameras in their stores to track what aisles people walk down, what other products they stare at and for how long while deciding what to buy, and associating that with a purchase?
I can see them wanting to know what people look at and don't buy, so they can market specials on those products to them (mixed in with random specials, because people fr
Re: (Score:2)
The in-store tracking can often be stopped by shutting off WiFi on your smartphone. Cameras can be involved there too, but they generally don't know who you are or link to previous visits without the WiFi bit. The purchases often are linked by using a rewards card or some such thing that gives them a way to tie your purchases together; not that every store doesn't have your credit card purchase history, but hopefully only for that store. I'd be curious if anyone has info on how much the credit card companies know
Re: (Score:1)
Re: (Score:2)
If the store employees were somehow transferring their visual memory into a massive database then the outcry would be exactly the same.
Re: (Score:2)
... All it takes is a facebook profile. They already do facial identification in the background.
http://www.extremetech.com/ext... [extremetech.com]
Weren't deep convolutional nets debunked? (Score:2)
http://arxiv.org/pdf/1412.1897v2.pdf
Re: (Score:2)
I don't see the practical relevance of this. You cannot walk through an airport with a scrambled face, so the images the camera will get are "regular" images. Sure, you can generate ridiculous images that trigger false positives, but those images will probably never be fed to an actual system.
Re:Weren't deep convolutional nets debunked? (Score:5, Informative)
Debunked?
They're a machine learning algorithm. All such algorithms do is place a fancy decision boundary in a high-dimensional space. DCNNs do a decent job for certain classes of problem. Far away from the training data the boundary is not useful, but that's true of pretty much all such algorithms.
So no. They haven't been debunked.
Facial recognition is still very much imperfect (Score:4, Interesting)
Very much anecdotal, but here goes anyway - a little while back, I found a recipe for cow tongue that seemed intriguing. Whether I had eaten it before, I couldn't recall; at least I hadn't prepared it myself. So off to the butcher's I went, as this is not found in every shop. The tongues they had on display there seemed very tiny (in retrospect, they must have been veal tongues), so I said "give me the largest tongue you have". As the saying goes, "you should be careful what you wish for" - what I ended up with was a monster, something like over 1.3kg (nearly three pounds). I really didn't need that much, but all I could do was to say thanks and go home with my prey.
As I laid it on my cutting board, pretty much filling it entirely, it looked at the same time so awesome and gruesome that I had to take a photo of it (not a food blogger, or a blogger of any kind, I just had to document it). And to share the experience, I sent it to a friend via Hangouts. Now, as she uses Hangouts from the GMail web interface, the images are not visible inline but are Google+ links. So she clicks the link.
...and G+ helpfully asks her "Is this xxxxx?" (xxxxx == her name) While people are, rightfully, concerned whether companies such as Google know too much about their lives, at least when it comes to Google and facial recognition, they have a long way to go.
Re: (Score:2)
Perhaps she has a face that looks like a cow tongue?
Re: (Score:2)
This is one of the big reasons why I don't use G+ (or Facebook, etc.)
Face it (Score:2)
When there is a competition to test solutions, do they call it a "face off" or a "face face off"?
The Bad News ... (Score:2, Funny)
So... (Score:5, Funny)
Re: (Score:1)
nope, this only does face recognition
Face detection, perhaps? (Score:4, Interesting)
I didn't read the article, of course, but the summary sounds like they're doing face *detection*, not recognition.
Detection: find which portions of an image are faces.
Recognition: compare to a database of faces and find out whose face it is.
The first is far easier than the second.
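A toy sketch of the distinction (everything here is invented for illustration: "faces" are 2x2 patches of ones, "identity" is nearest neighbour over raw pixels):

```python
def detect_faces(image, is_face):
    """Detection: scan every 2x2 window and return the coordinates where the
    face/not-face predicate fires. No identities are involved at this stage."""
    boxes = []
    for r in range(len(image) - 1):
        for c in range(len(image[0]) - 1):
            patch = (image[r][c], image[r][c + 1],
                     image[r + 1][c], image[r + 1][c + 1])
            if is_face(patch):
                boxes.append((r, c))
    return boxes

def recognize(patch, database):
    """Recognition: match a detected patch against a database of known faces,
    here by nearest neighbour on raw pixel values."""
    return min(database,
               key=lambda name: sum((a - b) ** 2
                                    for a, b in zip(patch, database[name])))
```

Detection answers "where are the faces?"; recognition needs that answer plus a database before it can say "whose face?".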
I wish... (Score:1)
I wish that interesting developments in algorithms such as this could be discussed without resorting to cynicism (as in, how they'll be used by the NSA to breach our privacy, etc.). Yes, they're valid concerns, but my God, what point is there in enjoying advancements in technology if you're going to see the downsides in everything? I long for a simpler time when we didn't need to worry about such BS.
Re: (Score:2)
I long for a simpler time when we didn't need to worry about such BS.
One of the main reasons we have so many problems with privacy and security these days is precisely because of those simpler times when nobody really worried about the implications of various technologies. We're much better off being cynical.
Do not want (Score:3)
Facial recognition mostly gets used for all of the wrong reasons: Facebook tracking, illegal police tracking, etc.
'photos of innocent people have been retained in contempt of an explicit order from the court to remove them' - 18million by police [techdirt.com]
Facebook's new face recognition policy astonishes German privacy regulator [pcworld.com]
And what about people who don't have Facebook accounts: does Facebook allow 'tagging' of their faces? I'm already annoyed by Facebook's obvious data collection on me, as shown by the fact that I get email from them telling me who my friends are and inviting me to join.
aimbots (Score:1)
"Algorithm"? (Score:2)
Have I missed something?
I've always believed algorithms and neural networks to be essentially opposites of each other.
Algorithms are blocks of code that handle a predefined task. Classic example: quicksort vs. bubblesort.
Neural networks are black-box systems that are trained with input until they produce the output you want. Further, even when one is working, you won't truly know what is happening internally, and your only hope of knowing that it works is throwing a ridiculous number of inputs at it.
Re: (Score:2)
I've always believed algorithms and neural networks to be essentially opposites of each other.
I think you're mixing up two different levels of abstraction.
Algorithms are blocks of code that handle a predefined task.
Indeed, and the NN algorithms describe a predefined task that can be summarized as "train and operate a neural network". That's one level of abstraction. Once trained and operating according to the algorithm, then the NN proceeds to do the tasks it is meant for. There is no algorithm that the NN follows at this level of abstraction -- there is only the algorithm for how the NN itself operates, not for the specific task that NN is being used for.
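A concrete way to see the two levels: the training loop below is a fixed, hand-written algorithm, but the behaviour it ends up with (here, the OR function) lives entirely in the learned weights, not in any hand-written branch. This is a toy single perceptron for illustration, not a deep network:

```python
def train_perceptron(samples, epochs=20, lr=0.5):
    """The algorithm is this fixed training procedure; the task-specific
    behaviour ends up encoded in the weights, not in the code."""
    w0, w1, b = 0.0, 0.0, 0.0
    for _ in range(epochs):
        for (x0, x1), target in samples:
            out = 1 if x0 * w0 + x1 * w1 + b > 0 else 0
            err = target - out
            # Standard perceptron update: nudge weights toward the target.
            w0 += lr * err * x0
            w1 += lr * err * x1
            b += lr * err
    return lambda x0, x1: 1 if x0 * w0 + x1 * w1 + b > 0 else 0

# Nothing in the source code mentions OR, yet the trained function computes it:
or_gate = train_perceptron([((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)])
```

Swap the training data and the same algorithm yields a different function, which is the point: at the task level there is no algorithm to read off.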
Upside down! (Score:2)
What's more, their algorithm is significantly better at spotting faces when upside down, something other approaches haven't perfected.
Very useful if you want your system to work in Australia!
Oblig "learn to recognise" WTF (Score:2)
OTOH, what these networks have actually learned can be eye opening [youtube.com].
I donno what the problem is (Score:2)
Don't those guys on the NCIS TV show do this all the time?
Re: (Score:1)