Recognizing Scenes Like the Brain Does 115
Roland Piquepaille writes "Researchers at the MIT McGovern Institute for Brain Research have used a biological model to train a computer model to recognize objects, such as cars or people, in busy street scenes. Their innovative approach, which combines neuroscience and artificial intelligence with computer science, mimics how the brain functions to recognize objects in the real world. This versatile model could one day be used for automobile driver's assistance, visual search engines, biomedical imaging analysis, or robots with realistic vision. Here is the researchers' paper in PDF format."
adverse effects (Score:5, Funny)
Re: (Score:1)
mod parent down - we've been here before: (Score:2)
Interesting, but what comes next? (Score:5, Insightful)
I'm not knocking the MIT research, I think it's amazing. It just seems to me like imitation rather than imagination. Granted, highly evolved and complicated imitation. But does it even have the abilities of a parrot?
TLF
Re: (Score:2)
Feng-Gui [feng-gui.com]
When I first visited the site, they had a porn site in their "Sample heatmaps" section, and I must say it was pretty spot-on.
Re: (Score:1)
Re: (Score:1)
Re: (Score:2, Insightful)
Give machines our own capabilities? We can't even have them move about in a reliable fashion, what makes you think we're even *close* to endowing machinery with creativity and abstract thought at huma
Re:Interesting, but what comes next? (Score:5, Interesting)
There are software systems that can approximate the size of and distance between objects in a picture with reasonable accuracy, and if the scope of scenery presented to the system is limited, that ability combined with sensing the motion of objects is enough to determine a large percentage of what is desired. This is not the hard part. The hard part is determining object classification and purpose in those cases where it is not simple.
Each of us can almost always look at a scene and determine the difference between a jogger and a purse thief on the run or a businessman late for an appointment. For computers to do so takes a great deal more work. It is only a subtle difference and one where both objects maintain similar base characteristics.
The point? Even mimicking human skills is not easy, and it fails at many points without the overwhelming store of knowledge that humans have inside their heads. This would point to the theory that if more memory were available, AI would be easier, but this is not true either. Humans can recognize a particular model of car no matter what color it is, and usually despite the fact that it might have been in an accident. The thinking that comes into play when using the abstract to extract reality from a scene is not going to happen for computers for quite some time.
The danger is when such ill prepared systems are put in charge of important things. This is always something to be wary of, especially when it is used to define/monitor criminal acts and identify those who are guilty whether that is on cameras at intersections or security systems, or government surveillance systems.
Re:Interesting, but what comes next? (Score:4, Insightful)
Actually, we can't; we just base this recognition on stereotypes. A well-known Swedish criminal called "the laser man" exploited this in the early 90s when robbing banks. He would rob the bank, then change clothes into a businessman or a jogger, and escape the scene. The police would more often than not let him pass, because they were looking for an "escaping robber", not a "businessman taking a slow-paced walk".
The police eventually caught on and got the guy. Computers would of course have even greater difficulty thinking "outside the box".
Re: (Score:1)
Re: (Score:2)
The desired purpose is what actually dictates the usefulness. For a police-interceptor robot, it would be important to be able to make those fine distinctions.
For an auto-driving robot, it's probably good enough to be able to tell there is a running human and what general locations they're likely to be as the robot passes. It won't need to know WH
Re: (Score:2)
If the running human is avoided, but not recognized, your AI car may find itself ensnared in the beginning of a marathon of runners, or perhaps mistakenly in the middle of a playground, or perhaps at the front of a building where people are running from a bomb scare?
Simply not hitting the human is simply not good enough all of the time. When software or AI systems have charge of life critical syst
Re: (Score:2)
Likewise with the 'bridge out'. The AI may not be able to interpret what
Re: (Score:2)
Simply not hitting the human is simply not good enough all of the time. When software or AI systems have charge of life critical systems, such as cars, getting it right 90% of the time is not good enough and never will be.
Is your point that it's
Re: (Score:2)
I would so LOVE for my car to drive me to work and back. I wouldn't have to drive but I still get where I want to go in a reasonable time. It's just like public transportation, but without all the people and really long transit times.
My guess is we'll start seeing systems with roadways dedicated to AI cars, where there are smart cars and "smart roads" to help them along. At first, the cars will be AI-hybrids that you can
Re: (Score:2, Insightful)
I do agree with you on one point, but not for the reason you do: the problem of control. If there's any reason that an intelligent driving system wouldn
Re: (Score:2)
Run it on a Dell laptop. It will learn faster.
Parallelism (Score:1)
Re: (Score:2, Interesting)
we are able to give these systems our own abilities as a starting point and then watch it somehow create something more intelligent than we are... then we really have something.
This technology is a prerequisite for providing an AI system with a starting point. It offers, for instance, the powers of perception as input for a learning system. A baby, for example, opens its eyes and simply sees; this is only part of the baby's starting point. Other aspects of your "starting point" include predetermined goals such as eating, and also points of failure like starving. Many avenues of input are required for effective learning at different capacities; Helen Keller for instance learned
How would you recognise super intelligence? (Score:2)
If this intelligence were self-promoting (as we are), then it would do whatever it takes to protect itself from us (as we do from other animals, diseases, etc.). We'd probably first realise something was going on when we woke up one morning to find ourselves enslaved.
If, however, the super intelligence is peaceful and benign we'd probably just stomp it into the ground and never realise its full potentia
Re: (Score:1)
Re: (Score:2)
That's rather like asking whether the latest version of MS Word has the abilities of a parrot. It doesn't, but it was never supposed to.
I've always felt that the term "Artificial Intelligence" is a bit of a misnomer. Actual AI work is more like Imitation Intelligence - programs that do
Re:Interesting, but what comes next? (Score:4, Insightful)
You know, this is pretty misleading, so you can't take any blame for thinking so. Lots of people also think that we're "a hundred years smarter" than those living in the 1900s, just because we were lucky to be born into a higher culture.
But think about it: what is our entire culture and science, if not ultra-sped-up evolution? We make mistakes, tons of mistakes, as human beings, but compared to completely random mutations, we have a supreme advantage over evolution in the signal/noise ratio of the resulting product.
Can we ever surpass our own complexity in what we create? But of course. Take a look at any moderately complex software product. I won't argue it's more complex than our brain, but consider something else: can you grasp and assess the scope of effort and complexity in, say (something trivial to us), Windows running on a PC, as one single product? Not just what's on the surface, but comprehend at once every little detail, from applications, dialogs, controls, drivers, and kernel, down to the processor microcode.
I'll tell you what: even the programmers of Windows and the engineers at Intel can't.
Our brain works in an "OOP" fashion, simplifying huge chunks of complexity into a higher-level "overview", so we can think about it at a different scale. In fact, lots of mental conditions, like autism or obsessive-compulsive disorder, revolve around losing the ability to "see the big picture" or to concentrate on a detail of it at will.
Together, we break immensely complex tasks into much smaller, manageable ones, and build upon the discoveries and effort we made yesterday. This way, although we still work on tiny pieces of a huge, mind-bogglingly complex puzzle, our brain can cope with the task properly. There aren't any limits.
While I'm sure we'll see completely self-evolving AI in the next 100 years, I know that developing highly complex sentient AI with only elements of self-learning is well within the ability of our scientists. Small step by small step.
Re: (Score:1)
Re: (Score:1)
I don't think this is the goal, at least not for now. The goal is to automate known tasks, not create an electronic Einstein.
Re: (Score:2)
They've created something that works and works well (I've been using a simple version of their model in my own work), it's too bad it doesn't involve "imagination" or some kind of next step. Most of us are quite happy with a system that can categorize novel, natural scenes.
Re: (Score:1)
Re: (Score:1)
Re: (Score:2)
Save one half of one chapter, it's a very easy read, and makes a lot of fundamental ideas very clear. [communitywiki.org] While he doesn't give an algorithm for Intelligence, he does give a good (and somewhat original) definition of what Intelligence is, and then he describes some elements of what an intelligence probabl
Re: (Score:2)
The 1950s called, and they want their "scientific" concerns back.
You forget one thing. (Score:2)
Computers don't have those stimuli, so they don't evolve.
Re: (Score:1)
Does anybody know where to find the actual paper? (Score:1, Informative)
Re:Does anybody know where to find the actual paper (Score:5, Funny)
nothing new (Score:4, Insightful)
Re:nothing new (Score:4, Informative)
That said, they do present a simple and biologically-motivated preprocessing layer that appears to be useful, which reflects back on the brain. In summary, I would say that this paper helps more to understand brain functioning than to develop machines that can achieve human-like vision capabilities. So, very nice, but let's not over-hype it.
I'm not getting it: why is it significant? (Score:4, Insightful)
Re: (Score:1)
that's a generous view of it (Score:3, Informative)
1. The code is in a horrible hacked-together state and so not really fit for release, and nobody wants to put in the effort that would be needed to clean it up; or
2. The researchers don't want to release their code because keeping it secret creates a "research moat" that guarantees that they'll get to publish all the follow-up papers themselves, since anyone else who wanted to extend the work would have to first invest the time to reimple
Re: (Score:2, Informative)
research done at cyberdyne (Score:4, Funny)
This is, of course, the first step in finding Sarah Connor.
not like the brain does. (Score:1)
As far as I can tell from their paper (it is a journal version of their CVPR paper), only their low-level Gabor features are similar to what the brain does.
The rest of the paper uses the currently popular bag-of-features model, which discards all spatial information between image features, which I don't think the brain does. Furthermore, for classification algorithms they consider a Support Vector Machine and Boosting. B
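For readers curious what the V1-like Gabor stage mentioned above looks like in practice, here is a minimal Python/NumPy sketch; the kernel size and filter parameters are my own illustrative choices, not the ones from the paper:

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma):
    """Real-valued Gabor kernel: a cosine carrier under a Gaussian envelope."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # Rotate the coordinate frame to the filter's orientation
    xr = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    carrier = np.cos(2 * np.pi * xr / wavelength)
    return envelope * carrier

# A small bank of orientations, roughly how V1 simple cells are modeled
bank = [gabor_kernel(11, wavelength=5.0, theta=t, sigma=3.0)
        for t in np.linspace(0, np.pi, 4, endpoint=False)]
```

Convolving an image with such a bank and pooling the responses gives orientation-selective features of the kind the paper starts from; everything after that (the bag-of-features step, the SVM) is standard machine learning.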
Re: (Score:2, Interesting)
Re: (Score:1)
Probably because a suitable ANN would take years to converge.
Re: (Score:3, Informative)
Their low-level Gabor filters are indeed similar to V1 simple cells. The similarity between their model and the brain goes a lot further, though. The processing go
Re: (Score:1)
The Caltech datasets are, in my opinion, artificial, since they rotate all images to the same direction.
For example, a motorbike always faces to the right, and the 'trilobite' is even rotated out of the plane (leaving a white background), so you only need to estimate the right angle of rotation.
For example, see:
http://www.vision.caltech.edu/Image_Datasets/Caltech101/averages100objects.jpg [caltech.edu]
you would never get a consiste
Re: (Score:1)
My own two cents (Score:5, Interesting)
Usually I'm not forming long term memories during fugue states, but when I do, I remember some pretty interesting stuff. One thing that is typically impaired is object recognition, since this mostly seems to be handled by the right occipital lobe. I can see things but can't immediately recognize what they are, unless I use these left-brain techniques. The left occipital lobe can recognize objects too, but the approach it takes is different and more of a pain in the ass to have to rely on.
It's more of a thunky symbolic recognition, as opposed to an ability to examine subtle forms, shapes, and colors. I have to basically establish a set of criteria that define what I'm looking for and then examine things in the visual field to see if they match those criteria. I'll look for a bed by trying to find things that appear flat and soft; I'll look for a door by looking for things with attributes of a doorknob such as being round and turnable; I'll find water to drink by looking closely at wet things. My wife says I make some interesting mistakes, like once confusing her desk chair for a toilet (forgetting for a moment that part of a toilet has to be wet, but at that point memory formation and retrieval is disrupted to the point where I could imagine forgetting that it's not enough to just be able to be sat on, toilets have to have water in them too).
I have trouble recognizing faces, and she says I'm sometimes obviously pretending to recognize her. Recognizing a face using cold logic can be tricky even when you're not impaired. Recognizing familiar scenes and places becomes difficult. I drove home in a fugue state once, back in my twenties, and while I didn't crash into anybody or have any sort of accident, I did get lost on the way home from work. I ended up driving miles past where I lived. Even as a pedestrian, getting lost in familiar areas is still a problem.
People have been trying to come up with image processing algorithms that mimic cortical signal analysis for decades. I remember reading papers like this ten years ago. It's amazing to see they're still mistaking road signs for pedestrians. I don't think even I could make an error like that. The state of the art was totally miserable back then, too. Neuroscience has got to be one of the sciences most poorly understood by humans.
Re: (Score:2)
Re: (Score:2)
Besides, we can (and do) augment our intelligence by using computers, etc. I think some day we'll be able to understand our own brains.
Re: (Score:2)
These sorts of mistakes seem very common in computer vision; the system I used a few years back was forever mistaking trees for people. The problem is that there is a lot of variation in how people can look: the angle you are looking at them from, how their body is positioned, and the colour of the clothes they wear. Creating an algorithm which can recognise all this variation often leads to a system with many false positives.
It loo
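The false-positive trade-off described above is the familiar precision-versus-recall tension. A toy back-of-envelope in Python (the counts are made up purely for illustration):

```python
# Hypothetical counts for an over-eager person detector
true_positives = 90    # people correctly flagged
false_positives = 60   # e.g. trees flagged as people
false_negatives = 10   # people missed

precision = true_positives / (true_positives + false_positives)
recall = true_positives / (true_positives + false_negatives)
print(precision, recall)  # 0.6 precision, 0.9 recall
```

Tuning a detector to accept every pose, angle, and clothing colour pushes recall up, but usually at the cost of precision, which is exactly the tree-as-person failure mode.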
Re: (Score:1)
Obviously it'd still need to detect pedestrians, stray dogs and non-CCCs (and not crash dumbly if someone hacks the transponders) but a standard system like this would free up a
Re: (Score:1)
Just wondering out loud.
K.
Earlier work 1989-1997 on street scene analysis (Score:5, Informative)
1. WPJ Mackeown (1994), A Labelled Image Database, unpublished PhD Thesis, Bristol University.
2. WPJ Mackeown, P Greenway, BT Thomas, WA Wright (1994).Road recognition with a neural network, Engineering Applications of Artificial Intelligence, 7(2):169-176.
3. NW Campbell, WPJ Mackeown, BT Thomas, T Troscianko (1997).
Interpreting image databases by region classification. Pattern Recognition, 30(4):555-563.
There has been various follow-up research since then [google.com]
Re: (Score:2)
WPJ Mackeown (1994), A Labelled Image Database and its Application to Outdoor Scene Analysis, unpublished PhD Thesis, Bristol University.
Re: (Score:2)
Bah, Bristol University. I'll only take it seriously if it is from MIT.
:-)
Re: (Score:2)
Plus, bah, neural networks.
Who is Brian? (Score:1)
Revolutionary? Probably not... (Score:3, Insightful)
That isn't my real problem with this algorithm and the hundreds of similar ones that have come before it. What bothers me is that they don't really get at the *way* the brain works. It's a top-down approach, which looks at the *behavior* of the brain and then tries to emulate it. The problem with this technique is that it may miss important details by glossing over anything that isn't immediately obvious in the specific problem being tackled (in this case, vision). This system can analyze images, but can it also do sound? In a real brain, research indicates that you can remap sensory inputs to different parts of the brain and have the brain learn them.
I'm still interested in this algorithm and would like to play around with the code (if it's available), but I am skeptical of the approach in general.
This is what is needed before true AI is made (Score:2)
Once you have the ability to interpret vision into 3D objects, you can then classify what they are and what they're doing in a language (English is good enough). You can then enter sentences and the AI would understand the representation by 'imagining' a scene. What you have isn't really a thinker, but software that understands English and can be incorporated into robots too.
Re: (Score:1)
Yeah, because NLP is a closed problem just like vision.
While you're at it, why don't you just power the thing with a perpetual motion machine.
Re: (Score:1)
Excellent Big Brother Tool (Score:2)
Or to automatically scan streets, airports, bus stations, bank queues, etc. for "wanted" persons, terrorists, library-fine evaders, dissidents, etc., ad nauseam.
Hope most folks realize, once they get down vision (Score:5, Insightful)
Robotic vision is a tipping point.
A large number of humans will become unemployable shortly after this becomes a reality.
Anything where the only reason a human has the job is that they can see is done for in the 1st world.
Why should you pay $7.25 an hour (really $9.25 with benefits & overhead for workers' comp, unemployment tax, etc.) when you can buy a $12,000 machine to do the same job (stocking grocery shelves, cleaning, painting, etc.)?
The leading edge is here with things like Roombas.
Re:Hope most folks realize, once they get down vision (Score:2)
when you can buy a $12,000 machine to do the same job
That's a great argument you make, except nothing that is programmed and isn't a mass-market product costs $12,000. You're not going to buy one machine that can stock shelves, do cleaning, and do painting. These are going to be separate machines and they're each going to cost millions of dollars. The market for these machines? The same traditional market: production lines. It's just way too cheap to hire unskilled labor compared to buying a machine to replace them - unless the job is dangerous - and sometime
Re: (Score:2)
Re: (Score:2)
Re: (Score:1, Offtopic)
That Would Be An Illegal Immigrant... (Score:2, Funny)
We've got an overstock of these in California, Texas, Nevada, Arizona and New Mexico. We'll be glad to ship 'em either north _or_ south if y'all will pay the freight or, at the very least, provide a destination address.
Re: (Score:2)
My point is this:
Robots can't replace many human jobs now because they cannot see.
Once robots can see, there will be a point where many "menial" jobs can be performed by them.
We need to start thinking about how we are going to handle the huge numbers of people who are only qualified for menial work now before we get to that day.
We may disagree on if that is 5 years (unlikely but possible) or 100 ye
Re: (Score:2)
Re: (Score:2)
We are so dependent on capitalism that we can't just destroy all of the lower classes in one fell swoop. We need those people to be buying the food off the grocery store shelves.
Another weakness is that the operational costs of a grocery store robot have to be absorbed by the grocery store. Now you might not be fully aware, but mos
Re: (Score:2)
9. Self checkout machines. These allow one human to check out 6 customers at a time.
Look at my other thread for a cost analysis.
Most of your argument is predicated on robots being expensive.
Given $55k robots, in my other post, I show it's cheaper than high school students.
At $55k, you have three robots.
Breakdowns are just a maintenance and SLA issue (except for the forklift issue but that won't stop robots any more- just slow them down- especially with today's surveillance abilities- likewise, w
Re: (Score:1)
Likewise an automatic cleaning robot for buildings- our building has a staff of 20 every night.
I worked for a janitorial service when I was in my late teens, and wouldn't really be confident in a robot's ability to do that. It sounds simple on the surface: Clean floors, empty trash, right?
One of our clients was the local symphony. One office in particular stands out in my mind; when you opened the door, at LEAST 1 page of a stack of paperwork came flying off due to the sudden breeze. Sometimes you saw it fly if they left the light on, but usually you just heard it move. You walked in, put th
Re: (Score:2)
We already have roombas that can find their plug and plug back in.
While there will be special cases (like the symphony), a lot of generic offices would do just fine with a fleet of roombas that come out to vacuum at night and then return to a storage closet.
Re:Hope most folks realize, once they get down vision (Score:2)
Sure, people are unreliable for all sorts of reasons, but they don't break down as often and usually have the initiative to think through new situations (even a grocery-shelf stacker).
Re: (Score:2)
They require maintenance, but they really only start breaking down after a few years (75-80 thousand miles).
Say a Kroger stocking robot costs $55,000 and requires $3,000 a year in maintenance before being worn out after 5 years (total cost $70,000). It doesn't break down, it doesn't call in sick, and it can work seven days a week.
Having two low wage humans work a full shift 7 days a week all year runs about $36,000 a year after matchi
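Spelling out the back-of-envelope arithmetic with the figures quoted above (all of them the commenter's hypothetical numbers, not real prices):

```python
# Commenter's hypothetical figures, not real prices
robot_price = 55_000          # one stocking robot
robot_maintenance = 3_000     # per year
lifetime_years = 5
human_cost_per_year = 36_000  # two low-wage workers, after payroll costs

robot_total = robot_price + robot_maintenance * lifetime_years
human_total = human_cost_per_year * lifetime_years
print(robot_total, human_total)  # 70000 vs. 180000 over five years
```

On these assumptions the robot comes out more than 2:1 cheaper over its lifetime, which is the whole thrust of the comment.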
Re: (Score:2)
Watching reality TV [mit.edu].
That's right. When the new visually acute robots put you out of a job, and you take your severance check and slink home to watch "Cops," you'll find a robot already hogging the La-Z-Boy, remote control in hand. Not only are we obsolete - our obsolescence is obsolete, too.
Evaluating vision papers is tough (Score:2)
Reading vision papers is very frustrating. At one time I had a shelf full of collections of papers on vision. You can read all these "my algorithm worked really great on these test cases" papers, and still have no idea if it's any good. You can read the article on the vision algorithm used by the Stanford team to win the DARPA Grand Challenge [archive.org], and it won't be obvious that it's a useful approach. But it is.
This is, unfortunately, another Roland the Plogger article on Slashdot. So this probably isn't
obligatory fearmongering (Score:2)
What it will be used for (Score:3, Funny)
Tinfoil hat included? (Score:2)
Fine paper, but why not quote all of PAMI ? (Score:5, Informative)
Interested readers can browse the content of PAMI current and back issues [ieee.org] and either go to their local scientific library (PAMI is recognisable from afar by its bright yellow cover) or search on the web for interesting articles. Often researchers put their own paper on their home page. For example, here is the publication page of one of the authors [mit.edu] (I'm not him).
For the record, I think justifying various ad-hoc vision/image-analysis techniques with approximations of their biological underpinnings is of limited interest. When asked if computers would think one day, Edsger Dijkstra famously answered with "Can submarines swim?". In the same manner, it has been observed that (for example) most neural network architectures make worse classifiers than standard logistic regression [usf.edu], not to mention Support Vector Machines [kernel-machines.org], which is what this article uses, BTW.
The summary by our friend Roland P. is not very good
I could go on with lists and links but the future is already here, generally inconspicuously. Read about it.
Lena (Score:1)
http://www.cs.cmu.edu/~chuck/lennapg/lenna.shtml [cmu.edu]
Told you so :-) (Score:2)
http://slashdot.org/comments.pl?sid=221744&cid=17
Come on, AI researchers, pattern matching is what the brain does! It is so obvious!
Does any of you read Slashdot?
And all the operations of the brain can be explained in terms of pattern matching; even mathematics.
on intelligence (Score:1)
As described in: http://www.amazon.com/Intelligence-Jeff-Hawkins/dp/0805078533/sr=8-1/qid=1171294577/ref=pd_bbs_sr_1/002-9722002-6024059?ie=UTF8&s=books [amazon.com]
P.