Google's Latest Machine Vision Breakthrough 113
mikejuk writes "Google Research recently released details of a Machine Vision technique which might bring high power visual recognition to simple desktops and even mobile computers. It claims to be able to recognize 100,000 different types of object within a photo in a few minutes — and there isn't a deep neural network mentioned. It is another example of the direct 'engineering' approach to implementing AI catching up with the biologically inspired techniques. This particular advance is based on converting the usual mask-based filters to a simpler ordinal computation and using hashing to avoid having to do the computation most of the time. The result of the change to the basic algorithm is a speed-up of around 20,000 times, which is astounding. The method was tested on 100,000 object detectors using over a million filters on multiple resolution scalings of the target image, which were all computed in less than 20 seconds using nothing but a single, multi-core machine with 20GB of RAM."
Porn Collection (Score:5, Funny)
Can it sort and identify duplicates automagically in my porn collection?
Re: Porn Collection (Score:1)
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
"Pixelbeat"
Re: (Score:2)
You may be joking, but when people are looking for "duplicate" videos a comparison of hashes isn't applicable. They're usually not bit-for-bit duplicates. Usually if I want to locate duplicate files I want to match up two files as duplicates that are the same input video, but one might be offset by a few seconds, they might be at different resolutions, and might be compressed using different codecs. To a machine using hash checks those two files are nothing alike, but to a human they're the same.
Though I do
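To illustrate the point about bit-for-bit hashes versus "looks the same to a human", here is a minimal sketch using an average hash, a common perceptual-hashing idea. It is an assumption chosen for illustration, not something the commenter or any particular tool is confirmed to use. The toy `frame`, the helper names, and the 4x4 hash size are all hypothetical, and the sketch handles only a change in resolution, not time offsets or codec differences.

```python
import hashlib

def downscale(img, factor):
    """Shrink a 2D grayscale grid by averaging non-overlapping factor x factor blocks."""
    h, w = len(img), len(img[0])
    return [
        [
            sum(img[y * factor + dy][x * factor + dx]
                for dy in range(factor) for dx in range(factor)) / factor ** 2
            for x in range(w // factor)
        ]
        for y in range(h // factor)
    ]

def average_hash(img, size=4):
    """Reduce the image to size x size, then emit 1 for each cell brighter than the mean."""
    small = downscale(img, len(img) // size)
    flat = [v for row in small for v in row]
    mean = sum(flat) / len(flat)
    return tuple(int(v > mean) for v in flat)

def hamming(a, b):
    """Number of positions where two hashes disagree."""
    return sum(x != y for x, y in zip(a, b))

def digest(img):
    """Cryptographic hash of the raw pixel data (hypothetical stand-in for a file hash)."""
    return hashlib.sha256(repr(img).encode()).hexdigest()

# Hypothetical 16x16 "video frame" and a half-resolution copy of it.
frame = [[(x * 16 + y * 3) % 256 for x in range(16)] for y in range(16)]
low_res = downscale(frame, 2)

print(digest(frame) == digest(low_res))                        # exact hashes disagree
print(hamming(average_hash(frame), average_hash(low_res)))     # perceptual hashes agree
```

In practice a duplicate finder would compare hashes with a small Hamming-distance threshold rather than requiring equality, since re-encoding changes pixel values slightly.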
Re:Porn Collection (Score:5, Funny)
Sure! It sorted your stuff into these categories:
400-lb. naked guys kissing
Stuff reported to the NSA
Someone's drawing of a dragon humping a car
Taylor Swift
Over 750,000 pictures in all!
Re: (Score:1)
hahahahahahha, well done sir!
Re:Porn Collection (Score:4, Funny)
Re: (Score:2)
Does it support boolean operators? So that, you know, you could find 400-lb. naked guys kissing Taylor Swift and similar material?
Facebook "Graph Search" to the rescue!
Re: (Score:2)
You want Visipics for that.
http://www.visipics.info/index.php?title=Main_Page [visipics.info]
20GB?? That's it??? (Score:5, Funny)
Re: (Score:2)
Re: (Score:1)
did it cost an ARM to buy?
Re: (Score:2)
Damn, I bought a MacBook last year and it had only 4 GB of RAM. I don't know what universe you guys live in, but it sure is greener on that side :)
Re: (Score:2)
but it sure is greener on that side :)
Bluer. The future is luminescent blue.
(For about another 5 years. Then it will be the burnt orange vinyl of our generation.)
Re: (Score:2)
The future will be orange and teal according to the movies. Then again, so is the present and the past.
Re: (Score:1)
This isn't 2005, 32GB in a workstation costs peanuts nowadays. Come out from under your rock.
Re: (Score:3, Insightful)
This isn't 2005, 32GB in a workstation costs peanuts nowadays. Come out from under your rock.
Cashews, maybe, but not peanuts.
Re: (Score:2)
Yeah, and the unbroken ones... not the "cashew pieces" you can get for a lot cheaper. Still, I recall the excitement when RAM went under $100 a megabyte.
Re: (Score:1)
One-up: My first computer (ok, not counting my Vic-20) had 640k of RAM, when 128k was standard.
Re: (Score:2)
This isn't 2005, 32GB in a workstation costs peanuts nowadays. Come out from under your rock.
a workstation is hardly "even a mobile computer".
now to more important things, is this algo good enough for sorting trash?
Re: (Score:3)
Re: (Score:2)
Re: (Score:2)
Re: (Score:1)
Re: (Score:2)
I'd have to buy or build a new computer. Neither of the two old towers will take more than a gig, and I'm not going to add any to this notebook, too much of a PITA.
All the old computers do what I need them to so it'll be a while before that happens. I guess if I needed this functionality I'd have to spend a few hundred bucks.
Speaking of it doing what I need, I guess I should get back to work on that book (yep, that's why I haven't been here much lately).
Re: (Score:1)
My rule of thumb for building a new PC since the 90s has always been: $100 CPU, $100 RAM, $100 HDD, plus incidentals = $400-500 depending on what you need.
RAM prices are up since I bought my 16GB for $80 this winter, but you can still build a very nice PC with 20GB of RAM for under $500.
Re: 20GB?? That's it??? (Score:2)
My laptop has 16GB. More than 20GB is not unusual in a workstation today.
Re: (Score:2)
It's irrelevant; Google is all about cloud computing. All the gruntwork would be happening elsewhere on a heavily compressed/uploaded copy of your data, much like their speech recognition on even the latest smartphones.
Re: (Score:2)
Google is all about cloud computing
I guess I won't need to build a new PC after all, then, because Google and the NSA already get too much data from me, the creepy fucks. Considering some of the things I google for I'm probably on some list already.
Let me know when I can do this without "the cloud." I don't like not having control over my own data and processes.
Re: (Score:2)
and even mobile computers
Apparently, the stuff that Google hands out to I/O attendees is really worth the money!
Re: (Score:1)
My laptop has 8 cores and 16gb ram
My 11" netbook has 2 cores and 4gb ram
What are you using? An Abacus?
Re: (Score:2)
-"... using nothing but a single, multi-core machine with 20GB of RAM" Phew.. here I was thinking it'd need some unrealistically high specs from my PC!!
My Thinkpad W530 has 32GB of RAM. Maybe you need a new PC.
Re: (Score:3)
Re: (Score:1)
As for mobile devices, the HTC one has 2 GB, so using Moore's Law, we are looking at another 15 - 20 years for hand helds to have 20 GB of RAM. That's depending on whether you like David House's 18 months assertion or the more accurate 2 year approach, but according to the International Technology Ro
Re: (Score:1)
Re: (Score:2)
Coming to mobile? (Score:2)
"...might bring high power visual recognition to simple desktops and even mobile computers... computed in less than 20 seconds using nothing but a single, multi-core machine with 20GB of RAM."
Right... and by mobile computers you mean computers that I can lug from one desk to another.
Re:Coming to mobile? (Score:5, Informative)
Wait, your phone can decode video?!? In real time, playing the movies at normal speed? How many kilograms does it weigh, and how long is the power lead? How big is the mortgage on it? (/socraticmethod)
The computer innovation process broadly goes like this: first algorithm sort-of works but is incredibly inefficient - tweaks on this - a rethinking of the whole approach that leads to massive speed-ups - further refinement - implementation of the algorithm in hardware, where it becomes just another specialized processor - everybody profits!
This article is about the third, or possibly fourth, phase of the process. If it works out, phase 5 is straightforward. By itself, phase 5 typically leads to two orders of magnitude increase in performance, three orders of magnitude decrease in power consumption, and two to four orders of magnitude decrease in cost.
Phases 6 and 7 happen if and when enough people find the provided service useful. (If technologies are no good, that's when only rich people have them. Successful technologies, everyone gets access to eventually.)
Re: (Score:3)
Argh! There is no phase seven. Buffer overflow error.
Re:Coming to mobile? (Score:4, Funny)
Re: (Score:2)
Phase seven is going back to software again. i.e. the Wheel of Reincarnation [catb.org].
Remember DSPs, memory, onboard in sound cards? LISP Machines? Ageia PhysX cards? etc, etc.
Re: (Score:2)
If technologies are no good, that's when only rich people have them. Successful technologies, everyone gets access to eventually
That seems like question begging. The popularity of a technology defines its success, not the other way around.
Re: (Score:2)
Re: (Score:2)
One day this tech will be able to run on
Re: (Score:3)
Surely you realize the video decoding on phones is done with dedicated hardware.
You could do it on the CPU though, the latest models (Galaxy S4 and all) should be powerful enough.
Re: (Score:3)
"...might bring high power visual recognition to simple desktops and even mobile computers... computed in less than 20 seconds using nothing but a single, multi-core machine with 20GB of RAM."
Right... and by mobile computers you mean computers that I can lug from one desk to another.
Like the MacBook Pro Retina with 16 GB? The point of their approach seems to be lots and lots of RAM to do table lookups. The memory subsystem in a normal laptop is plenty fast for that. Bandwidth would be more of a problem than total space in a cellphone. If we had a compelling case for loads of RAM in a smartphone, it would be possible to design one without going wildly beyond current power or cost envelopes. A few more years of Moore and things will be fine.
Re:Coming to mobile? (Score:4, Insightful)
To go from 2GB to 30GB following Moore's law would take 8 years. I contend that it will take longer than that because we won't see exact doubling of specs due to improvements in power. Either way, 10 years is far enough out that I think the summary claiming that this will come to mobile is far fetched for now.
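The arithmetic behind that estimate can be checked with a short sketch. It assumes RAM capacity simply doubles every fixed period (1.5 or 2 years are the usual readings of Moore's law); the function name and the list of scenarios are hypothetical, chosen only to mirror the 2GB phone and the 20GB and 30GB targets mentioned in this thread.

```python
import math

def years_to_grow(start_gb, target_gb, doubling_years):
    """Years needed to go from start_gb to target_gb if capacity doubles every doubling_years."""
    doublings = math.log2(target_gb / start_gb)
    return doublings * doubling_years

# Assumed scenarios: a 2GB phone reaching 20GB or 30GB of RAM.
for target in (20, 30):
    for period in (1.5, 2.0):
        years = years_to_grow(2, target, period)
        print(f"2GB -> {target}GB, doubling every {period} years: {years:.1f} years")
```

Under these assumptions, 2GB to 30GB is a little under four doublings, which matches the 8-year figure when a doubling takes two years.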
Re: (Score:1)
Current mobile seems to cap out at 2MB of RAM.
Surely you meant to say 2GB of RAM. My Galaxy S3 has 1GB of RAM. Hell, even my POS work-issued BlackBerry 9310 has 512MB.
Yeah, well (Score:3, Funny)
Re: (Score:3)
Can it find Waldo?... (Score:1)
Re: (Score:3)
No but it can spy on you day and night.
Re:Can it find Waldo?... (Score:4, Funny)
Some years ago, I had an idea for a tool that would, in a nutshell, identify a plant simply from a photo and some metadata (time of year, geolocation, etc). I know how it would work (and it would work), but I came to the conclusion that someone (ie. Google) would use the methods to develop a tool that would do the same thing but for human faces.
It was at that point I decided to leave that box closed.
Re: (Score:3, Informative)
There are several not-too-creepy apps that can identify plant species from a smartphone photo of a single leaf.
http://leafsnap.com/about/
They seem to request metadata directly via your phone's location and time-of-request (their server, not your phone, does the pattern-matching). Which is convenient, although it may place you at a time and place you may rather not be placed, for instance if burying pirate gold under a particular tree.
Re: (Score:2)
You should develop it. Google will do it eventually, and it would be better for all of us if we had the same tech to balance the power a little.
Captchas be gone? (Score:3, Interesting)
Re:Captchas be gone? (Score:5, Funny)
So Captchas will become even easier to crack? Great, the sooner we can get rid of them, the better. As it is they are getting impossible to read by humans, thanks to idiots who don't know how to design them.
But there's no need to get rid of them if we'll all have a handy browser plugin that can decode them for us at the press of a button!
Re: (Score:1)
There will be a new type of captcha. I just love trying to prove to a computer I'm a human and failing.
Re: (Score:1)
There will be a new type of captcha. I just love trying to prove to a computer I'm a human and failing.
You mean, like this? [xkcd.com]
Re:Spatial Hashing (Score:5, Informative)
Yes, it's a breakthrough. It won the best paper award at this year's Conference on Computer Vision and Pattern Recognition, a tier 1 computer vision conference.
Hashing invariant properties in images isn't new, but banded winner-take-all hashing of histograms-of-oriented-gradient part filters and then using matches across those bands to identify a test feature's nearest neighbors, while simultaneously computing an upper bound or exact dot products of those test features with their nearest learned features, for up to 100,000 objects with small amounts of memory, is new.
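For readers unfamiliar with winner-take-all hashing, here is a minimal sketch of the general idea: each code records which of the first k permuted entries is largest, so the hash depends only on rank order, and codes are grouped into bands that can serve as hash-table keys. This is an illustration under stated assumptions, not the paper's implementation; the vector, the permutation count, k, the band size, and all function names are hypothetical.

```python
import random

def make_perms(dim, n, seed=0):
    """Build n fixed random permutations of the indices 0..dim-1."""
    rng = random.Random(seed)
    perms = []
    for _ in range(n):
        p = list(range(dim))
        rng.shuffle(p)
        perms.append(p)
    return perms

def wta_hash(vec, perms, k):
    """One code per permutation: the position of the largest of the first k permuted entries."""
    codes = []
    for perm in perms:
        window = [vec[i] for i in perm[:k]]
        codes.append(max(range(k), key=window.__getitem__))
    return codes

def bands(codes, band_size):
    """Group consecutive codes into tuples; each tuple can be used as a hash-table key."""
    return [tuple(codes[i:i + band_size]) for i in range(0, len(codes), band_size)]

def matching_bands(a, b, band_size):
    """Count the bands on which two code lists agree exactly."""
    return sum(x == y for x, y in zip(bands(a, band_size), bands(b, band_size)))

perms = make_perms(dim=8, n=12)
feature = [0.1, 0.9, 0.3, 0.7, 0.2, 0.8, 0.4, 0.6]   # hypothetical filter response
rescaled = [3 * x + 1 for x in feature]              # order-preserving change in values

codes_a = wta_hash(feature, perms, k=4)
codes_b = wta_hash(rescaled, perms, k=4)
print(matching_bands(codes_a, codes_b, band_size=4))  # every band matches
```

Because only the ordering of values matters, the rescaled vector lands in exactly the same buckets. A lookup would then score only the stored filters that share at least one band with the query, instead of computing every dot product.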
Re: (Score:1)
Re: Spatial Hashing (Score:1)
That's a small amount.
Seriously, what's with all the 1980s throwbacks on here today...
Re: (Score:2)
Re: (Score:2)
When you say small amounts, you mean 32GB.
Which according to newegg can be had for $229, sure it's not pocket change but if you're thinking of say a computer vision program for a car... that's a tree, that's a house, that's a dog, there's a child running around. I would imagine it's a lot easier to collect sensor data than to make sense of it in real time, if you can rapidly identify points of interest like facial recognition in photo cameras on steroids you can put processing power - and potentially directional sensors - to good use. For example,
Re: (Score:1)
Re: (Score:1)
But is it a breakthrough compared with the normal way to speed up a convolution, that is to compute it in Fourier space using a Fast Fourier Transform (FFT) or variants (DFT, DCT), etc.? I don't know the answer to this and would like a comparison as it is very relevant to my research work...
Re: (Score:1)
A paper written by Google is always considered more significant than real academic research, even if it ignores all prior art.
See MapReduce for example. What a grand innovation from Google.
Per object memory (Score:3)
BMW already scans for highway signs. (Score:2)
Nice (Score:3)
It would be nice if it could identify bird species (or other animals), preferably down to specific individual animals, like they already do with whales and penguins.
I'd gladly pay money for such a program instead of getting only a free version, where I can check if aunt Mary with a drink in hand is in any photo in my collection.
We have already been waiting for years to get a program that can identify bird songs after Shazam came out, no luck yet, but hey, after all many towns already have a program that tells them: Somebody shot somebody with a .45, 0.23 miles in that direction, so there is still hope.
Re: (Score:2)
Tag, You're It (Score:2)
You have been tagged at the laundromat
You have been tagged at the Quickie Mart
You have been tagged at work
You have been tagged at the gym
You have been tagged.
Everything tastes like Chicken. (Score:2)
It's fast, but the training set is random garbage from YT thumbnails and they have NO PROCESS to assess accuracy. All they can do is measure precision, and it's ~16% on average. What this means is their algorithm could very well just say FACE every single time and, by sheer coincidence, every sixth image in the dataset contains some face - tada, you just reached 16% precision.
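The arithmetic in that objection can be reproduced with a toy example. This sketch assumes a made-up six-image dataset in which one image contains a face; it illustrates the commenter's reasoning only and says nothing about how the paper actually evaluated its detectors. The labels and function names are hypothetical.

```python
def precision(predictions, labels, positive):
    """Fraction of items predicted as `positive` that really are `positive`."""
    hits = [label == positive for pred, label in zip(predictions, labels) if pred == positive]
    return sum(hits) / len(hits) if hits else 0.0

def recall(predictions, labels, positive):
    """Fraction of truly `positive` items that were predicted as `positive`."""
    found = [pred == positive for pred, label in zip(predictions, labels) if label == positive]
    return sum(found) / len(found) if found else 0.0

# Hypothetical dataset: one image in six contains a face.
labels = ["face", "car", "dog", "tree", "cat", "sign"]
always_face = ["face"] * len(labels)   # a "detector" that answers FACE every time

print(f"precision: {precision(always_face, labels, 'face'):.3f}")
print(f"recall:    {recall(always_face, labels, 'face'):.3f}")
```

A constant answer gets precision equal to the share of faces in the data, which is why a precision figure is hard to interpret without knowing how common each class is.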
I'll believe it when I see it... in the real world (Score:1)
Re: (Score:2)
It probably has been well tested "in the real world" - check out Google Goggles sometime (which is available for Android and iOS).
In fact, this probably came out of the stuf
This does not rule out deep neural networks (Score:2)
Bear in mind, this particular method is just a way to quickly do a large number of convolutions and get statistically fairly accurate results for the most activated convolution kernels.
This isn't incompatible with deep neural network models. This method can be combined with them and provide the same speedup there.
That's a lot of objects (Score:2)
However, how fast can it find Waldo?
Inane comment by article (Score:2)
Sounds like the Kinect? (Score:2)
To make the Kinect work (version 1.0) Microsoft gathered thousands upon thousands, possibly millions of data points, processed the images, checked the results etc. and after zillions of computations ended with digested data and some algorithms that use it, giving an accurate result in real time.
From reading the abstract I'm under the impression that Google basically did the same thing; it's trading computation for memory use. The "hashes" of what the camera sees match somehow with the digested data they ama