
Google's Latest Machine Vision Breakthrough 113

Posted by Soulskill
from the can-now-gauge-your-receptivity-to-ads-by-scanning-your-face dept.
mikejuk writes "Google Research recently released details of a Machine Vision technique which might bring high power visual recognition to simple desktops and even mobile computers. It claims to be able to recognize 100,000 different types of object within a photo in a few minutes — and there isn't a deep neural network mentioned. It is another example of the direct 'engineering' approach to implementing AI catching up with the biologically inspired techniques. This particular advance is based on converting the usual mask-based filters to a simpler ordinal computation and using hashing to avoid having to do the computation most of the time. The result of the change to the basic algorithm is a speed-up of around 20,000 times, which is astounding. The method was tested on 100,000 object detectors using over a million filters on multiple resolution scalings of the target image, which were all computed in less than 20 seconds using nothing but a single, multi-core machine with 20GB of RAM."
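The "ordinal computation" the summary mentions is winner-take-all (WTA) hashing: each hash code records only which element of a small randomly-permuted window is largest, so the codes depend on the rank order of filter responses rather than their magnitudes, and matching codes can be found by table lookup instead of computing every convolution. A minimal illustrative sketch (the dimensions and parameters here are made up for the example, not Google's actual values):

```python
import random

def wta_hash(vec, perms, k):
    """Winner-take-all hash: for each permutation, look at the first k
    permuted elements and record the index of the largest one.  The codes
    depend only on the rank order of the values, not their magnitudes
    (the 'ordinal' property the summary refers to)."""
    codes = []
    for perm in perms:
        window = [vec[i] for i in perm[:k]]
        codes.append(max(range(k), key=window.__getitem__))
    return codes

def similarity(a, b):
    """Fraction of matching codes; approximates rank correlation."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

random.seed(0)
dim, n_hashes, k = 16, 64, 4
perms = [random.sample(range(dim), dim) for _ in range(n_hashes)]

v = [random.random() for _ in range(dim)]
v_scaled = [3.0 * x + 1.0 for x in v]             # same rank order as v
v_noise = [random.random() for _ in range(dim)]   # unrelated vector

print(similarity(wta_hash(v, perms, k), wta_hash(v_scaled, perms, k)))  # 1.0
print(similarity(wta_hash(v, perms, k), wta_hash(v_noise, perms, k)))   # ~1/k in expectation
```

Because any monotone transform of the input leaves the rank order, and hence every code, unchanged, the first similarity is exactly 1.0; an unrelated vector matches each code with probability about 1/k. The speed-up comes from indexing the codes in hash tables so that most filter evaluations are skipped entirely.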
  • by sycodon (149926) on Wednesday July 24, 2013 @01:21AM (#44368041)

    Can it sort and identify duplicates automagically in my porn collection?

  • by Roshan Halappanavar (2994663) on Wednesday July 24, 2013 @01:27AM (#44368055)
    -"... using nothing but a single, multi-core machine with 20GB of RAM" Phew.. here i was thinking it'd need some unrealisticalll high specs from my PC!!
    • My current iMac has 32GB of RAM, so I don't see it as being too far fetched.
      • by Anonymous Coward

        did it cost an ARM to buy?

      • by Clsid (564627)

        Damn, I bought a MacBook last year and it had only 4 GB of RAM. I don't know what universe you guys live in, but it sure is greener on that side :)

        • but it sure is greener on that side :)

          Bluer. The future is luminescent blue.

          (For about another 5 years. Then it will be the burnt orange vinyl of our generation.)

          • The future will be orange and teal according to the movies. Then again, so is the present and the past.

    • by Anonymous Coward

      This isn't 2005, 32GB in a workstation costs peanuts nowadays. Come out from under your rock.

      • Re: (Score:3, Insightful)

        by Anonymous Coward

        This isn't 2005, 32GB in a workstation costs peanuts nowadays. Come out from under your rock.

        Cashews, maybe, but not peanuts.

        • Yeah, and the unbroken ones... not the "cashew pieces" you can get for a lot cheaper. Still, I recall the excitement when RAM went under $100 a megabyte.

          • by mcswell (1102107)

            One-up: My first computer (ok, not counting my Vic-20) had 640k of RAM, when 128k was standard.

      • by gl4ss (559668)

        This isn't 2005, 32GB in a workstation costs peanuts nowadays. Come out from under your rock.

        a workstation is hardly "even a mobile computer".

        now to more important things, is this algo good enough for sorting trash?

        • by quadrox (1174915)
          My laptop came equipped with 8 GB RAM - I upgraded it to 32 GB after finding out just how cheap RAM has become nowadays. I must admit to rarely being able to come anywhere near filling this up, but that is a good thing in itself. I still have some ~200 megapixel panoramas I need to stitch with hugin, so maybe it will come in handy then.
        • by mcgrew (92797) *

          I'd have to buy or build a new computer. Neither of the two old towers will take more than a gig, and I'm not going to add any to this notebook, too much of a PITA.

          All the old computers do what I need them to so it'll be a while before that happens. I guess if I needed this functionality I'd have to spend a few hundred bucks.

          Speaking of it doing what I need, I guess I should get back to work on that book (yep, that's why I haven't been here much lately).

    • by Anonymous Coward

      My rule of thumb for building a new PC since the 90s has always been: $100 CPU, $100 RAM, $100 HDD, plus incidentals = $400-500 depending on what you need.

      RAM prices are up since I bought my 16GB for $80 this winter, but you can still build a very nice PC with 20GB of RAM for under $500.

    • My laptop has 16GB. More than 20GB is not unusual in a workstation today.

    • It's irrelevant; Google is all about cloud computing. All the gruntwork would be happening elsewhere, on a heavily compressed/uploaded copy of your data, much like their speech recognition on even the latest smartphones.

      • by mcgrew (92797) *

        Google is all about cloud computing

        I guess I won't need to build a new PC after all, then, because Google and the NSA already get too much data from me, the creepy fucks. Considering some of the things I google for I'm probably on some list already.

        Let me know when I can do this without "the cloud." I don't like not having control over my own data and processes.

    • and even mobile computers

      Apparently, the stuff that Google hands out to I/O attendees is really worth the money!

    • by Anonymous Coward

      My laptop has 8 cores and 16gb ram

      My 11" netbook has 2 cores and 4gb ram

      What are you using? An Abacus?

    • by hrvatska (790627)

      -"... using nothing but a single, multi-core machine with 20GB of RAM" Phew.. here i was thinking it'd need some unrealisticalll high specs from my PC!!

      My Thinkpad W530 has 32GB of RAM. Maybe you need a new PC.

    • At work I have two machines with 32 cores and 256 GB of RAM (one Linux, one Windows) for regression testing, a 32-core, 24 GB machine for development, and a 16-core, 16 GB machine for paperwork, like emails, Rally, presentations etc. The spec is actually on the low end for a professional. Heck, we order the most powerful graphics cards on headless workstations without displays (to do massively parallel computations).
    • Dell sells a computer with 32 GB of RAM starting at $2000, and Apple at $2400. Both are desktops, but once it reaches manufacturers like these, it's pretty commonplace. Apple even has a 64 GB version available.

      As for mobile devices, the HTC One has 2 GB, so using Moore's Law we are looking at another 15-20 years for handhelds to have 20 GB of RAM. That depends on whether you prefer David House's 18-month assertion or the more accurate two-year approach, but according to the International Technology Ro
    • by cStyled (1328751)
      My 2-year-old computer has 16 GB, and it's a laptop.
    • by jon3k (691256)
      I just built a PC with 16GB, the RAM itself only cost about $140. The whole build was under $1k. My first PC was around $2,400. Not sure how 20GB of RAM is unrealistic?
  • "...might bring high power visual recognition to simple desktops and even mobile computers... computed in less than 20 seconds using nothing but a single, multi-core machine with 20GB of RAM."

    Right... and by mobile computers you mean computers that I can lug from one desk to another.

    • Re:Coming to mobile? (Score:5, Informative)

      by real-modo (1460457) on Wednesday July 24, 2013 @02:12AM (#44368147)

      Wait, your phone can decode video?!? In real time, playing the movies at normal speed? How many kilograms does it weigh, and how long is the power lead? How big is the mortgage on it? (/socraticmethod)

      The computer innovation process broadly goes like this: first algorithm sort-of works but is incredibly inefficient - tweaks on this - a rethinking of the whole approach that leads to massive speed-ups - further refinement - implementation of the algorithm in hardware, where it becomes just another specialized processor - everybody profits!

      This article is about the third, or possibly fourth, phase of the process. If it works out, phase 5 is straightforward. By itself, phase 5 typically leads to a two orders of magnitude increase in performance, a three orders of magnitude decrease in power consumption, and a two to four orders of magnitude decrease in cost.

      Phases 6 and 7 happen if and when enough people find the provided service useful. (If technologies are no good, that's when only rich people have them. Successful technologies, everyone gets access to eventually.)

      • Argh! There is no phase seven. Buffer overflow error.

      • If technologies are no good, that's when only rich people have them. Successful technologies, everyone gets access to eventually

        That seems like question begging. The popularity of a technology defines its success, not the other way around.

      • by faffod (905810)
        I quoted the summary and pointed out that it is ridiculous to expect this tech to be coming to mobile anytime soon. I pointed out that the only "mobile" I can see this working on is something I would have to lug around. So, to use your Socratic method: why are you asking if my phone can decode video in real time, and why were you asking about weight and power, when I was specifically saying that this is not anything close to a mobile configuration?

        One day this tech will be able to run on
      • by loufoque (1400831)

        Surely you realize the video decoding on phones is done with dedicated hardware.
        You could do it on the CPU though, the latest models (Galaxy S4 and all) should be powerful enough.

    • by cnettel (836611)

      "...might bring high power visual recognition to simple desktops and even mobile computers... computed in less than 20 seconds using nothing but a single, multi-core machine with 20GB of RAM."

      Right... and by mobile computers you mean computers that I can lug from one desk to another.

      Like the MacBook Pro Retina with 16 GB? The point of their approach seems to be lots and lots of RAM to do table lookups. The memory subsystem in a normal laptop is plenty fast for that. Bandwidth would be more of a problem than total space in a cellphone. If we had a compelling case for loads of RAM in a smartphone, it would be possible to design one without going wildly beyond current power or cost envelopes. A few more years of Moore and things will be fine.

      • by faffod (905810) on Wednesday July 24, 2013 @08:30AM (#44369263)
        Current mobile seems to cap out at 2MB of RAM. There is a reason for this - power consumption. RAM requires a continuous trickle of power to maintain state. An increase in RAM leads to a direct increase in power consumption. Mobile improvements are going to be focused on power consumption rather than raw power. Moore's law will be followed, but it will not result in something that is 2x more RAM, it will result in something that is 2x less power drain. Ok, I will grant you that it will probably be a mix - some increase in RAM, some increase in computation, but a significant increase in battery life.

        To go from 2GB to 30GB following Moore's law would take 8 years. I contend that it will take longer than that because we won't see exact doubling of specs due to improvements in power. Either way, 10 years is far enough out that I think the summary claiming that this will come to mobile is far fetched for now.
        • Current mobile seems to cap out at 2MB of RAM.

          Surely you meant to say 2GB of RAM. My Galaxy S3 has 1GB of RAM. Hell, even my POS work-issued BlackBerry 9310 has 512MB.
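The parent comment's Moore's-law arithmetic (2 GB to 30 GB at a two-year doubling period) can be checked quickly; this is just the ballpark calculation from the thread, not a prediction:

```python
import math

current_gb, target_gb = 2, 30
doubling_years = 2  # the "more accurate" two-year figure mentioned above

# Number of capacity doublings needed, then total elapsed time.
doublings = math.log2(target_gb / current_gb)   # ~3.9 doublings
print(round(doublings * doubling_years))        # 8 (years)
```

Using David House's 18-month figure instead (doubling_years = 1.5) gives roughly 6 years.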

  • Yeah, well (Score:3, Funny)

    by Anonymous Coward on Wednesday July 24, 2013 @01:58AM (#44368127)
    my cat can spot a Dentabite bag from across the room in 20 milliseconds, does that mean my cat has 20TB of RAM?
    • by pspahn (1175617)
      And on the same note, a dog can predict where a ball is going to be when you bounce it off a wall, but that doesn't qualify it to go around processing physics simulations.
  • ... or Sarah Connor for that matter?
    • by lxs (131946)

      No but it can spy on you day and night.

      • by pspahn (1175617) on Wednesday July 24, 2013 @03:03AM (#44368259)

        Some years ago, I had an idea for a tool that would, in a nutshell, identify a plant simply from a photo and some metadata (time of year, geolocation, etc.). I know how it would work (and it would work), but I came to the conclusion that someone (i.e., Google) would use the methods to develop a tool that would do the same thing but for human faces.

        It was at that point I decided to leave that box closed.

        • Re: (Score:3, Informative)

          by Anonymous Coward

          There are several none-too-creepy apps that can identify plant species from a smartphone photo of a single leaf.

          http://leafsnap.com/about/

          They seem to request metadata directly via your phone's location and time-of-request (their server, not your phone, does the pattern-matching). Which is convenient, although it may place you at a time and place you may rather not be placed, for instance if burying pirate gold under a particular tree.

        • by AmiMoJo (196126) *

          You should develop it. Google will do it eventually, and it would be better for all of us if we had the same tech to balance the power a little.

  • Captchas be gone? (Score:3, Interesting)

    by Suhas (232056) on Wednesday July 24, 2013 @02:03AM (#44368137)
    So Captchas will become even easier to crack? Great, the sooner we can get rid of them, the better. As it is, they are getting impossible for humans to read, thanks to idiots who don't know how to design them.
    • by Anonymous Coward on Wednesday July 24, 2013 @03:45AM (#44368357)

      So Captchas will become even easier to crack? Great, the sooner we can get rid of them, the better. As it is, they are getting impossible for humans to read, thanks to idiots who don't know how to design them.

      But there's no need to get rid of them if we'll all have a handy browser plugin that can decode them for us at the press of a button!

    • by Anonymous Coward

      There will be new types of captchas. I just love trying to prove to a computer I'm a human and failing.

      • by Anonymous Coward

        There will be new types of captchas. I just love trying to prove to a computer I'm a human and failing.

        You mean, like this? [xkcd.com]

  • by gshegosh (1587463) on Wednesday July 24, 2013 @06:38AM (#44368797)
    20GB per 100,000 objects is about 209kB per object. I don't know what resolution each image was, but I think ~200kB is quite small.
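The per-object figure in the comment above checks out, assuming binary gigabytes:

```python
bytes_total = 20 * 2**30              # 20 GB of RAM, binary units
per_object_kib = bytes_total / 100_000 / 1024

# Memory budget per object detector, in KiB.
print(round(per_object_kib, 1))       # 209.7
```

Note this budget covers the detector's filters and hash tables, not stored images.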
  • BMW has a forward-facing camera under the rear-view mirror that scans for highway signs - posted speed limits and no-passing signs - and displays them on the dash. I'm not sure whether it comes on the basic car or you have to buy some advanced tech package for it.
  • by nospam007 (722110) * on Wednesday July 24, 2013 @08:04AM (#44369135)

    It would be nice if it could identify bird species (or other animals), preferably down to specific individual animals, like they already do with whales and penguins.
    I'd gladly pay money for such a program, instead of getting only a free version where I can check if aunt Mary with a drink in hand is in any photo in my collection.
    We have already been waiting years for a program that can identify bird songs ever since Shazam came out - no luck yet. But hey, many towns already have a program that tells them "somebody shot somebody with a .45, 0.23 miles in that direction," so there is still hope.

  • You have been tagged at the ATM
    You have been tagged at the laundromat
    You have been tagged at the Quickie Mart
    You have been tagged at work
    You have been tagged at the gym
    You have been tagged.
  • It's fast, but the training set is random garbage from YouTube thumbnails and they have NO PROCESS to assess accuracy. All they can do is measure precision, and it's ~16% on average. What this means is that their algorithm could very well just say FACE every single time, and if by sheer coincidence every sixth image in the dataset contains some face - tada, you just reached 16% precision.

  • Being a software engineer myself, I understand the sense of excitement and accomplishment after completing internal testing. But as with many projects, as soon as this leaves the controlled "lab testing" environment it's a whole different ball game. Until then it's still a white-paper product, and I'd suggest remaining cautiously optimistic...
    • by tlhIngan (30335)

      Being a software engineer myself, I understand the sense of excitement and accomplishment after completing internal testing. But as with many projects, as soon as this leaves the controlled "lab testing" environment it's a whole different ball game. Until then it's still a white-paper product, and I'd suggest remaining cautiously optimistic...

      It probably has been well tested "in the real world" - check out Google Goggles sometime (which is available for Android and iOS).

      In fact, this probably came out of the stuf

  • Bear in mind, this particular method is just a way to quickly do a large number of convolutions and get statistically fairly accurate results for the most activated convolution kernels.

    This isn't incompatible with deep neural network models. This method can be combined with them and provide the same speedup there.

  • I'm sure it would take me more than a few minutes to identify that many objects.

    However, how fast can it find Waldo?
  • One of the cited articles says

    What is clear is that it is never safe to write off an approach in AI.

    ... which is an example of a (sadly) standard over-generalization often used by writers to sound important in a closing paragraph. If the approach used by Google was being "written off" by someone somewhere, what we have here is one example of someone being wrong about something. Claiming, as the article does, that it's therefore impossible to judge any AI technique as a dead end is unsupported and in no w

  • To make the Kinect (version 1.0) work, Microsoft gathered thousands upon thousands, possibly millions, of data points, processed the images, checked the results, etc., and after zillions of computations ended up with digested data and some algorithms that use it, giving an accurate result in real time.

    From reading the abstract, I'm under the impression that Google basically did the same thing; it's trading computation for memory use. The "hashes" of what the camera sees match somehow with the digested data they ama
