An Advance In Image Recognition Software

Posted by kdawson on Sat May 24, 2008 07:45 PM
from the needle-in-a-haystack dept.
Roland Piquepaille alerts us to work by US and Israeli researchers who have developed software that can identify the subject of an image characterized using only 256 to 1024 bits of data. The researchers said this "could lead to great advances in the automated identification of online images and, ultimately, provide a basis for computers to see like humans do." As an example, they've picked up about 13 million images from the Web and stored them in a searchable database of just 600 MB, making it possible to search for similar pictures through millions of images in less than a second on a typical PC. The lead researcher, MIT's Antonio Torralba, will be presenting the research next month at a conference on Computer Vision and Pattern Recognition.
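As a rough sanity check on the figures in the summary, the sketch below works out the average number of bits available per image if the entire 600 MB database were spent on the codes alone. That is an assumption for illustration: the summary does not say how the database is laid out, and the function name `bits_per_image` is hypothetical.

```python
def bits_per_image(total_bytes: int, image_count: int) -> float:
    """Average bits available per image if all storage went to the codes."""
    return total_bytes * 8 / image_count

IMAGE_COUNT = 13_000_000  # "about 13 million images" from the summary

# "600 MB" is ambiguous, so try both common readings of a megabyte.
decimal_mb = bits_per_image(600 * 10**6, IMAGE_COUNT)  # 1 MB = 10^6 bytes
binary_mb = bits_per_image(600 * 2**20, IMAGE_COUNT)   # 1 MB = 2^20 bytes

print(f"decimal megabytes: {decimal_mb:.1f} bits per image")
print(f"binary megabytes:  {binary_mb:.1f} bits per image")
```

Under either reading, the average falls inside the 256 to 1024 bit range quoted in the summary.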
This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • by HeavensBlade23 (946140) on Saturday May 24 2008, @07:48PM (#23532724)
    This will be used to break CAPTCHA-type schemes even worse than they already are.
    • I had thought it would be more along the lines of categorising pictures into 'PORN' and 'NOT PORN'. Or possibly even 'PORN', 'GAY PORN', 'SHEEP PORN' and 'NOT PORN' if it's really advanced.
    • by Jeremiah Cornelius (137) * on Saturday May 24 2008, @07:57PM (#23532792) Homepage Journal
      This will be used to identify YOU, citizen.
    • by elnico (1290430) on Saturday May 24 2008, @08:22PM (#23532926)
      Incorrect. First of all, in a CAPTCHA, you're trying to very rigorously inspect a single image. This advance seems to be more about taking quick glances at lots of images. Furthermore, in the article, they talk about recognizing flowers and cars. The fact is, computers already have no problem recognizing letters and numbers in images. We got that down a long time ago. The difficult things about reading a CAPTCHA image are removing distortion and splitting the whole image into the component characters. If you read the article, you'd see that this research has nothing to do with that.
      • Yes, but as you say, letters and numbers are getting trivial, so we will need captchas outside of that. "Click the pictures with people out of the following set of 20 pictures" could have been a useful method.
  • thank god, now i can get some assistance when i'm taking one of those "real tits or fake tits" online quizzes. fapfapafap.
  • by ZxCv (6138) on Saturday May 24 2008, @07:55PM (#23532786) Homepage
    ...I'll believe it when I see it.

    Until then, it's snake oil, as far as I'm concerned.
  • .... the answer is always "a picture"
  • I hate reading press releases instead of reading papers with real explanations of what's going on.

    I just finished reading "Small Codes and Large Image Databases for Recognition" written by the guy. All he did was implement Geoff Hinton's idea of databasing images with binarized coefficients produced by Restricted Boltzmann Machines.

    Hinton himself gave a talk on it for Google here:
    http://www.youtube.com/watch?v=AyzOUbkUf3M [youtube.com]

    Actually I'm wondering, is he plagiarizing Hinton?
    • Re: (Score:3, Insightful)

      I think plagiarizing is a strong word to throw around. And particular implementations of general approaches can often be very interesting when one considers what tradeoffs are made transferring pure theory to practical applications. If this sort of thing were attempted in the 90's, they'd probably arbitrarily pick a few hundred features by hand and KL-transform it down to the most significant dimensions and hash those into one of these codes. Since I've been out of "the biz" for awhile, it's pretty inter
    • by Hays (409837) on Saturday May 24 2008, @10:17PM (#23533305)
      Geoff Hinton worked with them; you really think they're plagiarizing him? That claim doesn't even make sense. This is a novel research domain. A big part of science is taking people's ideas, reproducing them, and applying them to novel domains. That's how it's SUPPOSED to work.

      This research involves the use of one of the largest image databases seen in computer vision. It shows that you can do extremely rapid scene matching for databases of this scale. No, that's not obvious no matter what you think. This image data is fairly high dimensional.

      This research says something about the space of likely scenes and it might be a key enabling technology to a lot of the heavily data driven computer vision and computer graphics approaches popping up lately.

      • My mistake...I just saw the video where Hinton was suggesting to Google, tongue-in-cheek, to use his RBM bottleneck trick as a method of databasing and then to see this guy's paper mentioning the very same thing a few weeks later.

        It was a skim to see what the hell the article was really about, didn't know these two were connected. I jumped the gun 'cause I got burned by a plagiarizer in the past, sorry.
  • They're going to distinguish an individual based on images with 256 to 1024 bits of data?

    I guess nobody there thought to do the math before making these claims. This story probably shouldn't have made it to the front page; it's less than useful.

    • Re: (Score:3, Insightful)

      "They're going to distinguish an individual based on images with 256 to 1024 bits of data?"

      No one said they were going to identify individual people with this. The main gist of this research seems to be efficiency (in both space and time, if I read it correctly). For instance, if one wanted to identify every face in a picture of a crowd, they could apply this algorithm to a low-res version of the image to quickly find the locations of every "face," and then use a more advanced face recognition algorithm t
    • 2^1024 is 1.797693134862315907729305190789e+308. I think you could do a lot with a number that big.
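The size of these numbers is easy to check with Python's arbitrary-precision integers. This is a minimal sketch; the helper `sci` is a hypothetical name, and it truncates rather than rounds the leading digits. Plain float formatting is avoided on purpose because 2**1024 is just past the largest value a double can hold.

```python
def sci(n: int, digits: int = 4) -> str:
    """Scientific notation for a positive integer, truncated to `digits` leading digits."""
    s = str(n)
    return f"{s[0]}.{s[1:digits]}e+{len(s) - 1}"

for bits in (256, 1024):
    # A code of `bits` bits can take 2**bits distinct values.
    print(f"{bits} bits -> about {sci(2 ** bits)} distinct codes")
```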
  • This is not so much an "advance", and more a demonstration that some image recognition problems can be solved with fairly simple, well known methods and a lot of data.
  • all the Obama halloween masks setting off false positives from sea to shining sea.
  • a fake ufo picture and a real one?

    How will spammers make use of this? Well just make that viagra pill be reflected in a coke bottle.

    anyone for a random bit generator to see what random results get labeled?

    Wonder what fractals might produce?

    moral of the question: we can always break what we make.

     
  • by Bayoudegradeable (1003768) on Saturday May 24 2008, @08:23PM (#23532936)
    Oh my, the soon to be most searched "name" on the web is... Jenna Jameson! Wait a minute, I think I misunderstood "facial" recognition...
  • if they grabbed 13 million images from the net, there's a good chance that many of them are copyrighted. if they are using those copyrighted images in their (presumably FOR SALE) software, wouldn't that require some serious licensing fees, even if it's an internal-you-never-see-the-pictures usage, since it's a part of their algorithm, or what-have-you?

    for the record, i say this as a concerned/curious artist who isn't looking for a payout.

  • Given this works under real world conditions, it would make it possible for every shop to get a list of faces of criminals or other people which "you don't want" in your shop, recognize them in real time with only a small investment, and throw them out. Or the possibility to track a person on traffic surveillance cameras. That is pretty freaky. Politicians, please make laws which restrict such databases. I don't like the idea of being escorted out of the shopping mall because of my credit rating. Or that so
  • This actually solves a problem I've been stumped on for a while. I need a way to search for similar images such that images that are similar have a searchable value with an inherent "nearness" quality.

    That is, there are a number of image similarity algorithms, but the computed values of two similar images are not necessarily mathematically near to each other. This algorithm produces values that are, which can make searching for similar images among very many images, quite fast.
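The "nearness" property described above can be sketched with Hamming distance, the number of bit positions in which two binary codes differ. This is only an illustration under stated assumptions: the 16-bit codes and image names are made up, and a linear scan stands in for whatever index the researchers actually use.

```python
def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two codes differ."""
    return bin(a ^ b).count("1")

def nearest(query: int, codes: dict, k: int = 3) -> list:
    """Brute-force scan: the k (distance, name) pairs closest to `query`."""
    return sorted((hamming(query, code), name) for name, code in codes.items())[:k]

# Hypothetical 16-bit codes; the article describes 256 to 1024 bits per image.
database = {
    "beach_1": 0b1111000011110000,
    "beach_2": 0b1111000011110001,
    "street":  0b0000111100001111,
    "forest":  0b1010101010101010,
}

query = 0b1111000011110011
result = nearest(query, database, k=2)
print(result)
```

Because each comparison is one XOR and a bit count, even a plain scan over millions of short codes stays cheap, which is consistent with the sub-second search time claimed in the summary.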
  • I hope they didn't use any of my pictures, because I sure as hell didn't give them permission to use my images for something like that, and I clearly state on my website that my own work is my property, for my use only.

    I don't care about most uses of my images (go ahead and PS a penis in my mouth), but I would fight it if I found out they were used for this.
  • Even if this was perfectly efficient, I'm pretty sure there's more than 256 * 1024 things you could have an image of out there. The amount of information this analysis could give just can't be very useful.

    That's not meant to disparage the work - image recognition is important and difficult. This particular 'advance' just isn't that 'advancing'.
    • You might have misread the numbers. They said 256 'bits'. 256 bits can distinguish roughly 1.1 * 10^77 states. That's a LOT. 1024 bits can distinguish roughly 1.8 * 10^308 states. I don't think there are that many atoms in the universe. Jere
        • It still isn't big enough to do even a half decent job.

          Imagine that a bit has a coherent meaning, such as "image contains a kitten." If the bit is zero, there is no kitten in the image; if it is one, there is at least one kitten in the image.

          Now imagine that system extended for all types of animals. Going to go past a requirement for 1024 bits pretty fast, isn't it?

          Now imagine that a bit represents "image contains a keyboard" in the same fashion.

          Now imagine a bit for every type of macro and mic

            • You should imagine it because you are conflating the number of bits required to count things with the number of bits required to discriminate among things. The two are entirely disjoint.

  • But then again, rotating an image 90 degrees might defeat the whole system. Scientists are so busy being sophisticated and racing to write journal articles that they often miss the obvious.
    • Re: (Score:3, Insightful)

      Any decent object recognition algorithm supports at least affine transformations, which include rotation.

      Some of those scientists are actually pretty smrt.
      • Any decent object recognition algorithm supports at least affine transformations, which include rotation.
        Which always made me wonder, how do they go about doing that? Do they perform cross-correlation for every variant of each supported affine transform? Or is it something completely different?
        • by ceoyoyo (59147) on Sunday May 25 2008, @02:58PM (#23537837)
          There are all kinds of ways, but two simple ones come to mind. If you convert to a polar coordinate system, the power spectrum is conveniently orientation independent. You can use the same trick for a shift: the power spectrum computed in Cartesian coordinates is shift independent.

          Another way is to somehow identify the orientation. A simple way to do that is to find the axis along which there's maximum variation and rotate until those axes match in both images.

          Pixel by pixel co-registration basically does look at a similarity measure for a lot of variations on the affine transform. You generally don't have to look at them all though: you use an iterative algorithm with a clever optimization strategy so your transform gets better and better instead of searching through the parameter space randomly.
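The shift-independence claim can be checked on a small 1-D signal. This is a minimal sketch assuming a circular shift (for real images with borders the property only holds approximately); the naive DFT and both function names are written here for illustration and are not taken from any library.

```python
import cmath

def power_spectrum(signal):
    """Squared magnitude of each DFT coefficient (naive O(n^2) transform)."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(signal))) ** 2
        for k in range(n)
    ]

def circular_shift(signal, s):
    """Rotate the samples to the right by s positions, wrapping around."""
    s %= len(signal)
    return signal[-s:] + signal[:-s]

signal = [0.0, 1.0, 4.0, 2.0, 0.0, 0.0, 3.0, 1.0]
shifted = circular_shift(signal, 3)

original_ps = power_spectrum(signal)
shifted_ps = power_spectrum(shifted)

# A shift changes only the phase of each coefficient, so the magnitudes agree.
max_diff = max(abs(a - b) for a, b in zip(original_ps, shifted_ps))
print(f"largest difference between the two spectra: {max_diff:.2e}")
```

The same idea carries over to rotation: after resampling an image into polar coordinates, a rotation becomes a shift along the angle axis, so the spectrum along that axis is unchanged.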
  • search for similar pictures through millions of images in less than a second on a typical PC.

    Of course that typical PC is a dual quad-core machine running at 3GHz with 8GB of memory, GPU X3 running offloading co-processing software, and 1TB of hard drive space.

  • Check out libpuzzle: http://libpuzzle.pureftpd.org/project/libpuzzle [pureftpd.org]

    It's also designed to quickly find similar images, even out of millions of images. The documentation describes a possible indexing technique (as suggested in the original paper):
    http://download.pureftpd.org/pub/pure-ftpd/misc/libpuzzle/doc/README [pureftpd.org]

    Images are stored as 544-bit signatures by default.
    • Kind of like how the brain uses heuristics and previous knowledge to recognize images instead of actually solving the problem (whatever that means)?
    • Yeah, what a trivial test environment, their "All pictures on Flickr" database. That's so narrow.
    • Well, no. As Hays pointed out they have a really huge database of images, plus by limiting severely how many bits it takes to represent an image they've made it possible to do the search in real time, all in RAM, with a modest investment in computing power. In addition, it's not simply pixel matching. The images are arranged according to a semantic tree of English nouns so, if I'm interpreting it correctly, you will end up with images next to each other on the map that are also semantically similar. Thi