AI Facebook Technology

Facebook Thinks Occlusion Is the Next Great Frontier For Image Recognition

An anonymous reader writes: Researchers at Facebook AI Research (FAIR) have published a paper contending that image recognition research is now advanced enough to consider the problem of occlusion, wherein the objects AI must identify are either partially cropped or partially hidden. Their solution is the predictably labor-expensive route of human annotation of existing image-set databases, in this case 'finishing off' occluded objects with vector outlines and assigning them a z-order. This article looks at the practical and even philosophical problems of getting IR algorithms to 'guess' objects usefully, and asks whether practical IR research might not be currently limited both by the use of over-specific image datasets and — in the field of neural networks — by problems of theory and limited 'local' processing power in critical real-time situations.

  • If a series of images is available and observer or target or intermediate objects are moving, occlusion will vary image to image and the nature of the delta portions should be highly informative for recognition. This requires an object/region re-identification subsystem.

    Also, scene context statistics should be used, much as preceding utterances are used in speech recognition. Given that we've already recognized a situation type with this that and the other object-type in it in this (possibly dynamic) relati

    • Doesn't matter how much research they do, this kind of vision will only work as part of a Strong AI. The keyword is 'dynamic processing', and it's pretty difficult even with Strong AI. I know, it's a field I have worked on directly.

  • So if regular object recognition is such a solved problem, why do they need people to manually prepare the images? I'd just take a normal image, recognize the objects, and then partially cover some of them to train their algorithm.

    • That sure makes sense. Don't tell them, though. The inability of image recognition software to handle cropped pictures is one thing which my better replacement for CAPTCHA depends on. CAPTCHA sucks because humans aren't much better than computers at recognizing squiggly letters. We are, however, MUCH better at recognizing certain specific types of images when they are cropped and rotated.

    • Yes. One can synthetically create cropped images to train CNNs. Then if you recognize "person standing" in the left side of an image and "front end of commercially relevant automobile" in the right side of the image you can likely expect that this is a person standing in front of the automobile, unless the template for junkyard is also signalling recognition. Then you zero in on which of your friends is standing there, and try to get that friend to recommend to you that you need a new car just like his. Alm
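
The synthetic-occlusion idea in this sub-thread can be sketched in a few lines. This is only a minimal illustration, not FAIR's method: the function name `occlude`, the list-of-rows image format, and the flat rectangular patch are all assumptions made for the example. The point is that the original bounding box and label stay known, so the covered copy can serve as a training sample for free.

```python
import random

def occlude(image, box, fraction=0.5, fill=0, rng=None):
    """Return (occluded copy, patch box) with part of `box` hidden by a flat patch.

    image: list of rows, each a list of pixel values (hypothetical format).
    box: (top, left, bottom, right) of an already recognized object,
         with bottom/right exclusive.
    fraction: share of the box width to hide.
    """
    rng = rng or random.Random()
    top, left, bottom, right = box
    cover = max(1, int((right - left) * fraction))
    # Hide either the left or the right side of the object, chosen at random.
    start = left if rng.random() < 0.5 else right - cover
    out = [row[:] for row in image]  # copy so the source image stays intact
    for y in range(top, bottom):
        for x in range(start, start + cover):
            out[y][x] = fill
    return out, (top, start, bottom, start + cover)

# Toy 4x6 "image" of ones with one object occupying rows 1-2, columns 1-4.
image = [[1] * 6 for _ in range(4)]
occluded, patch = occlude(image, (1, 1, 3, 5), fraction=0.5, rng=random.Random(0))
```

A real pipeline would paste other objects or textures rather than a flat fill, since a network can otherwise learn to spot the patch itself.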

  • The use of vector completion and all is a good idea, but it seems systems like that would work better in conjunction with other techniques, like considering the context of the area you are in. What is behind a tall narrow object varies a lot depending on whether you are in a jungle or a parking garage...
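
The jungle versus parking garage point amounts to weighting a detector's guesses by a scene prior. A rough sketch follows, assuming a naive multiply-and-normalize rule; the labels, scores, and priors are made-up numbers for illustration, and `rescore` and `best_label` are hypothetical names.

```python
def rescore(detector_scores, context_prior):
    """Weight each label's detector score by a scene prior, then normalize."""
    weighted = {label: score * context_prior.get(label, 0.0)
                for label, score in detector_scores.items()}
    total = sum(weighted.values())
    if total == 0:
        # The prior rules out every label; fall back to the raw scores.
        return dict(detector_scores)
    return {label: w / total for label, w in weighted.items()}

# Made-up numbers: the detector alone cannot tell a trunk from a pillar.
SCORES = {"tree trunk": 0.45, "concrete pillar": 0.45, "lamp post": 0.10}
PRIORS = {
    "jungle": {"tree trunk": 0.90, "concrete pillar": 0.02, "lamp post": 0.08},
    "parking garage": {"tree trunk": 0.02, "concrete pillar": 0.90, "lamp post": 0.08},
}

def best_label(scene):
    """Pick the most probable label for the occluding object in a given scene."""
    posterior = rescore(SCORES, PRIORS[scene])
    return max(posterior, key=posterior.get)
```

With these numbers the same ambiguous detection resolves to "tree trunk" in the jungle and "concrete pillar" in the garage, which is the behavior the comment asks for.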

  • by LMariachi ( 86077 ) on Monday September 07, 2015 @07:20PM (#50474755) Journal

    Pull back. Wait a minute. Go right. Stop.
    Enhance 57 to 19. Track 45 left. Stop.
    Enhance 15 to 23.
    Gimme a hard copy right there.

    • Not only did Deckard have access to a plainly ridiculous level of zoom, but when panning around, the perspective of the image changes, and objects that were hidden from the original perspective appear. We're left having to assume that the "enhance" operation can do wonders on an old snapshot, or that it's some kind of snapshot from a holographic Polaroid. It would sure make image occlusion an easier problem to solve.

      • I figured the device was "looking around the corner" by extrapolating from visible reflections. A human can easily do that given a properly-placed mirror, even a curved or broken one, but a computer might be able to piece it together from distorted fragments around the room — a shiny doorknob here, a beercan there, a metallic light fixture up above. Sort of reverse raytracing?

      • by e5150 ( 938030 )
        Well, that's nothing compared to uncrop.
  • by Tablizer ( 95088 )

    What about a kind of genetic algorithm to evolve candidate 3D models, where the model that best matches observations and context "wins"? That is computationally intensive, but it is highly parallelizable.
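
A toy version of this idea, scaled down from 3D models to a 2D circle so it fits in a few lines: candidates are `(cx, cy, r)` triples, only the upper half of the true circle is "visible", and the candidate whose outline best matches the visible points wins. The fitness function, population size, and mutation scale are all assumptions chosen for the sketch, not anything from the paper.

```python
import math
import random

def fitness(model, points):
    """Negative mean distance from the observed points to the circle's edge."""
    cx, cy, r = model
    return -sum(abs(math.hypot(x - cx, y - cy) - r) for x, y in points) / len(points)

def evolve(points, generations=200, pop_size=60, sigma=0.5, seed=0):
    """Evolve circle candidates; the best quarter survives each generation."""
    rng = random.Random(seed)
    pop = [(rng.uniform(-10, 10), rng.uniform(-10, 10), rng.uniform(0.5, 10))
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(m, points), reverse=True)
        parents = pop[:pop_size // 4]  # selection with elitism
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Crossover: take each gene from one parent, then add Gaussian noise.
            child = tuple(rng.choice(pair) + rng.gauss(0, sigma) for pair in zip(a, b))
            children.append((child[0], child[1], max(child[2], 0.1)))
        pop = parents + children
    return max(pop, key=lambda m: fitness(m, points))

# Observations: the upper half of a circle centred at (2, -1) with radius 3.
visible = [(2 + 3 * math.cos(t), -1 + 3 * math.sin(t))
           for t in (i * math.pi / 20 for i in range(21))]
best = evolve(visible)
```

Each candidate's fitness is computed independently of the others, which is where the parallelism mentioned above would come from.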

  • In fact, this won't stop at merely recognising faces that are partially obscured - in the not so distant future, they will be able to recognise faces that are completely absent!
