Graphics Software

Cheap 3D Computer Vision?

Posted by michael
from the no-glasses-required dept.
InspectorPraline writes "According to this article at the New York Times [free reg req'd], a tech firm known as Tyzx is developing optics technology that will have three-dimensional capability -- using two cameras attached by a high-bandwidth connection to a custom processing card inside a PC. The article makes one believe that the system would have a top speed of as much as 132 stereo frames per second, which could be very useful in security systems. Of course, the real question is who's behind the cameras, but we can all drool over the other possibilities, right?"
This discussion has been archived. No new comments can be posted.

  • Yay! (Score:4, Funny)

    by echucker (570962) on Monday June 17, 2002 @06:35AM (#3714287) Homepage
    No more taping the red and blue filters from my Mag-Lite to my eyelids any more! :-)
  • by jukal (523582) on Monday June 17, 2002 @06:39AM (#3714302) Journal
    See the company's website [tyzx.com] for better details on the used technology [tyzx.com], here are some interesting publications [tyzx.com], this one (PDF) [tyzx.com] is the core: Real-time Stereo Vision for Real-world Object Tracking.
  • 2.5d? (Score:2, Interesting)

    by GnomeKing (564248)
    Can real 3d be obtained with just two cameras?

    Or is it merely 2.5d?

    Regardless of where the cameras are, is there not still a plane whose "height" the cameras/software can't determine?

    Don't you need 3 cameras minimum for proper 3d?
    • by giel (554962)
      Nope, two cameras and some triangular thinking will do the job. The angle (or position in view) at which objects appear gives enough info to determine their location. However, this will be very inaccurate for distant objects. - I think...
    • Re:2.5d? (Score:4, Funny)

      by cyberon22 (456844) on Monday June 17, 2002 @06:54AM (#3714328)
      Do you need three eyes?
    • It's still 3D (three dimensions) but clearly with only two viewpoints you cannot judge height as accurately at longer distances (problem of resolution decreasing), but the computer will be better at discerning perspective than the human, who subconsciously changes what he/she sees....
    • No, you can see 2 planes (2d) with 1 camera, and 4 planes with 2 cameras. The difference between the two 2d views creates the third plane in a 3d environment.
    • The first 2 dimensions of an image create the plane that is being viewed (height & width). The 3rd dimension provides depth (the ability to perceive multiple planes that are parallel to the plane being viewed).

      The reason humans view the world in 3 dimensions is that our eyes are placed far enough apart on our head to provide 2 different images for 1 perceived object. Over time, we get used to this and learn to take the 2 images and compare them; taking that into consideration along with our preconceived notions of how large the object being viewed is (if it's smaller than we think it should be, we tend to believe it is far away), we're able to approximate the distance between ourselves and the object being viewed.
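
A rough sketch of the triangulation the replies above describe, and of why it gets inaccurate for distant objects. It assumes an idealised pair of parallel, identical cameras, which the thread does not specify; the function names and the numbers in the usage note are made up for illustration.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Distance to a point seen by two parallel cameras.

    focal_px     -- focal length expressed in pixels (assumed equal for both cameras)
    baseline_m   -- distance between the two cameras, in metres
    disparity_px -- horizontal shift of the point between the two images, in pixels
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline_m / disparity_px


def depth_step(focal_px, baseline_m, depth_m):
    """First-order depth change caused by a one-pixel disparity error.

    From Z = f * B / d it follows that |dZ/dd| = Z**2 / (f * B),
    so the error grows with the square of the distance.
    """
    return depth_m ** 2 / (focal_px * baseline_m)
```

With a hypothetical 500-pixel focal length and a 10 cm baseline, a point with 10 pixels of disparity is about 5 m away, and a one-pixel matching error there is worth roughly 0.5 m. At 50 m the same one-pixel error is worth roughly 50 m to first order, which is the sense in which two cameras still give 3D but with resolution that falls off quickly with distance.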
  • Security? Nah.... (Score:1, Redundant)

    by hockeygeek (192147)
    Forget security, we all know it'll be used by the porn industry first!
    • Soon we'll finally be able to properly verify if Britney's boobs are indeed unnaturally big... without getting into a sexual harassment lawsuit.
    • Forget security, we all know it'll be used by the porn industry first!

      If I have understood correctly, this is for tracking/sensing movements accurately in 3 dimensions and being able to record them in binary, not for reproduction of images in 3D onto a screen with viewing glasses and all that stuff. Indeed, pretty good 3D technology is available but the pr0n industry relies on the cheapest technology available to make the most money - at least as a general rule this is the case. The current massive pr0n market has been enabled by the Internet and digital media, but they ain't going to pony up loads of cash for this kind of technology.

      Pr0n is not necessarily for the discerning film critic after all, known rather for hand relief and titillation of couples who like a bit of that. Not for amazing technology and three dimensional shots. Would you pay more for 3D DVD quality pr0n??? DVD works for the pr0n industry due to form factor and ease of quality pausing of the frames, at least that's what I reckon ;-)

      • by billcopc (196330)
        Nahh.. ease of quality pausing? I don't think so. DVD works for the pr0n industry because of all the cheap kids who own a $60 DVD player and no VHS. Not that there's anything wrong with that, but it does represent a very early technology jump.

        Just think back of the early 80's when audio CDs started hitting the market; lame cd players would cost 400$ and up, and the discs themselves were hard to find, but eventually gained popularity and eclipsed 4-track tapes. It took years for the transition to progress, and now if you're seen buying a music tape the clerks will be wondering where you've been living for the last fifteen years. But fifteen years ago if you were purchasing a CD, those same clerks (ok, their parents) were probably wondering where you got all the cash for a cd deck/discman, and pretty much everyone in the street would chat you up about your shiny cd player. Same thing's happening with video, right now we're somewhere in the middle, as DVD is well on its way to widespread acceptance in the home market.

        Of course the pr0n industry has little choice but to follow the technology trends. Nowadays everyone wants 4-hour multi-angle ass-to-ass compilations with running commentary by the not-so-great Rocco himself, and since DVD discs are so compact, they can stuff more of them in the bottom dresser drawer underneath their socks.
  • It will be virtually impossible to palm chips or any other sleight (spelling?) of hand tricks that people do at card tables. I'm sure there's millions of other more interesting possibilities, but that, and stopping pickpockets, are the ones that arrived immediately in my head..
    • No more radar guns for police (now you'll need an invisible car)

      Fighter planes that don't need radar (but will need scads of cameras all over it -- both visible, infrared, and tetrawave)

      Computerized athletic officiating (which may finally kill the politics of skating and gymnastics)

      Better identity recognition software (now you don't have to face the camera)

      Custom-tailored clothing (no more scanning mechanisms)

      Automated grocery checkout (the machine identifies the fruits & veggies so that the clerk doesn't have to type in a 4-digit produce code)

      Another reason for George Lucas to go back and re-film all 6 episodes into digital 3-D.
      • No more radar guns for police (now you'll need an invisible car)
        Using radar guns or laser based detectors is still going to be cheaper and more accurate. A 3D picture of the car speeding is just lots of extra information that isn't needed to determine that the car is speeding, so is a waste.
        Fighter planes that don't need radar (but will need scads of cameras all over it -- both visible, infrared, and tetrawave)
        This technology is unlikely to work reliably at a distance of a mile with "eyes" only as far apart as even each tip of a fighter's wings. Fighter planes are very narrow as compared to the distance they need to be able to "see".
        Computerized athletic officiating (which may finally kill the politics of skating and gymnastics)
        Just because computers can see people moving in 3D doesn't mean they know anything about gymnastics or skating. That is a significantly harder artificial intelligence problem.
        Better identity recognition software (now you don't have to face the camera)
        The software only works on shapes that it can see. If there is no camera that can see the front of your head, it won't know what your face looks like. I find it unlikely that we'll have software that can recognize people from the back of their head anytime soon. Anyways, my argument is simply that just because we can collect 3D data more easily doesn't mean that any of the very difficult problems involved in analyzing 3D data have gotten any easier.
  • by jukal (523582) on Monday June 17, 2002 @06:55AM (#3714333) Journal
    This is taken from the document Real-time Stereo Vision for Real-world object tracking [tyzx.com]:
    <clip>
    The DeepSea chip is a hardware implementation of the census correspondence algorithm invented by Tyzx staff... The algorithm's key concept is transforming a pixel's numeric absolute intensity value into a bit string that represents the pixel's brightness relative to its neighboring pixels. For each pixel, the DeepSea chip examines the pixel's surrounding area, called a neighborhood. A typical neighborhood is 7x7 pixels centered on the subject pixel. Comparing a subject pixel's intensity to its neighbours, the chip produces a relative intensity map (shown in the document, page 8).
    .... the DeepSea chip may not be able to find a valid match for every pixel in the image. Large uniformly lit areas of a scene may have pixels of identical intensity; for pixels in such an area, no single match can be found. Pixels that correspond to an object that is visible to one imager but not the other also do not have matching pixels.
    ... Once the matching process is complete, the range of each pixel can be calculated using the horizontal disparity of the matching pixels, the focal lengths of the lenses and the distance between them. The DeepSea chip designates the range of anomalous pixels as invalid.
    </clip>
    (any remaining typos are mine) :)) See also a HP document [hp.com] covering partly the same matter.
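
To make the quoted description concrete, here is a minimal software sketch of a census transform with Hamming-distance matching along one image row. It is not Tyzx's implementation: the comparison direction (neighbour darker than centre), the tie rule used to mark a pixel invalid, the search direction, and all function names are assumptions made for illustration.

```python
def census_transform(image, radius=3):
    """Census-transform a grayscale image given as a list of rows.

    Each pixel becomes an integer whose bits record, neighbour by
    neighbour, whether that neighbour is darker than the centre pixel.
    radius=3 gives the 7x7 neighbourhood from the quoted text. Pixels too
    close to the border for a full neighbourhood are left as None.
    """
    height, width = len(image), len(image[0])
    codes = [[None] * width for _ in range(height)]
    for y in range(radius, height - radius):
        for x in range(radius, width - radius):
            centre = image[y][x]
            bits = 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    if dy == 0 and dx == 0:
                        continue  # the centre is not compared with itself
                    darker = image[y + dy][x + dx] < centre
                    bits = (bits << 1) | int(darker)
            codes[y][x] = bits
    return codes


def hamming(a, b):
    """Count the bit positions in which two census codes differ."""
    return bin(a ^ b).count("1")


def best_disparity(left_codes, right_codes, y, x, max_disparity):
    """Find the horizontal shift whose right-image code matches best.

    Returns None (an "invalid" pixel) when no candidate exists or when the
    best score is shared by several shifts, as in a uniformly lit area.
    """
    target = left_codes[y][x]
    if target is None:
        return None
    scores = []
    for d in range(max_disparity + 1):
        xr = x - d  # assumed geometry: features sit further left in the right image
        if xr < 0 or right_codes[y][xr] is None:
            break
        scores.append((hamming(target, right_codes[y][xr]), d))
    if not scores:
        return None
    scores.sort()
    if len(scores) > 1 and scores[0][0] == scores[1][0]:
        return None  # ambiguous: no single best match
    return scores[0][1]
```

Because only the ordering of intensities is kept, adding a constant brightness offset to one image leaves its census codes unchanged, which is presumably the appeal for two cameras with slightly different exposure. A perfectly flat image produces identical codes everywhere, so every shift ties and the pixel comes back as invalid, matching the limitation the quoted text admits.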
  • by WebfishUK (249858) on Monday June 17, 2002 @07:06AM (#3714357)
    Having worked in machine vision for over 10 years now (in particular stereo vision) I feel I am able to provide some useful comments on this.

    The technology employed (both hardware and software) is limited. CMOS sensors of the type described suffer from poor signal to noise as well as interlacing artifacts. Pixel jitter is of major importance in machine vision and I doubt these sensors offer much clock control over and above the 1 pixel mark (if any).

    The matching algorithm described is very primitive, assuming rotation in depth between views doesn't affect the scene projection into the image - ooh but it does. The census matching algorithm is very simple and whilst it does recognise the problems of illumination variation it fails to solve the problem in a manner you could describe as robust. Also, contrary to popular belief, you cannot robustly recover depth from every pixel in the image! There is no evidence that the human vision system does it (without knowledge of the object) so why are people trying it? Even if you attempt it you are going to need some way of telling which data is more accurate than not in order to start using the results. Edges are your best bet and I didn't see any evidence of preprocessing described in their system (although to be fair I only read it briefly).

    I appreciate that this is supposed to be a cheap system and thus its limitations are probably to be expected. Might be fun to play with for a hundred Euros or so.

    For a more state-of-the-art look at what is possible you could do worse than take a look at TINA [tina-vision.net], an open source machine vision system with a very sophisticated stereo depth estimation algorithm (we even built a chip to accelerate it!)

    • I haven't been working in stereo vision for nearly as long as you, but presently, I am working with real cheap stereo vision. Our present cheap setup involves two $40 CDN USB webcams, and a frame for them, running on Linux. Total cost of stereo vision equipment beyond your system cost: $100 CDN

      We need to improve our software a long way at the moment, but the ability is already there for the 3D images.

      The main part of our application is actually how we are sending the video over multicast to several machines, including an SGI Onyx3200. We are using MPEG-4 for the video compression, real-time on high-end AthlonXP systems.
  • I don't know what "inexpensive" means. It's all relative, and no real point of reference is given. If it truly is inexpensive, this could open up a market for lots of new products which track objects in 3D (real) environments where it just never made economic sense before.

    Product ideas anyone?

    -Pete
  • by bodin (2097) on Monday June 17, 2002 @07:25AM (#3714397) Homepage
    But it is no longer in production and it is patented.

    Works with any software as it is attached at the front of the screen. Surface mirrors and the idea of doing the view-master 'on screen'

    I'll keep mine for a long time.

    A description and pictures of it here [nau.edu]

    Patent here [uspto.gov] with description.
    • There must have been some good stuff around in the patent office the morning they approved that one. There's only about 150 years' worth of prior art for stereo viewing via mirrors. Love the part of the claim where they discussed all the prior art but said that it required the user to bend over.

      Seriously, this is a perfect example of the USPTO issuing patents for trivial things. I can't even imagine calling this an invention, there are so many precedent devices that use the same optical principles.

      BTW, a similar device is currently available at http://www.pokescope.com/.
  • Of course, the real question is who's behind the cameras

    When are slashdotters going to stop adding these kind of remarks at the end of their news post? I'm getting tired of all the paranoia and propaganda that is around on Slashdot, even if it is justified sometimes.
    • Of course, the real question is who's behind the cameras

      When are slashdotters going to stop adding these kind of remarks at the end of their news post? I'm getting tired of all the paranoia and propaganda that is around on Slashdot, even if it is justified sometimes.
      Whether or not that comment is justified, I interpreted it as being a misquote from the end of the article itself. The original line was:

      "The question," said Marc Rotenberg, director of the electronic privacy group, "will always be who's behind the lens?"
  • Can anyone think of more interesting apps for this? How about this one: A computer system that is able to be the referee for a sports game.
  • The two-camera approach requires relatively high performance. Is there a reason why a combination of a digital camera and a laser-based distance meter (accuracy is measured in millimeters) would not be more accurate and reliable, and require less computational performance?

    Take an image, feed the laser distance-o-meter, which scans the distances and embeds the results with the image data. We could even have a matrix of lasers to measure the distances in a single shot; for example, 8x8 (64) beams would already be good for scanning an area of a few square meters, if the objects that we are looking for are bigger than insects, of course :) To me at least, this approach is also easier to comprehend than some magic algorithm.
    • > scanning an area of few square meters

      square meter is not maybe the correct term to be used in here, but what I meant is focusing the camera so that the image taken covers a flag size of 2 x 2 meters. What's the correct terminology here ? I don't even own a camera, so... :)
    • Yep! People have thought of this one too. Take a look at Cyra's laser scanner [cyra.com]. Only one teensy little problem. It costs $150,000. Systems like this work with a time-of-flight laser system and a gimballed mirror to move the spot. The mechanical system is very tight, which drives the cost high. Moral of the story: good 3D (like 1mm resolution @ 50m) ain't cheap.
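
A back-of-the-envelope sketch of why time-of-flight ranging is demanding, using only the round-trip relation distance = c * t / 2. It says nothing about how Cyra's scanner actually works internally; the function names are hypothetical.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second, in vacuum


def tof_distance(round_trip_seconds):
    """Distance to the target from the laser pulse's round-trip time."""
    return SPEED_OF_LIGHT * round_trip_seconds / 2


def timing_resolution(range_resolution_m):
    """Round-trip timing precision needed to resolve a given range step."""
    return 2 * range_resolution_m / SPEED_OF_LIGHT
```

Resolving 1 mm means timing the round trip to within about 6.7 picoseconds, regardless of whether the target is 5 m or 50 m away, and that is before the mirror mechanics mentioned above come into it.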
  • Now nerds like me will finally have chance to simulate reality even better without actually having to face it.

    "In event of an emergency in 'real life sim 3000' press [enter] to pause and scroll up the history window to see what went wrong"

    I wonder if cheat codes are applicable ...
  • by LSD-OBS (183415)
    Expect a whole new onslaught of X10 [x10.com] ads as soon as this technology becomes popular :(

    "We must destroy X10! We must destroy all Internet ad!" - KOMPRESSOR [everything2.com]
  • The problem I can see with such technology is that it should be adjustable to your vision.

    The article is about computers/robots seeing in 3d, not us. It will enable much more precise handling of objects in realtime, whatever the application might be. (insert ref to porn here)

    3D glasses have already been with us for generations.

    Destoo - reading /. with red-blue glasses.
  • Close one eye. Can you still estimate the distances of objects around you? Of course you can. This demonstrates that there's much more to depth perception than stereo vision.

    Stereo vision is inherently limited. It requires that the objects have sufficient texture so that points on the two stereo images can be correlated. Our depth perception relies on much more than stereo, e.g. common sense knowledge about the world, intuition about shading and lighting, etc.
    • I disagree with both of your points:

      Close one eye. Can you still estimate the distances of objects around you? Of course you can. This demonstrates that there's much more to depth perception than stereo vision.

      This does not mean that you did not learn depth cues such as perspective and relative size from other experiences, such as 3D perception. Simply because you have learned that certain shading patterns imply depth does not mean that you did not initially gather that information via stereo vision

      Stereo vision is inherently limited. It requires that the objects have sufficient texture so that points on the two stereo images can be correlated. Our depth perception relies on much more than stereo, e.g. common sense knowledge about the world, intuition about shading and lighting, etc.

      Random dot stereograms were invented to disprove this statement. They clearly demonstrate that you do not need features to see in depth. There is a VERY large body of research surrounding these topics. Start with the book by David Marr.
      • This does not mean that you did not learn depth cues such as perspective and relative size from other experiences, such as 3D perception. Simply because you have learned that certain shading patterns imply depth does not mean that you did not initially gather that information via stereo vision

        It's relatively easy to test your argument. A person blind in one eye from childhood would never be able to learn stereo vision. Yet, it's VERY likely that he is still able to estimate distances.

        The argument that he gathered distance information through moving and seeing an object from different angles and constructing 3D (or 2.5 D as some argue) in his head could be a good one. However, if you provide a photo of some scenery never seen before, this one eye viewer should still be able to estimate object boundaries and relative distances.

        Simple rules like "object A covering object B is in front of it" play a much more important role than SV. SV is rather an addition to an already existing machinery, not its primary tool.
    • e.g. common sense knowledge about the world, intuition about shading and lighting, etc

      yeah, also crazy things like size and objects blocking other objects.

  • This is similar to some work I did on Eye Gaze Tracking in my senior year at University of Connecticut. The project page can be viewed here [gbook.org].

    I wish I had done more with it; there are more applications for this than just tracking people in public. It can be used for keeping the laser in the correct position if a person moves their eye during LASIK eye surgery. It can be used by a paraplegic to use a computer. And most importantly it can be used to target in Quake3.

  • but we can all drool over the other possibilities, right?

    You mean 3d pr0n?

    Riiiight, and those X10 cameras are for surveillance too.
  • EDISON [greatmindsworking.com], is a free C++ toolkit that performs edge detection and image segmentation. The image segmentation portion is based on mean-shift analysis.
    A colleague and I are currently in the process of porting portions of EDISON to Java.
  • There is a company called Point Grey Research
    (http://www.ptgrey.com/) that has external binocular and trinocular stereo units for sale that use firewire. They don't do the processing on the unit, but have algorithms that run on standard PCs to process the data for you. Pretty interesting little guys, the computer vision lab where I got my degree (http://cvrr.ucsd.edu) had 3 of the triclops camera systems. They have a new one called the bumblebee that looks to be cheaper and maybe do processing onboard?

    There are linux SDKs available also. Note my version of Mozilla (version 1.0) doesn't load their page correctly, maybe some IE messy code?
  • Not meaning to offend anyone here.

    but, wasn't this all invented in the early 1900s?

    History of Cameras [photographer.org.uk]

    If so, then why is taking a picture with two cameras and then displaying them to people so they have stereoscopic vision so "computationally intensive"? It seems not too difficult to me. (What's really computationally intensive, though, would be rendering the two pics, but even then it only requires the "camera" to be shifted and two images to be rendered for each frame. So it requires O(f(x)) computation time, where f(x) is the big O for the time to render one picture, and I am guessing it's roughly double the computation time.)

    Maybe, I am missing something though.
    • It's not about displaying the stereo image for a human, it's about a machine being able to interpret imagery and 'understand' what it sees around it - machine vision. Algorithms for automatically determining depth from stereo are usually imperfect and may require several passes through both images.
      • Ah, okay. That clarifies things a bit.

        But, I still find it confusing as to why this would be more difficult than interpreting one image.

        Since the algorithms would be the same algorithms used to render 3d images. You just have to compensate for the angle differences between the two overlapped images. Then you should be able to easily obtain a distance using simple trig formulae.
        • Basically, the computer needs to find the matching objects in each image and how much they're displaced, then compute the distance map based on the displacement map. The second part is easy, but the first part is challenging and computationally intensive. Simple algorithms I've worked with look for identifiable edges in one image and try to find that edge in the other image. Starting with greatly reduced images to develop a "guesstimate" map, the resolution is increased in each pass using the previous results to determine what area to search (so the entire image doesn't need to be searched). This is just one of many approaches to pulling depth from stereo. I'm not sure what kind of algorithm Tyzx is using or what kind of accuracy they're claiming (if they're advertising that info). In machine vision, time complexity is an issue. In other applications, such as mapping and generating 3D models, accuracy is a bigger issue.
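
The coarse-to-fine search described in the reply above can be sketched on a single scanline. This is a simplified stand-in, not the poster's code: it compares small windows by sum of absolute differences instead of looking for edges, it halves resolution by averaging pixel pairs, and every name and parameter here is an assumption.

```python
def downsample(row):
    """Halve a scanline by averaging neighbouring pixel pairs."""
    return [(row[i] + row[i + 1]) / 2 for i in range(0, len(row) - 1, 2)]


def window_cost(left, right, x, d, half):
    """Sum of absolute differences between the window around left[x]
    and the window around right[x - d]; None if a window leaves the row."""
    if d < 0 or x - half < 0 or x + half >= len(left):
        return None
    if x - d - half < 0 or x - d + half >= len(right):
        return None
    return sum(abs(left[x + k] - right[x - d + k]) for k in range(-half, half + 1))


def best_of(left, right, x, candidates, half):
    """Return the candidate disparity with the lowest cost, or None."""
    scored = []
    for d in candidates:
        cost = window_cost(left, right, x, d, half)
        if cost is not None:
            scored.append((cost, d))
    return min(scored)[1] if scored else None


def coarse_to_fine(left, right, x, max_disparity, levels=2, half=2):
    """Estimate the disparity at left[x] with a small image pyramid."""
    # Build the pyramid: index 0 is full resolution, the last entry is coarsest.
    pyramid = [(left, right)]
    for _ in range(levels):
        l, r = pyramid[-1]
        pyramid.append((downsample(l), downsample(r)))

    # Full search only at the coarsest level, where the row is short.
    scale = 2 ** levels
    l, r = pyramid[-1]
    estimate = best_of(l, r, x // scale, range(max_disparity // scale + 1), half)

    # At each finer level, search only around twice the previous estimate.
    for level in range(levels - 1, -1, -1):
        if estimate is None:
            return None
        l, r = pyramid[level]
        guess = estimate * 2
        estimate = best_of(l, r, x // (2 ** level), range(guess - 1, guess + 2), half)
    return estimate
```

The saving is that the full-range search happens only on the shortest row; every finer level checks just three candidates around the inherited guess. The price is that a wrong guess at the coarse level is never revisited, which is one reason real systems tend to add consistency checks on top.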
  • I like the name of the company.
    No, we are not related.

    Kryzx

  • we can all drool over the other possibilities

    It's always the same on Slashdot - somebody will eventually end up talking about p0rn...

    I'm disgusted!!!
  • As others have commented, this technology seems like nothing any better than what is already out there. Who really cares? Why is this relevant news? Show me some serious attempt at real 3D vision any day.
  • I want one of those implants like in the movie Johnny Mnemonic. That way I can put on the 3d glasses, attach a processor to my spine and play quake all day. No Boss, these are prescription glasses, honest.
  • 132 stereo frames per second, which could be very useful in security systems

    The local video store puts all its p0rn under the "documentary" category. Has the codename been changed, and no one told me?
  • http://www.zipworld.com.au/~surturz/threed/3dindex . tm
