Video with Depth 110
Lifewolf writes: "A new technology from 3DV Systems uses pulsed infrared illumination to capture depth information for every pixel of a video stream. This allows for neat tricks like realtime keying without need for color backgrounds. JVC is already selling a product based on this, the ZCAM."
Modeling applications? (Score:2)
digitizing 3D models. Anybody heard of this
technology already being used for that?
Re:Modeling applications? (Score:2, Interesting)
This system could probably be used for modeling by placing a physical model on a turntable and recording its changing z-depth over time. I wonder how accurate it is at close range. This could be really useful for architects who want to develop a 3D site plan. Simply snap a few shots at the building site, construct a DXF file based on the depth information, and import it into your CAD software.
The camera is probably intended for use with compositing applications like Shake, which can process z-depth information as well as RGB and alpha. Great for seamlessly integrating live action with computer generated 3D, particularly realtime 3D.
This also poses the question: what other types of useful information can a digital camera acquire, if we are not limited to the visual spectrum? Would it be possible to extract diffuse color, reflected color, transparency, or other "ray depth" information from real life subjects?
Re:Modeling applications? (Score:1)
Re:Modeling applications? (Score:1)
This is kind of offtopic, but interesting nonetheless. Apple recently bought Nothing Real [nothingreal.com], the company that makes Shake and Tremor.
Can you say Final Cut Pro 4?
What's so difficult? (Score:2, Interesting)
Humans have 2 eyes in the front of their heads, inches apart. All that is needed in a camera is for two synchronized tapes to run simultaneously, with the lenses just a few inches apart.
Play back the left half on the left eye, the right half on the right eye, and our own built-in systems have no problem building those two images into a single 3D image.
I think the difficulty is not in the recording of 3D information, but of building a display to play it back to multiple people.
Re:What's so difficult? (Score:2, Informative)
This is not an attempt at 3D-video. This is video with depth information.
Its primary application is to select parts of the image that you want to replace ('keying'), nothing else.
Re:What's so difficult? (Score:2)
Twofold problem (Score:1)
Re:Twofold problem (Score:3, Interesting)
True, but what most people don't realize is that we see just as much depth in a TV screen, as we would in real life if we covered one eye.
Speaking of complex problems... There are certain devices that, when placed over your eyes, will essentially trick your eyes into seeing the depth on a flat screen, so there is quite a lot of information saved on a 2D image. The strange thing is that computer generated images are still seen as flat, while the rest has depth. What is different in the two is a mystery, but it just goes to show that our minds are privy to much more information than we are consciously aware of. (Have you ever seen a movie which used special effects and it just didn't seem right, even though you couldn't point out any real problem?)
Re:Twofold problem (Score:2)
Remember, a strong cue for 3D perception does not require two eyes: Moving your head just slightly gives you stereo vision over time. Sometimes you can't get the same thing from a steadicam shot.
Re:Twofold problem (Score:2)
I do gain a lot of information from motion.
At the same time, starfield simulations and the like (if done properly, refresh rate, etc) can really draw me in.
Re:Twofold problem (Score:2)
Re:Twofold problem (Score:1)
Re:Twofold problem (Score:1)
I find this quite interesting, but not hard to believe. His brain is probably using either change-of-perception (his own head moving around) or focus lengths of objects - or, more likely, combining the two.
Back on topic - one way for a digital camera to get a Z coordinate is focal length. Cameras have had auto-focus features for years - why not run all the way through the focal range, and with a decent embedded DSP you should be able to pull together at least an estimated Z buffer for the whole frame. I figure the whole focal range should take less than a second, but what do I know?
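For the curious, the focus-sweep idea can be sketched in a few lines. This is only a toy under stated assumptions: the camera is assumed to hand back one grayscale frame per focus setting, and the names `sharpness` and `depth_from_focus` are made up for illustration. Each pixel simply gets the focus distance at which its local contrast peaks.

```python
def sharpness(img, x, y):
    """Local contrast: sum of absolute differences to the 4 neighbours."""
    h, w = len(img), len(img[0])
    centre = img[y][x]
    total = 0
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < w and 0 <= ny < h:
            total += abs(centre - img[ny][nx])
    return total


def depth_from_focus(stack):
    """stack: list of (focus_distance, grayscale_image) pairs from one sweep.

    Returns a rough Z buffer: for each pixel, the focus distance of the
    frame in which that pixel looked sharpest.
    """
    _, first = stack[0]
    h, w = len(first), len(first[0])
    depth = [[None] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            best = max(stack, key=lambda item: sharpness(item[1], x, y))
            depth[y][x] = best[0]
    return depth


# Two frames of a 1x4 scene: the left half is crisp at 1.0 m, the right at 3.0 m.
sweep = [
    (1.0, [[0, 9, 5, 5]]),
    (3.0, [[4, 5, 0, 9]]),
]
z_buffer = depth_from_focus(sweep)
```

A real implementation would have to cope with textureless regions, where no focus setting looks sharper than another, so "estimated" is the right word for the result.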
Re:Twofold problem (Score:1)
Re:Twofold problem (Score:2, Interesting)
It may seem that I am splitting hairs here, but I get very frustrated when people think that having one eye covered eliminates all depth perception. That is a categorically wrong assertion since the retina in each eye occupies a three-dimensional space. People who have lost an eye encounter problems with depth perception, but do not lack the *ability* to perceive depth.
If you pay close attention to any stereoscopic image, whether it is a "magic eye" or a viewmaster, you will notice that things are collected into two-dimensional sheets that appear to have depth relative to each other. A similar situation in real life would be if everything was either a backdrop or a cardboard cutout.
By contrast the image displayed in a hologram presents an integral depth of the surface that is perceptible by a single human eye. It looks *real* because it is exactly the same 3-dimensional wavefront that existed when light was bouncing off the object to record the hologram.
It is all a little confusing, but a little thought and casual observation will reveal these things to you. In my case I spent three months interning in a holography studio in NYC, so I got to hear many interesting discussions on this and various other strange concepts of reality.
So please people, parallax does not mean the same thing as depth. If anything, please take that away from this thread.
Re:What's so difficult? (Score:1)
A double camera system with a computer vision system would have difficulty picking out edges of subjects, and the resulting 'bluescreening' would be bodgey, at best. This is a relatively cheap, and simple solution.
Re:What's so difficult? (Score:1)
Re:What's so difficult? (Score:2)
There are many ways to generate holographic images. The question is in the details. Will we see the same thing from any angle? Will a series of mirrors be used or just several lasers? How big will the picture really be?
It's just as possible in the future we'll all just strap on something similar to the I-Glasses [thinkgeek.com] and individualize the experience.
Re:What's so difficult? (Score:2)
Why are there no holographic cameras? How about a personal photographic system that could take a 2D picture, along with depth information. Couldn't the lab then use that information to extract some semi-3D models as a basis for a hologram? (You know; one of those thin colour-banded holograms they put on CDs and credit cards..?) Or is the cost of making those holograms prohibitively high..?
Re:What's so difficult? (Score:1)
Using normal holographic film you need monochromatic light to expose them (ie a laser) and exposure time is measured in seconds. Not very practical for a "camera".
Re:What's so difficult? (Score:2)
Btw, what's the deal with the monochromatic light? I realize you have to use one colour or else you get a blurred resultant image, but is there any way to sort of do a component hologram and then put the parts back together? Sort of like how video has RGB..?
Re:What's so difficult? (Score:4, Informative)
The biggest problems are computational; it's hard to do a good job of stereo reconstruction at high frame rates in real time. It's by no means impossible, and there are commercial products out there that do it, like this one [ptgrey.com].
Two cameras aren't really necessary, either, if your camera is moving in the scene. It's possible to recover both the movement of a camera and 3-d information about a scene just by moving a camera through it. Googling for structure from motion [google.com] is a good place to start looking into those techniques, and there's a pretty cool page about one group's application here [caltech.edu].
In short, this company may have an interesting product (depending on cost and more details on the error characteristics) but this isn't something that couldn't be done with existing methods.
Also, as an aside, I find it interesting that they take a swipe at laser rangefinders as requiring a spinning mirror, when just about all IR cameras have a spinning "chopper" as an integral part of the exposure system...:)
Re:What's so difficult? (Score:2)
It is the difference between having a 3d object and taking a front on picture of it and importing that picture into Adobe Illustrator (or Kilistrator, take your pick, now Kdraw or KVector isn't it? ) and using a "convert to paths" tool, which will get you a very nice 3d -looking- image but it will only store two dimensions for you, VS taking multiple shots of that object and importing them into an Application that calculated the 3d space of that object.
Of course the advantage of what THIS camera does is that you get some 3d information without having to do a lot of REALLY nasty interpolation between multiple images. Granted modern techniques to do such have gotten better, but artificially creating 3D data from 2D pictures of 3D objects, well. . . . heh. Even worse if those objects are "4D" (aka moving).
This new camera seems to deal with moving objects just fine. Yah.
The MAIN thing that I am thinking of is that you could possibly translate objects around in the 3D space that was created by this camera.
Your point of view would remain fixed and none of the objects could rotate (more on this later) but you could still do some REALLY nice stuff in regards to Object Based Encoding.
In fact the integration of 3D data into Object Based Video Encoding technologies could work to create for some VERY nice bit rates, or at least the removal of gobs of artifacts.
Imagine if the Video Encoding KNEW that such and such person was going BEHIND that plant.
Now of course one other use for this is to combine it with the pre-existing methods of using multiple cameras to capture a 3d space. With this method you could, maybe even after just creating an object outline in one viewpoint (I will have to think over this particular facet of this new technology more in order to prove or disprove that idea), rotate all the separate OBJECTS within the scene, and not just move your view around the scene. (This is of course excluding any partially obscured objects, which would likely have some strange things happen to them.)
Because you have each object's X, Y, and Z coordinates, and your camera could have almost complete X, Y, and Z plane movements (remember, interpolated between multiple sources, and your image quality when zooming in would be dependent upon your original image capture quality), you have yourself what is basically a fully workable 3d workspace.
Imagine importing your video some day not into Adobe Premiere but rather into Maya or 3D Studio Max.
Kick Ass.
Re:What's so difficult? (Score:1)
Our eyes gather most of their depth information by focusing on a particular object, then doing some really cool neural-network trigonometry to measure the inward angle each eye is at to approximate the distance to the object. This could be done with cameras on extremely sensitive servos, but the information is only obtained for a single object, not everything else in the scene.
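The trigonometry in question is short enough to write down. A minimal sketch, assuming the object sits straight ahead on the midline and both eyes (or cameras) turn inward by the same angle; the function name and the 6.5 cm baseline are illustrative assumptions, not measurements from any real rig.

```python
import math


def vergence_distance(baseline_m, inward_angle_rad):
    """Distance to a fixated point on the midline.

    Each eye sits baseline/2 off the midline and rotates inward by the
    given angle, so tan(angle) = (baseline / 2) / distance.
    """
    if inward_angle_rad <= 0:
        return math.inf  # parallel gaze: the object is effectively at infinity
    return (baseline_m / 2.0) / math.tan(inward_angle_rad)


# Assumed 6.5 cm eye separation, looking at something 1 m away.
angle = math.atan(0.0325 / 1.0)
distance = vergence_distance(0.065, angle)
```

The angle shrinks roughly as 1/distance, so at ten metres the two eyes are nearly parallel, which is why the servos would have to be extremely sensitive.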
Re:What's so difficult? (Score:1)
So the angle of the eye positions the object in 2D space and the curvature of the lens gives you the depth of the object in the center of your vision.
t.
Fun to abuse... (Score:4, Insightful)
Used within a surveillance camera, it could detect motion without getting tricked by that tree near the air vent.
It could also be used in surgical situations where a specialist located in another state can more easily study facets of the video being provided to him (cutting out noise, if you will).
You could do some really weird video editing where you could create a scene of a person standing in a verdant field in the middle of summer with snow falling within his 'mask'.
Items recorded in this way (presuming the mask is also recorded) could perhaps be admissible evidence that helps the court focus on a specific action that might otherwise get missed.
It might also provide a less-expensive way to make 3-D videos. Precursor to holographic movies?
Re:Fun to abuse... (Score:1)
Forger's wonder tool (Score:5, Interesting)
IMHO, this technology would rather do the contrary. It makes photo forgeries so damn easy: no afternoon-long sessions with the gimp to get exact contours of people to delete from or insert into pictures: just use the ZCAM's distance keying and you get instant masks. The example given was scary: a business meeting, from which they could edit out people at will. The ideal tool for anybody that wants to rewrite history. So, forget about photos staying admissible as evidence in court.
Re:Fun to abuse... (Score:1)
If I remember correctly, there were 3D prOn flicks made during the 70's golden era (think "Boogie Nights"). I was told these films were pretty impressive on widescreen...maybe this technology will also bring a revival of those artistic explorations!
Re:Fun to abuse... (Score:2, Insightful)
It's a step along the way, but it's got one major drawback: it only captures a scene from one viewpoint. As soon as you move away from that viewpoint you're going to see holes in the scene where the camera didn't capture any information. To fix this, you must either (a) keep the viewpoint fixed at the camera's center of projection or (b) capture multiple views of the environment to fill in the missing bits.
Cameras like this have another potential benefit: better video compression. There's a section of the MPEG-4 standard that provides for segmenting your scene into objects so you could, say, encode the weatherman separately from the backdrop he's waving his hands at. If you shoot with a camera like this that can give you a rough silhouette of major objects in the scene, you could spend more of your time doing high-quality encoding of the people running around in the foreground and less of your time on the background that doesn't change for the length of the shot.
That said, I'm awfully skeptical about their claims of precision. As another poster has mentioned, there's a reason why laser range scanners cost so much: building an accurate rangefinder with lots of dynamic range is hard. As for object segmentation... I personally don't believe the image they provide as an example. Take a look at the depth map of the people at the conference table. In particular, look at the tabletop. It's nearly parallel to the camera axis, which means that its depth should be increasing fairly rapidly, which means you should see a gradient from light (near) to dark (far) in that part of the image -- but no, it's all one color.
I suppose you can explain that as treating everything between depths D1 and D2 as a single object, but that doesn't work all that well in practice. What's far more likely in my opinion is that that object mask is a hand-created example rather than the actual output of the device.
I used to key images... (Score:2, Informative)
The real reason blue was used is that, if you look at a video signal, blue makes up only about 11% of the luminance, at most, and it is also a very rare color (saturation-wise) in a picture. Most people don't wear blue tarp mascara, and it was acceptable.
The other type of keying was on an Amiga with a Gen Lock, using background color as the transparency, a static image over a live background. You could also set the transparency, so you could get ghost-like effects.
But with one of these, you can probably make a scrolling background with the occasional tree popping to front. If you were to do the same with an editing suite, you're looking at at least a good hour, and when you rent out facilities, you look for all the helpies you can. Just printing out a still from video can cost more if you're using a "video printer".
I wonder if you can set the depth manually, or if it's hard coded. It might be fun to see something pass "through" something else.
Re:Compression? (Score:1)
For teleoperation of remote systems, it might make way more sense to weight the compression with respect to relative distance, something that is closer gets higher quality where something farther away gets lower quality.
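As a rough sketch of what distance-weighted compression could look like: map each block's depth to an encoder quality setting, high for near and low for far. The function name and the 90/30 quality endpoints are arbitrary assumptions for illustration, not values from any codec.

```python
def quality_for_depth(z, near, far, q_near=90, q_far=30):
    """Linearly interpolate a quality setting from a block's depth.

    Depths are clamped to [near, far]; q_near and q_far are hypothetical
    JPEG-style quality values chosen only for this example.
    """
    clamped = min(max(z, near), far)
    t = (clamped - near) / (far - near)
    return round(q_near + t * (q_far - q_near))


# Depths in metres for three blocks: close, mid-range, and far away.
qualities = [quality_for_depth(z, near=1.0, far=11.0) for z in (1.0, 6.0, 11.0)]
```

A real encoder would more likely adjust a quantiser per macroblock, but the weighting idea is the same.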
This will revolutionize color keying. (Score:5, Interesting)
With this new technology, however, you could film an actor just about anywhere with very little preparation, and key him/her out based on depth AND color (some situations may need both), and easily pop new things both in front and behind the actor. It could save movie studios a lot of time, effort, and money for doing special effects, especially after you consider how easily it would be to generate a virtual stunt double from the 3d mesh (film the actor from a few angles, and merge the resulting 3d wireframe. Voila, perfect model down to the wrinkles in the skin)
Re:This will revolutionize color keying. (Score:3, Informative)
especially after you consider how easily it would be to generate a virtual stunt double from the 3d mesh (film the actor from a few angles, and merge the resulting 3d wireframe. Voila, perfect model down to the wrinkles in the skin)
Uh, no... I wish it were that easy - but scanned 3D meshes of that quality are still in the domain of laser scanning. There's just so much detail that even the best scanners can't pick up, major wrinkles and folds yes but pores and fine lines have to be simulated with displacement/bump and colour maps derived from the scan data (basically as it scans, the device takes a big long photo of the object to wrap around it later). Once you have the point-cloud from the scan (raw data) there is a LOT of cleaning up to do to get a parametric mesh with correct UVs (texture mapping co-ordinates) for use in production.
For more info, check these guys [headus.com.au] out - we've used em recently on a couple of film and tv projects and their output is damn nice, but the price tag reflects the complexity and difficulty of the task.
Re:This will revolutionize color keying. (Score:1)
Re:Interesting thought .. (Score:1)
I doubt anyone will believe this, but it is the only porn movie I have ever gone to see, I swear I'm not part of the dirty raincoat brigade.
--
Benjamin Coates
used to do this with 3d studio (Score:2)
it's great for doing depth based effects such as artificial depth of field (3ds4 didn't have that)
I'd love to have one of the cameras available for making live video stuff, I'm looking forward to getting my hands on one, I hope my local video facilities unit gets one (I'm going to mail them a link).
Coming soon to an MTV near you. Sadly probably not from my studio any more. I gave that up when 3dsMax came out, Seemed like there was no room left for a two man outfit (one gfx, one coder).
What this gives you (Score:1)
* The normal video-stream that any video-camera will give you.
* Another video-stream containing depth information.
So, what you have, at best, is a way to tell the relative distance from the camera to each point in the image. Which will let you address separate elements of the image based on depth. But, you _won't_ have anything more image-wise than you can record at home with your Sony.
Sorry, no 3D-porn.
Re:What this gives you (Score:1)
Couldn't you take the image streams, do the red/blue-shift thing based on the depth stream (or better yet, disneyland-style polarization, if you've got the playback gear), and there you go, 3D-porn
--
Benjamin Coates
Re:What this gives you (Score:2, Interesting)
From there you probably can move on to the more sophisticated compression techniques
(soon to be) introduced by MPEG-4.
Ever seen the movie "Enemy of the state" where they triangulate 3D shapes with satellites
and movements? Great techniques in that movie, but scary scenario.
Re:What this gives you (Score:2)
You might be able to actually generate a 360 degree view of the background and encode the distance and angle of the view in each scene, then place the separate actors into the scene.
The really cool thing about this technique is that it would make it easy to delete or replace any one object in a scene in a video.
Re:What this gives you (Score:1)
The downside: (Score:5, Funny)
Great, now you won't be able to distinguish between the show you're watching and the advertisement. Now when I'm watching TechTV, I can look forward to Britney Spears bouncing thru with a Pepsi at 30 second intervals.
Wow... (Score:1)
You could do lots of interesting tricks with this - like changing the cut-off on the z-buffer, so when someone walks away from the camera, it looks like they're walking through a wall.
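For what it's worth, the cut-off trick is easy to sketch. The snippet below is a toy, assuming the camera hands you a per-pixel depth map (here in metres) alongside the colour frame; `depth_key`, `composite`, and the `near`/`far` parameters are names made up for illustration, not anything from the ZCAM itself.

```python
def depth_key(depth, near, far):
    """Mask that is True where a pixel's depth lies inside [near, far]."""
    return [[near <= z <= far for z in row] for row in depth]


def composite(foreground, background, mask):
    """Keep foreground pixels where the mask is True, background elsewhere."""
    return [
        [f if keep else b for f, b, keep in zip(f_row, b_row, m_row)]
        for f_row, b_row, m_row in zip(foreground, background, mask)
    ]


# A 2x2 frame: one pixel is 4 m away, past the 3 m cut-off.
depth = [[1.0, 2.5],
         [4.0, 0.5]]
live = [["F", "F"], ["F", "F"]]
wall = [["B", "B"], ["B", "B"]]

mask = depth_key(depth, near=0.0, far=3.0)
frame = composite(live, wall, mask)
```

Anyone who walks past the `far` value drops out of the mask, so they vanish into whatever background you keyed in, wall included.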
Re:Wow... (Score:1)
(MODS- It's an obscure joke, just because you don't get it doesn't mean it's offtopic)
The new reaches of miniaturization (Score:2)
I guess this thing is targeted more for reporters and the media, than the consumer.
I assume "keying" is what we dumb consumers typically know as "blue screening" or "green screening", but this lets you do the same without a solid background, since it can separate out the people in the foreground using a depth cutoff instead.
Neat technology. I think there'll be more practical uses for this than you might think at first.
I wonder how accurately the z layer aligns with the pixels. Since it's a different infrared source, bounced off the subjects, I wonder if there's some fancy alignment that has to be done, or if the same pixels on the camera pick up the depth information. It'd be the difference between perfect alignment, and having sloppy edges around objects, which is pretty significant for a lot of uses.
-me
Re:The new reaches of miniaturization (Score:1)
Just amazing how DV cameras just keep getting smaller and smaller. I think I'll pick up that ZCAM, and get the optional belt case, so it's with me everywhere I go :-)
The "video-camera-in-a-match-head" phenomenon is pretty much exclusive to consumer gear. A good professional video camera should be at least two feet long. :)
I assume "keying" is what we dumb consumers typically know as "blue screening" or "green screening"
No, what you know as "blue screening" is technically known as "system crash". :)
The real technical terms are "Chroma-key" (make pixels with a certain colour transparent) and "Luma-key" (make pixels with a certain brightness transparent). Most guy-in-front-of-unusual-situation stuff (like, say, a weatherman) is done with chromakey.
Re:The new reaches of miniaturization (Score:1)
There was an article in EQ magazine about working the X-games in Philly. Helmet-cams and matchbook-size "button mics" abound.
It doesn't appear to be on their site [eqmag.com]
Not mentioned in the article, but another cool thing they have at the X-games is a camera suspended above the half-pipe by three (four?) cables connected to computer-controlled high-speed winches. It can "fly" over the skaters' heads at amazing speed, dropping down into the pipe to look up at them on the vert ramp, and then zooming up out of the way.
Blue screen? (Score:1)
I assume "keying" is what we dumb consumers typically know as "blue screening" or "green screening"
That is, unless they use Windows 9x regularly ;-)
but this lets you do the same without a solid background
Actually, use of a still (non-solid) background would help even this technique, as post-processing can massage the background vs. foreground using traditional MPEG motion compensation for an even more accurate contrast between a background moving in one direction and a subject moving in the other.
Still? (Score:2, Interesting)
--
Benjamin Coates
Re:Still? (Score:1)
Re:Still? (Score:1)
Re:Read and Blue 3D (Score:1)
the technique is pretty old (Score:2)
Re:the technique is pretty old (Score:2)
Will Make For Some Pretty Cool Effects (Score:1)
Apart from the obvious use getting virtual objects to pass correctly between/around objects in the real scene, or vice versa you've freed up the colour channel info being used as a depth key for other things.
Imagine keying an actor and his or her clothing in blue and using the depth keying to replace the blue with a projected texture or somesuch, using the depth information to do the texture calculations, or keying sports equipment in sports broadcasts.
Or if the technology eventually scales down to an affordable level it might make an interesting input device for playing video games.
I've already got realtime keying w/o color bg... (Score:3, Funny)
Thank you.
Remote sensing? (Score:2)
Now take your CAD file, recompile and render with a Quake3 engine, apply sampled textures, and you've got a very cheap, fast, good 3D walkthrough - architects will enjoy this too, as will tourism sites.
It's also going to mean some great first-person-shooter maps
Re:Remote sensing? (Score:1)
Re:Remote sensing? (Score:1)
The camera produces a 3D point cloud, from which geometry (CAD) does not fall out naturally.
Visual Effects work (Score:2, Interesting)
I posted a comment [slashdot.org] a while ago that explained the uses in visual effects work for depth-cameras, and some of the problems with existing methods of pulling a matte off of live action plates...
We were actually talking about this at work the other day; mainly wondering how well it would deal with things like fine hair, smoke, transparent objects and stuff like film grain/video artifacts/lens artifacts etc...
Would love to try one and find out...
ZCAM has been around for quite a while now (Score:1)
The fact that it actually works as advertised is somewhat astonishing. If there's a large enough distance between fore and background (> 1,5 Meters) it Keys without any hassle. No more Blue or Greenscreens, that means.
Hair? Glass? (Score:2, Informative)
If this system, as it claims, is simply making a z-buffer (depth buffer) of the image, then it's going to see hair and glass as an opaque lump, not the semi-transparent reality.
Blue and Green screening (not chroma keying) can do a very good job of pulling out variable opacity and thin items like hair. Especially with the newer LED screen illumination camera rings.
This technology has some nifty tricks and will allow more poor quality keying to continue, but it won't replace blue and green screens.
This is huge for MPEG4 (Score:5, Informative)
-Billy
Goes Beyond MPEG4 Codec (Score:2)
Stereoscopic video? (Score:2)
Why bother. A vertical split-screen image for left and right eye is all you need. There's nothing stopping conventional television from broadcasting stereoscopic images. Get two camcorders, tape em together at the sides and videotape stuff in your house. Edit the video so that the left camera's image displays on the right-hand side of the screen, and vice versa. Bingo, 3D video.
See what I mean? [ibiblio.org]
Cheers,
Re:Stereoscopic video? (Score:2)
In *theory* you could do this with two cameras, and some amazing processing that compares the two images, extracting the depth information for each pixel. But if such software even exists (and I think it might, for leading edge 3D scanning techniques), there's no way it could be done in real time, like the ZCAM does.
-me
Not new technology - saw this at NAB in 2000 (Score:1)
NAB is the National Association of Broadcasters conference. The ZCAM was being demoed then.
In the demos they had realtime keying so they could fly a 3D CGI character in front of and behind the live talent. There was only about a 40ms delay. This is impossible with normal keying (ie blue/green screen). (You can only put stuff behind the talent).
Its biggest limitation was that the resolution of the 3D sensor was low - so you had rough edges (think jaggies).
They also demonstrated a 3D Realplayer and 3D Windows Media players (which you watched with stereo shutter glasses). These players were called 'deep players'. Pretty cool but definitely not new.
antialiasing?? (Score:1)
Are you sure this was the problem?? I've been wondering how well this technology would actually work (it was announced quite some time ago), and have heard that indeed it had "jaggies".
Though I was under the impression it was due to the inherent problem with anti-aliasing z-depth based composites.... the depth is represented as grayscale, from white (nearest) to black (furthest). If you were to antialias a foreground subject (say it's white) onto a black background, you'd end up with various shades of gray pixels along the foreground object's edge.... this would translate to the edges of the foreground object being at a distance between foreground and background, which is obviously inaccurate, as you're still dealing with the foreground object.
Re:antialiasing?? (Score:1)
From what I saw it wasn't an antialiasing problem in the traditional sense. It was an aliasing problem in that the depth sensor they use has a low spatial resolution.
The demo I saw that was live used the depth as a key - but it wasn't used to blend the objects together, it was used to make a simple visible/not visible decision. What you are describing is using the depth key to vary alpha which would be interesting to look at.
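To make the distinction concrete, here is a small per-pixel sketch of both approaches. It assumes you have a depth for the live pixel and for the CG pixel in the same units; `hard_key`, `soft_key`, `blend`, and the `band` width are illustrative names and parameters, not anything the demo is known to have used.

```python
def hard_key(live_z, cg_z):
    """Visible/not-visible decision: 1.0 if the live pixel is nearer, else 0.0."""
    return 1.0 if live_z < cg_z else 0.0


def soft_key(live_z, cg_z, band):
    """Vary alpha with depth: ramp from 0 to 1 over +/- band around the
    point where the live and CG surfaces intersect."""
    t = (cg_z - live_z) / (2.0 * band) + 0.5
    return min(1.0, max(0.0, t))


def blend(live, cg, alpha):
    """Mix two pixel values; alpha is the weight of the live pixel."""
    return alpha * live + (1.0 - alpha) * cg


# Live arm at 2.0 m, CG character passing through it at the same depth.
hard = blend(100.0, 0.0, hard_key(2.0, 2.0))        # all CG: a hard edge
soft = blend(100.0, 0.0, soft_key(2.0, 2.0, 0.1))   # halfway mix at the seam
```

The hard key flips from one source to the other at the intersection, so a coarse depth sensor shows up directly as blocky edges; the soft ramp hides some of that at the cost of a slightly smeared seam.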
One VERY cool demo they did was to fly a CG character through the arms of the live talent. Where the CG object and the real object intersected you could clearly see blocky edges.
At the time I thought the technology (at least in a live setting) was maybe good enough for a young kids show where you want high speed production and the production values don't matter too much. Once they get the depth sensor up to broadcast resolution, however, it will be a VERY nice live tool. (Maybe they have already done this - I haven't seen it in two years and they must have improved).
Here's the Patent (Score:1)
Looks rather simple, akin to simple range gating.
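For anyone who hasn't met range gating, here is an idealised sketch of the principle. It is not taken from the patent: it assumes a rectangular light pulse, a shutter that closes exactly when the pulse finishes leaving the camera, and a second ungated exposure used to cancel surface reflectivity. All function names and the 20 ns pulse width are assumptions for illustration.

```python
C = 299_792_458.0  # speed of light in m/s


def gated_fraction(distance_m, pulse_s):
    """Fraction of a rectangular pulse that gets back before the gate closes.

    Light from distance d returns after 2*d/C, so nearer surfaces deliver
    more of the pulse inside the gate than farther ones.
    """
    delay = 2.0 * distance_m / C
    return max(0.0, 1.0 - delay / pulse_s)


def depth_from_gate(gated, ungated, pulse_s):
    """Invert the model above; dividing by the ungated reading removes
    the dependence on how reflective the surface is."""
    if ungated <= 0:
        return None  # no return signal, depth unknown
    ratio = min(1.0, max(0.0, gated / ungated))
    return 0.5 * C * pulse_s * (1.0 - ratio)


# A surface 1.5 m away with 30% reflectivity, lit by an assumed 20 ns pulse.
pulse = 20e-9
reading = 0.3 * gated_fraction(1.5, pulse)
estimate = depth_from_gate(reading, 0.3, pulse)
```

Under these assumptions a 20 ns pulse covers only about 3 m of depth, which gives a feel for how short the pulses and how fast the shutter have to be.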