

Comments: 144 +-   Photosynth Team Does It Again on Thursday August 14 2008, @07:48AM

Posted by CmdrTaco on Thursday August 14 2008, @07:48AM
from the see-what-i-see dept.
graphics
software
STFS found an update to the Photosynth stories that we already ran. You might remember the amazing photo tourism demos. Well, this new version kicks things up several notches with paths and color correction to more smoothly transition between photos taken in different lighting conditions. As before, this stuff is worth your time. Check it out.

This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • color (Score:4, Interesting)

    by catbertscousin (770186) on Thursday August 14 2008, @08:06AM (#24597867)
    The color matching section was quite impressive given the wide variety of lighting and color temp in the starting photos; if they wrote their own software to do that, it sure counts as R/D.
    • Re:color (Score:5, Interesting)

      by Gewalt (1200451) on Thursday August 14 2008, @08:35AM (#24598173)

      The color matching section was quite impressive given the wide variety of lighting and color temp in the starting photos; if they wrote their own software to do that, it sure counts as R/D.

      AFAIK, Adobe created the technology first in response to the needs of automation in the pornography industry. It seriously helped a lot of "studios" color match the whole set just by having a wizard scan the pics and correct them all.

      • Lol, why does everything have to be for the porn industry? It couldn't possibly have been because it'd be useful for any other group that uses large numbers of photos... Just about any photographer would find it useful... whether it's a wedding or a fashion or a sports photographer. I'm not attacking you personally, there are just so many insane, "well this format won because the porn people picked it" type urban legends that it gets a bit ridiculous after a while.
        • Re:color (Score:5, Interesting)

          by Gewalt (1200451) on Thursday August 14 2008, @10:26AM (#24599899)
          Actually, I was a rabid Adobe Forum troll when some self-declared porno studios started clamoring for the feature. The other people it would be useful for actually dismissed it, as they did not seem to think they wanted that step in their workflow automated. But once the feature was added, everyone seemed to appreciate it. Of course, Adobe is not one to normally listen to and assimilate feedback, especially not from their forums, so that could have just been coincidence.
  • And THIS is why I tend to take huge numbers of photos and never delete any... Technology like this will account for easy geotagging, date I already have in the EXIF data, whereas people can be tagged with face recognition soon enough.

    That done, I'll be able to navigate my tens of thousands of photos by asking for things like photos taken of the kids while outside at the cottage when they were 3 years old.

    Also, remember to back up! :)

    • It looks like taking a video would be easier. That way, you wouldn't have to spend time stringing all the stills together - if I understood correctly.
      • by Max Romantschuk (132276) <max@romantschuk.fi> on Thursday August 14 2008, @08:46AM (#24598305) Homepage

        It looks like taking a video would be easier.

        Depending on what you are trying to do... My original point was that technology like this will make it possible to navigate the swamps of data we're accumulating.

        I like having a lot of family photos, but traditional albums won't do when we have literally thousands of them. Stuff like this can make it possible to easily call up photos based on suitable criteria. Like I said, we need other parts too, like face recognition, but summing it all up we'll eventually have a feasible way to navigate a huge amount of photographic data.

      • Well, the tech demo is using photos taken by arbitrary people. While it could be used to similar effect on your own photo collection (if you take enough photos from enough positions), the real power would seem to be when it's used on a large collection of user-submitted photos, or if it's fed the contents of Flickr, etc.

    • Geotagging is not that hard nowadays, assuming you have a device capable of creating a gps log and are using a digital camera (timestamps). Take the log, load it into gpicsync [google.com] and let the program tag your photos for you. Just make sure the gps device and your camera have their clocks synchronized. I'm still waiting for a decent way to browse photos on a map, though - pretty much what you're looking forward to, I guess. Picasa lets you view geotagged photos in albums in google earth, but it's not much more t
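The timestamp matching described in this comment can be sketched in a few lines. This is only an illustration and not how gpicsync works internally: the track format, the names `position_at` and `geotag`, the `max_gap` cutoff, and the linear interpolation between GPS fixes are all assumptions made for the example.

```python
from bisect import bisect_left

def position_at(track, t, max_gap=300):
    """Estimate (lat, lon) at time t from a GPS track.

    track: list of (unix_time, lat, lon) tuples sorted by time (hypothetical format).
    Returns None when t is outside the track or falls in a gap longer
    than max_gap seconds.
    """
    times = [p[0] for p in track]
    i = bisect_left(times, t)
    if i < len(times) and times[i] == t:
        return track[i][1], track[i][2]  # exact fix at that second
    if i == 0 or i == len(times):
        return None  # photo taken before the log started or after it ended
    (t0, lat0, lon0), (t1, lat1, lon1) = track[i - 1], track[i]
    if t1 - t0 > max_gap:
        return None  # fixes too far apart to trust interpolation
    w = (t - t0) / (t1 - t0)
    return lat0 + w * (lat1 - lat0), lon0 + w * (lon1 - lon0)

def geotag(photos, track, clock_offset=0):
    """Map photo name -> position; clock_offset (seconds) corrects a camera
    clock that is not synchronized with the GPS device."""
    return {name: position_at(track, ts + clock_offset) for name, ts in photos}

# Made-up track and photo timestamps, purely for demonstration.
track = [(1000, 60.0, 24.0), (1060, 60.0, 24.5), (1120, 60.5, 24.5)]
photos = [("a.jpg", 1030), ("b.jpg", 1120), ("c.jpg", 2000)]
tags = geotag(photos, track)
```

The `clock_offset` parameter is why the clocks need to be synchronized: any unknown offset between camera and GPS time shifts every photo along the track.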
    • That done, I'll be able to navigate my tens of thousands of photos by asking for things like photos taken of the kids while outside at the cottage when they were 3 years old.

      That raises an interesting concept. Could they do a 4D orbit? For example, identify pictures of your kids at different ages and then you could watch them grow up in front of your eyes. Or watch how a city street changes over a decade? That would be really interesting...shame it will probably only ever be available for Windows.

      • by ka9dgx (72702) on Thursday August 14 2008, @10:33AM (#24600023) Homepage Journal
        Because a video camera is nowhere near the quality of a still image, still cameras will win for a number of reasons:
        • Still Camera - less motion blur, if any
        • Bigger sensor - less noise
        • Focus mechanism - an SLR has a much better focusing mechanism
        • Image Compression - almost all video codecs record a stream of images, and do not optimize the quality of an individual frame
        • Exposure time - A still camera can take from 1/8000 second to 5 minute exposures for a single frame, as opposed to a fixed time of about 1/30 second for NTSC
        • Aperture - A still camera can control the aperture to get desired depth of field

        So, those are the ones I can think of off the top of my head.

  • by pz (113803) on Thursday August 14 2008, @08:07AM (#24597883) Journal

    Very cool stuff! Does anyone know (are any of the project team members here?) how much foreknowledge of the object being orbited is required?

    For example, is a 3D wireframe model necessary?

    Is a filtering of the photos necessary to ensure that they are all of the same subject?

    What level of pre-processing is required on the photos, either automated, or manual?

    How well does the system fare when the object being photographed isn't absolutely static? A drawbridge, for example, changes shape. Or Niagara Falls. Or a flag. Or a single person.

    Anyone know?

    • This is described in their SIGGRAPH paper, which was prominently linked from the article.

      It's a bit dense and involves some cross references, but here's a part which may answer some of your questions. For more detail you could always read the paper yourself.

      We use our previously developed structure from motion system to recover the camera parameters for each photograph along with a sparse point cloud [Snavely et al. 2006]. The system detects SIFT features in each of the input photos [Lowe 2004], matches features between all pairs of photos, and finally uses the matches to recover the camera positions, orientations, and focal lengths, along with a sparse set of 3D points. For efficiency, we run this system on a subset of the photos for each collection, then use pose estimation techniques to register the remainder of the photos. A more principled approach to reconstructing large image sets is described in [Snavely et al. 2008].
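The "matches features between all pairs of photos" step in the quoted passage can be illustrated with a toy matcher. This is a sketch under stated assumptions: real SIFT descriptors are 128-dimensional and come from an image library, whereas the short vectors, the name `match_features`, and the brute-force search below are made up for illustration. The nearest-neighbour ratio test with a threshold of 0.8 is the filter proposed in [Lowe 2004]; the excerpt does not say whether the quoted system uses exactly this filter.

```python
import math

def match_features(desc_a, desc_b, ratio=0.8):
    """Return (i, j) pairs where desc_a[i] matches desc_b[j].

    A match is kept only if the best candidate is clearly closer than the
    second best (ratio test), which discards ambiguous features.
    """
    matches = []
    for i, da in enumerate(desc_a):
        ranked = sorted((math.dist(da, db), j) for j, db in enumerate(desc_b))
        if len(ranked) < 2:
            continue  # need two candidates to apply the ratio test
        (best, j), (second, _) = ranked[0], ranked[1]
        if best < ratio * second:
            matches.append((i, j))
    return matches

# Toy 3-D "descriptors" standing in for real 128-D SIFT vectors.
photo_a = [(0.0, 0.0, 1.0), (5.0, 5.0, 5.0), (9.0, 0.0, 0.0)]
photo_b = [(9.1, 0.0, 0.1), (0.1, 0.0, 1.0), (5.0, 5.0, 0.0), (5.0, 0.0, 5.0)]
pairs = match_features(photo_a, photo_b)
```

In this toy data the middle descriptor of `photo_a` is equally close to two candidates in `photo_b`, so the ratio test drops it rather than guessing. Rejecting such ambiguous features matters in scenes with repeated structure.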

    • by dave420 (699308) on Thursday August 14 2008, @09:11AM (#24598645)
      You just give it the photos - it figures out the rest. It works by stitching them together in 3D, so if there is a photo of one part of the subject that is not overlapped by one other, the photo won't be part of the finished "model". If you download the old demo, you can see the Yosemite demo, which shows what happens with movement (hikers climbing a mountain). If it can match up most of a scene in an image, the image can still be used. I'm sure it'll only get better. Another great example is in the old demo, where they simply searched Flickr for "Notre Dame", and then constructed the entire cathedral. It picked up a photo of a poster in someone's house, and seamlessly integrated it into the model. It recognised what it was from, and where on the cathedral it was positioned, and reflected that by putting that image exactly where it should be in the finished "model". Of course this is just stuff I've gleaned from watching the demo videos, using the demo, and reading as much as I can about it, so I might be wrong on some of it, but that was the impression I got. If I'm far off, I'd appreciate being put right, as this technology is nothing short of stunning.
  • Science fiction and VR have primed me to believe someday we would all be walking around some imaginary digital world (oh wait, WoW anyone?), but this is "virtualization" of the real world. Like Google street view on crack. I am simultaneously in awe of the technological achievement and embarrassed that my life in computers hasn't yet created anything so cool.

    I, for one, welcome our new PhotoSynth overlords.

  • Security (Score:5, Interesting)

    by robvangelder (472838) on Thursday August 14 2008, @08:21AM (#24598013)

    I was on an ocean cruise recently, and a little girl was lost... Ship's Security were looking for her.

    I later heard she had been found, and as I walked back to my cabin I thought of this software.

    Every corridor of the ship has cameras.

    The parent could recall the last time she was with the child. An operator could then fly through a 3d map of the ship, from that point in time, with recorded video overlaid, following the girl in fast-forward until the current time was reached.

    The flying would be like spectators do in first-person-shooter type computer games.
    An observer could even be automatically tethered to the missing person.

    • Convoluted (Score:5, Funny)

      by CmdrGravy (645153) on Thursday August 14 2008, @08:43AM (#24598265) Homepage

      Erm, isn't that a bit of a long-winded, complicated way of doing things? I mean sure, Computer could do that for you but why not just ask instead?

      "Computer, where is " and that would be that. I mean typically she'd be stowed away in the engine room re-configuring the sensor array for some nefarious purpose but that's just kids nowadays I guess.

    • imagine the salivations of the UK security forces.

      there is a book, 'lacey and his friends' which contains a few short stories about a society with such abilities...

      not pleasant.

    • Forget the ship, use all of London and any other place with CCTV. Orwell was an optimist.

    • An observer could even be automatically tethered to the missing person.

      Or you could just tether the kid to the parent with a kiddy leash.
  • I've seen some of these articles about Photosynth, and there seems to be a lot of hype. But... I don't get it.

    I see that Photosynth can glue a series of images together so that you can zoom into and move around a scene and get an epileptic-seizure of correlated viewpoints. This group seems to have made a virtual walk-through using this. But I am unclear:
    1) What is the point
    2) What is the breakthrough

    As for #1, Photosynth is ugly. I would much rather have a few good quality same-lighting photos to look a

    • by Anonymous Coward on Thursday August 14 2008, @08:34AM (#24598169)
      If this was an OSS project, your post would have been rated "flamebait".
    • by Anonymous Coward on Thursday August 14 2008, @08:40AM (#24598227)

      It needs neither input of coordinates nor input of a rough 3D layout. It generates its own 3D model by analyzing the photographs programmatically; you do not even need to tell the program they were taken in the same area. The photographs are then automatically applied to the generated 3D model, and finally it lets you move freely in the generated 3D world, selecting the best photo matching your current viewpoint while applying perspective remapping, color correction and lens correction.
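Selecting "the best photo matching your current viewpoint" could be scored roughly as below. The post does not say how the real viewer ranks photos, so the scoring function, the `angle_weight` value, and the camera records are all hypothetical; the sketch only shows that recovered camera positions and directions are enough to pick a photo.

```python
import math

def view_score(viewer_pos, viewer_dir, cam_pos, cam_dir, angle_weight=10.0):
    """Lower is better: distance between positions plus a penalty for
    differing view directions (both directions must be unit vectors)."""
    dist = math.dist(viewer_pos, cam_pos)
    cos_angle = sum(a * b for a, b in zip(viewer_dir, cam_dir))
    angle = math.acos(max(-1.0, min(1.0, cos_angle)))  # clamp rounding error
    return dist + angle_weight * angle

def best_photo(viewer_pos, viewer_dir, cameras):
    """cameras: dict of photo name -> (position, unit view direction)."""
    return min(cameras, key=lambda n: view_score(viewer_pos, viewer_dir, *cameras[n]))

# Made-up camera poses, as if recovered by the reconstruction step.
cameras = {
    "front.jpg": ((0.0, 0.0, -10.0), (0.0, 0.0, 1.0)),
    "side.jpg": ((10.0, 0.0, 0.0), (-1.0, 0.0, 0.0)),
    "far_front.jpg": ((0.0, 0.0, -50.0), (0.0, 0.0, 1.0)),
}
choice = best_photo((1.0, 0.0, -9.0), (0.0, 0.0, 1.0), cameras)
```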

    • If it's the same project I think it is, this can do it all using image recognition - correlating photos that appear to be of the same location, and then stitching them together.

      It takes a crap load of processor time to do it, but it's largely a hands off process.

    • by hkz (1266066) on Thursday August 14 2008, @08:42AM (#24598255)

      From what I took away from the original demo, they were doing everything algorithmically. The original demo showed a wireframe of the Notre Dame generated completely from amateur pictures, then overlaid with those same pictures to give it texture. So yes, it is quite impressive. I'd be surprised if Google wasn't doing anything similar for Google Maps though.

    • by dave420 (699308) on Thursday August 14 2008, @09:14AM (#24598675)
      Because you can, say, search Flickr for a landmark, get the images, run them through this, then you can navigate through the space in 3D, looking at high-res imagery of the subject, from all available angles, without having to previously know anything about the subject. Even the system doesn't need to know anything about the subject, it only needs the photos. It is ENTIRELY automatic, using only images. If you look at the old Notre Dame demo, you can see that it even correctly inserted a photo of a poster of Notre Dame into the 3D model, in the exactly-correct position. 100% automatic. That's the breakthrough.
    • It does not need input other than a massive amount of photos. The program does all the piecing together and building of the 3D layout. If you go to the Microsoft site on it they have more details. For the video all they did was get a bunch of photos from different people of the same object and feed them into the program and you saw what they got.
      As for the bad part, it takes days and a powerful computer for anything beyond the very basic set of pictures.
  • Video (Score:5, Informative)

    by c_g_hills (110430) <{chaz} {at} {chaz6.com}> on Thursday August 14 2008, @08:35AM (#24598175) Homepage Journal

    Obligatory link to the youtube video [youtube.com] (not a rickroll, I promise!)

    Thanks, Network Mirror!

  • The Photosynth technology preview runs only on Windows XP SP2 and Windows Vista. nuff said.
  • by replicant108 (690832) on Thursday August 14 2008, @08:52AM (#24598401) Journal

    There was some discussion recently about the possibility of building an open source photosynth - and creating an 'open voxel space' map of the planet.

    Anyone know if there's been any progress on this?

    http://lists.burri.to/pipermail/geowanking/2008-June/005373.html [burri.to]

    • Re:Huh (Score:4, Informative)

      by BitterOldGUy (1330491) on Thursday August 14 2008, @08:14AM (#24597947)
      Shut The Fuckup Sonny - it's what the old guys on Photo.net say when you tell them that photography as we know it is dead - especially if you mention film in the same sentence. It'd be like saying that BSD is dead here on /..
    • Re:Wow (Score:4, Interesting)

      by ErroneousBee (611028) <neil:neilhancock,co,uk> on Thursday August 14 2008, @08:16AM (#24597969) Homepage

      Seems a bit simplistic to me, I'd have thought that they'd turn the photos into a virtual world, using the colour corrected photos to create wireframes and bumpmaps and then being able to apply whatever lighting and other effects to the world. That allows you much more freedom to use other methods (e.g. LIDAR) to populate the database.

      Creating 3d models also allows you to remove transient objects (people), or add objects to the scene, e.g. what would David look like on the empty plinth in Trafalgar Square.

      I suspect the reason they've done it this way is more about the patents than practical application.

      • Re: (Score:2, Interesting)

        by Anonymous Coward
        I imagine that's the ultimate goal. But what they have now is still amazingly impressive...

        The next step in that goal would be making it automatically determine what's in the structure, and what's 'in the way' (a tourist, a security guard, a pigeon...). It would be annoying if a tourist was thrown in with the 3-d model if they happened to populate the set with a ton of pictures of them and the object you want modeled.

        Still, as it stands now, it's still an amazing way to experience a historical landmark
          • Re:Wow (Score:5, Insightful)

            by YrWrstNtmr (564987) on Thursday August 14 2008, @10:22AM (#24599821)
            Huh? Why not get out there, meet people from those countries, eat the food they eat, get drunk with them, and actually experience the world?

            Of course! Because every family has the time and resources to visit every possible interesting place on the planet.
          • Re:Wow (Score:5, Insightful)

            by swillden (191260) <shawn-ds@willden.org> on Thursday August 14 2008, @10:41AM (#24600143) Homepage Journal

            Huh? Why not get out there, meet people from those countries, eat the food they eat, get drunk with them, and actually experience the world?

            Ummm, because we can't afford it? Taking six people to Greece would consume our family vacation budget for 3-4 years. I'd rather stay closer to home and spend more time with my kids.

          • Re:Wow (Score:5, Interesting)

            by JWSmythe (446288) * <jwsmythe@Nospam.jwsmythe.com> on Thursday August 14 2008, @12:30PM (#24601949) Homepage Journal

                I thought one of the previous stories said it would do that.

                What I was curious about is, how? A distinguishable photograph could be associated. But, even with one of the examples in the display, the Statue of Liberty, if this is automated, how would it be able to distinguish the real Statue of Liberty from, say, a souvenir sitting on my coffee table? Basing it on size and distinguishing shapes, it would match either one. Basing it on those, and the background objects is impossible. It already has to take into account that there are changes in the foreground (people, extra objects like light poles that are not present in very similar views). Background objects like clouds come and go, and leave entirely different images.

                For not quite as distinguishable objects, it would be a lot harder. Say you used the Statue of Liberty as your starting point. If you were to travel into Manhattan, there are many very similar shapes for buildings and storefronts. Sure, unique buildings would be obvious, but for every obvious building, there are dozens of almost identical buildings.

                Even then, you would have to know the city. Similar architecture can show up in a variety of cities, and be close enough to match. Cameras may record timestamps embedded in the original image (assuming unedited photos are added to the system), but there is nothing useful like geographic coordinates included.

                All the photos were shot from the same perspective. It was as if they were shot by one or more photographers of about the same height. There should have been a more significant change to the view from say a 4' tall child to a 6'8" tall man. I don't claim to be a "great" photographer, but I'm pretty good. One of the essentials between being someone who can take snapshots, and someone who can take photographs, is making the composition of your photograph to illustrate the view. That frequently involves changing height and view. Maybe you want to lie on the ground for one, and climb on a ladder for another.

                I took some photographs at the World Trade Center on 9/9/2001. Those photographs aren't just of the skyline, although I did take some snapshots at the time. Some are composed looking up towards the top of the buildings from the ground, and down while leaning on the glass of an observation deck window. Photography isn't documenting a first person view. It's beautifying and romanticizing a view, without necessarily changing anything about what's in the composition of the photograph.

                There are other features that I don't see how they're getting, such as the zones where photos were shot from. That takes an awful lot of extrapolation. What's the difference between a photographer 10 feet away, and a photographer 200 feet away with a good zoom lens? Almost nothing, except maybe a little focal distortion at the edge of the photo. That varies with the quality of the camera and lens anyways.

                I did a little project once years ago. I was sitting in the hills just under the Hollywood sign. We were sitting on top of a hill, so I had a good panorama view. I tried to keep the horizon centered, and I shot frames the whole way around. When I stitched them together in Gimp, I noticed that each frame had variations in its color. It wasn't because of AWB, it was because the camera (good for the time) had some weird variance, so there was a difference in color from the left to the right side. So, two shots from the same camera at the same settings were significantly different.

                I would be willing to suggest that the demo shown isn't a demonstration of a functional piece of software. It is a good example of what can be generated with a computer. I could do the same thing in Gimp or Photoshop. If my job let me play like this for a few weeks, I could have made a better example of vaporware.

            • Re: (Score:3, Insightful)

              There are other features that I don't see how they're getting, such as the zones where photos were shot from. That takes an awful lot of extrapolation.

              I suspect it isn't as complex as you think - exif tags usually include focus distance and focal length. Also included with that is sensor size or camera model, which will tell you effective focal length.
              When you combine that info with the apparent size of the object in the photo (i.e. statue of liberty is x percentage of the frame high), you should be able to get a reasonable estimation on where the picture was shot from.

              For relatively isolated objects (like the statue of liberty), I'd assume you'd need a s
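The estimate suggested in this comment follows from similar triangles in a pinhole camera model. The sketch below assumes the focal length and sensor height really are available (from EXIF or a camera-model lookup) and that the true height of the object is known; the function name and the sample numbers are illustrative only, and nothing here is claimed to be what Photosynth does.

```python
def estimate_distance(focal_length_mm, sensor_height_mm, frame_fraction, object_height_m):
    """Pinhole-camera estimate of the distance to an object of known height.

    frame_fraction is the share of the image height the object occupies (0..1).
    Similar triangles: object_height / distance = image_height / focal_length.
    """
    if not 0 < frame_fraction <= 1:
        raise ValueError("frame_fraction must be in (0, 1]")
    image_height_mm = frame_fraction * sensor_height_mm
    return object_height_m * focal_length_mm / image_height_mm

# Assumed values: 50 mm lens, 24 mm-high sensor, object fills half the frame
# height, object taken to be about 46 m tall.
d = estimate_distance(50.0, 24.0, 0.5, 46.0)
```

With these assumed numbers the estimate comes out a little under 200 m. Note that the formula cannot tell a distant full-size object from a nearby scale model, which is the souvenir problem raised in the parent post: if every input is scaled together, the apparent size is the same.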

            • Re: (Score:3, Informative)

              There are other features that I don't see how they're getting, such as the zones where photos were shot from. That takes an awful lot of extrapolation. What's the difference between a photographer 10 feet away, and a photographer 200 feet away with a good zoom lens? Almost nothing, except maybe a little focal distortion at the edge of the photo. That varies with the quality of the camera and lens anyways.

              Perspective changes a lot based on where the camera is; a big zoom lens does nothing to change the perspective, it just makes the image larger.

              Their process finds machine recognisable points in each photo, then looks for matching points between photos. Once you know that 2 photos are of the same subject you can use the separation between these known points to work out the relative viewing position of each camera. It only takes about 4-5 common points on different planes to pinpoint where each camera is rel
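The point about zoom versus position can be checked with a pinhole projection, where image size is focal length times object size divided by depth. The scene below (two equally tall objects at depths 10 and 100) and the function names are made up for the illustration.

```python
def projected_height(focal_length, object_height, depth):
    """Pinhole projection: image size = f * size / depth."""
    return focal_length * object_height / depth

def near_far_ratio(focal_length, camera_z, near_z=10.0, far_z=100.0, height=2.0):
    """Ratio of the image heights of two equally tall objects at different
    depths, seen from a camera at camera_z looking along the z axis."""
    near = projected_height(focal_length, height, near_z - camera_z)
    far = projected_height(focal_length, height, far_z - camera_z)
    return near / far

wide = near_far_ratio(focal_length=35.0, camera_z=0.0)     # short lens
zoomed = near_far_ratio(focal_length=200.0, camera_z=0.0)  # same spot, long lens
moved = near_far_ratio(focal_length=35.0, camera_z=-90.0)  # stepped back 90 units
```

`wide` and `zoomed` agree (the near object appears about ten times taller than the far one in both), because the focal length cancels out of the ratio. `moved` drops to about 1.9. So the relative sizes of matched points at different depths carry information about where the camera stood, independent of how far it was zoomed.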

    • Re:Wow (Score:5, Insightful)

      by ttapper04 (955370) on Thursday August 14 2008, @08:22AM (#24598017) Journal
      Microsoft had better not repeat Google's slight miscalculation. The credits given to the Flickr accounts suggest that they must have had to opt in, unlike Street View. This Photosynth system would be incredibly powerful if it used all Flickr images or crawled the web. People are clearly visible everywhere in this system, and some may become upset.
      • They could create some kind of agreement with Flickr & other photo-sharing sites where users could check a box to opt-in to photosynth.

        (Or if they want to be sneaky about it, require the users to check the box to opt-out, or just change the Flickr privacy policy to opt-in all new photos.)

    • If you read the paper [washington.edu] you will see that it is the same researchers!

    • Absurd thing is, they buy out the researchers and at least use a fake multiplatform thing like Silverlight to impress/trick people about its possibilities.

      So, the Linux thing has become Windows-only as a result of the buyout. The complete Microsoft way of doing things, and exactly why people like me say "Stay away from Silverlight, .NET, their open source clones and people involved with them."

      That is the "open source loving" Microsoft for you which will transform itself to multiplatform company. If you buy it...

    • by dave420 (699308) on Thursday August 14 2008, @09:23AM (#24598853)
      Microsoft have turned Photo Tourism into something incredibly more powerful. But don't let that get you off your high horse. Some of us don't play the "them" and "us" game.
    • Re: (Score:3, Insightful)

      Yes, and this is nothing like that. That was apparently creating additional information that simply wasn't in the original photo. This is using a whole bunch of photos of the same scene, taken at different times, angles, etc to automatically build up a 3d model. Nothing is being enhanced, you're "merely" being shown the most appropriate, pre-existing photo based on your location and view direction in the generated 3D model.

      Damn cool tech, but not the same as that used in Blade Runner (or CSI, or any other "

    • Blaise Aguera y Arcas on Photosynth [ted.com]

      the book thing, i don't think can easily be faked by flash, nor the gigapixel resolution image nor the really neat zooming and zooming and zooming into stuff.

      really what photosynth, and by extension seadragon which is what photosynth is built on, promises us is a way to semantically link all those photos on the web together, to build up those real places into virtual places with little or no human intervention.

      this should enrich the web in a way we are only just begi

      • http://labs.adobe.com/downloads/flashplayer10.html [adobe.com] . See how many platforms supported?

        That is how companies work in 2008, when people use 2-3 different operating systems in a day.

        I am not your average "anti M$" guy to pick at, I am just telling that kind of actions will result in some kind of reaction, it depends on the money company has and it is not infinite.

        I can't comment about the technology since I can't view it!
