Graphics

Rome, Built In a Day 107

Posted by timothy
from the what-about-compilation dept.
spmallick writes "Researchers at the University of Washington, in collaboration with Microsoft, have recreated the city of Rome in 3D using images obtained from Flickr. The data set consists of 150,000 images from Flickr.com associated with the tags 'Rome' or 'Roma,' and it took 21 hours on 496 compute cores to create a 3D digital model. Unlike Photosynth / Photo Tourism, the goal was to reconstruct an entire city and not just individual landmarks. Previous versions of the Photo Tourism software matched each photo to every other photo in the set. But as the number of photos increases the number of matches explodes, increasing with the square of the number of photos. A set of 250,000 images would take at least a year for 500 computers to process... A million photos would take more than a decade! The newly developed code works more than a hundred times faster than the previous version. It first establishes likely matches and then concentrates on those parts."
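To make the "explodes with the square" remark concrete, here is a minimal sketch that counts how many photo pairs an all-against-all matcher would have to consider. The function name `pair_count` is an illustration only and is not taken from the Photo Tourism code; the photo counts are the ones quoted in the summary.

```python
def pair_count(n: int) -> int:
    """Number of unordered photo pairs among n photos: n * (n - 1) / 2."""
    return n * (n - 1) // 2


# Photo counts mentioned in the summary.
for n in (150_000, 250_000, 1_000_000):
    print(f"{n:>9,} photos -> {pair_count(n):>15,} pairs")

# Four times as many photos means roughly sixteen times as many pairs.
growth = pair_count(1_000_000) / pair_count(250_000)
print(f"pairs grow by a factor of about {growth:.1f}")
```

A sixteenfold increase in pairs is consistent with the summary's jump from "at least a year" to "more than a decade" on the same hardware.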
This discussion has been archived. No new comments can be posted.

  • legality (Score:4, Insightful)

    by timpdx (1473923) on Wednesday September 16, 2009 @04:41PM (#29445885)
    IANAL, but is this legal? I somehow think that Microsoft doesn't have 150K photographer releases in their paws.
    • by KingSkippus (799657) on Wednesday September 16, 2009 @04:44PM (#29445931) Homepage Journal

      As far as I can tell, after skimming TFA and watching the little demo video, they weren't actually copying the pictures, but using them to build a 3D model.

      It would be kind of like aggregating a bunch of books in the library to come up with a letter distribution chart. You're not violating the copyrights of the authors, just compiling information from raw data.

      • Re: (Score:3, Interesting)

        by Z00L00K (682162)

        And I hope that they didn't mess up by getting pictures from this Roma [google.com].

        It actually has a ruin of a monastery too [wikimedia.org], so it's easy to get confused.

        And this name confusion has actually caused some mail to take the long way around by having a turnaround in Italy.

      • Re: (Score:2, Interesting)

        by CaseyB (1105)

        Some interesting questions arise here. Where exactly is the line between aggregate and specific? Would it be OK if they used the photographs to texture the model? Would it depend on how many photographs were averaged to generate the texel?

      • by Tim4444 (1122173) on Wednesday September 16, 2009 @05:02PM (#29446231)
        No, it's designed to help you find images of a particular location and then it shows you the original photos. The 3d model part is kinda misleading as they're just using it to calculate the relative positions of where the pictures were taken and then browse it like a giant 3d menu. The summary gave me the impression that they built a photo realistic 3d model of the city, but it's just a glorified image browser. You could argue it's like Google image search, but it seems that they did actually copy the pictures instead of just linking to the originals on Flickr. Still, it's some pretty neat photo processing.
      • by TooMuchToDo (882796) on Wednesday September 16, 2009 @05:16PM (#29446427)
        Also, their import app is most likely checking the Creative Commons license on the photos they're pulling from Flickr.
    • They are using the photos to get relative position information on the things in the photo. That information would not be subject to copyright.

    • by tehcyder (746570)
      Yes, because the legal aspects are always the most important thing to consider in any story.
  • Cool, but... (Score:2, Interesting)

    by Eggplant62 (120514)

    I wish this were done more with free software rather than with help from the Beast from Redmond. I'm certain the faculty at UW are completely familiar enough with free software that they could have made this work without MS's help.

    • Re:Cool, but... (Score:5, Insightful)

      by westlake (615356) on Wednesday September 16, 2009 @04:51PM (#29446065)

      I'm certain the faculty at UW are completely familiar enough with free software that they could have made this work without MS's help.

      150,000 photos. 21 Hours. 496 Cores. That makes it a labor intensive, computation intensive project. None of that comes "free as in beer."

      • Sure it does (Score:4, Insightful)

        by SuperKendall (25149) on Wednesday September 16, 2009 @06:32PM (#29447487)

        ...None of that comes "free as in beer."...

        150,000 photos.

        From Flickr. It's not like some poor bastard was paid to be out there photographing for weeks.

        21 Hours. 496 Cores.

        Don't recall folding@home or seti@home paying me anything.

        In short - who wouldn't pony up a few days of computing power to have a fully open 3D model of some of Earth's greatest landmarks? We only need someone to write the code to distribute, but the basic framework for distributed computation is already in place.

        • by westlake (615356)

          From Flickr. It's not like some poor bastard was paid to be out there photographing for weeks.

          No. But Flickr simplifies the problem if you are building a model of a world destination-city like Rome or Venice.

          What interests me more is the possibility of building models of cities and landmarks across time. Perhaps using sources other than photographs. Lincoln's Washington. New York City in 1939.

          who wouldn't pony up a few days of computing power to have a fully open 3D model of some of earths greatest landmar

          • What interests me more is the possibility of building models of cities and landmarks across time. Perhaps using sources other than photographs. Lincoln's Washington. New York City in 1939.

            Me too but that's not what Microsoft or these researchers are doing, so it's not really related to my response.

            That's still a serious commitment - and we could be talking weeks or months

            So what? The results last forever. And any one person doesn't have to be serious, it's not like I run Folding all the time - but in aggr

        • by j1mmy (43634)

          Most university research groups do not have funds to buy bits of computing time here and there. For a project like this, the research group more likely has a dedicated computing cluster bought with grant money or sponsor money.

    • by Anonymous Coward on Wednesday September 16, 2009 @04:52PM (#29446073)

      Hell yeah! I come to Slashdot for the slightly outdated stories, stay to read the comments of disillusioned Linux fanboys.

    • by natehoy (1608657)

      Possibly, but I'm not sure Hugin or another free equivalent could manage something quite on this scale. Microsoft appears to have customized matching algorithms and provided some pretty staggering amounts of computing power. Short of using something similar to folding@home, I can't imagine the school doing this.

      Microsoft gets publicity and some new algorithms for image stitching, the school gets funding for a research project.

      • Re:Cool, but... (Score:4, Informative)

        by afidel (530433) on Wednesday September 16, 2009 @05:11PM (#29446359)
        496 cores isn't all that much. With HT enabled, a 1U server can hold 16 cores, so a 42U rack can hold 672 cores; blade servers are even denser. The budget for most midsized IT departments probably has room for a few compute clusters of that size.
        • And how many of those servers are filled to the brims with grad students parallel processing versions of a glorified infinite loop?

      • Possibly, but I'm not sure Hugin or another free equivalent could manage something quite on this scale. Microsoft appears to have customized matching algorithms and provided some pretty staggering amounts of computing power. Short of using something similar to folding@home, I can't imagine the school doing this.

        Microsoft gets publicity and some new algorithms for image stitching, the school gets funding for a research project.

        Well, one of the authors works for Microsoft Research - so I doubt any FOSS projects were ever considered. Microsoft also donates bucketloads of money to UW's CSE department, and the faculty have lots of Microsoft ties. Bill Gates has been an invited speaker over here many times.

        The CSE department is very high caliber, don't get me wrong. But it is widely perceived by the rest of the campus as a Microsoft shop.

    • by Quarters (18322)
      As soon as people who write free software can band together and field something like Microsoft's R&D division, I'm sure the U of W will consider it. It wasn't just software Microsoft contributed; it was the enormous freaking brains that wrote the software. Smart people can make money with their smarts. Most choose to do so. Many go to work for MS because they pay their researchers extremely well. You can blather on all you want about how evil Microsoft is (which isn't possible as corporations are amoral b
      • Re: (Score:2, Informative)

        by Anonymous Coward
        Most of the brains on this project are AT the University of Washington. If you recall (or read the article), Photosynth is a UW project licensed to Microsoft. Not that there aren't amazingly smart people over at MSR, too. They can't help that they're right across the lake from each other (UW & MSR).
    • Don't worry, Google is working on this as well; they will soon recreate the world using your accidentally public Facebook pictures. Not only will they be able to provide 3D city models, but they will also provide you a model of your home, pool, garage, neighborhood, vacation spots, etc. Soon you won't even need to vacation: you can just step into the local Google Holo Deck and be instantly transported to your favorite destination with real-time visual updates!!

      Ain't the future grand?
    • Re: (Score:3, Informative)

      by Anonymous Coward

      There are 2 opensource projects aiming to do similar 3d reconstructions:

      http://code.google.com/p/libmv/ [google.com]
      http://insight3d.sourceforge.net/ [sourceforge.net]

      So while getting those 496 cores would still be a task for you, opensource software _is_ nearly there too.

    • Key parts of this software are available as free software [washington.edu].


    • by spoco2 (322835)

      Why the hell does it matter? Seriously, are you so anti-MS that you can't handle them funding some cool research? Things are allowed to exist without being OS, you know.

      There have been plenty, plenty, plenty of fantastic benefits to mankind done with the help of private industry. Using Open Source or not does not equate to good vs evil

    • Done 'more with free software'? It's original code.

      If you want to license the algorithms you can contact UW and they'll happily come up with an arrangement for you.

      I don't see what bearing Microsoft has or does not have with this project except licensing some of their older technology for Photosynth. Most of the tech used in this project which UW isn't trying to license is open source.

  • by jhsiao (525216) on Wednesday September 16, 2009 @05:00PM (#29446197)
    Photosynth was showcased in a mid 2007 TED talk. You can find it here [ted.com].

    It would be nice to have photosynths of monuments, art, or architecture that have been damaged or destroyed (e.g. the Buddhas dynamited in Afghanistan, the churches that collapsed in the 2009 Italy earthquake) from tourist photos that may be floating out in the interwebs.
    • by afidel (530433)
      Hmm, you just made my day. My dad's aunt has pictures of the Buddhas that she took shortly after the Russians retreated (quite an adventuresome lady she is). I'll ask her if she can make copies from her negatives and mail them to me so I can scan them, in hopes that someone performs such a project.
    • Re: (Score:3, Interesting)

      by TooMuchToDo (882796)
      Don't give me photosynths. Give me full 3D scans and material inventories so the damn thing can be rebuilt from scratch if need be.

      http://www.david-laserscanner.com/ [david-laserscanner.com]

      • Ugh. Wish they knew what "free software" meant... 199 or 299 eur (or $292 / $440 respectively) for a piece of software which will be forever bound to either a single computer, or a single flash drive.

        Cool idea. Would be better free.

        • Was just using it as an example. Perfect solution is open hardware and open software, with just as open data.
  • Why O(n squared)? (Score:4, Interesting)

    by clone53421 (1310749) on Wednesday September 16, 2009 @05:01PM (#29446225) Journal

    Previous versions of the Photo Tourism software matched each photo to every other photo in the set.

    If you're building an entire digital model, wouldn't there be some point at which it would be more efficient to match each new photo to the digital model itself (instead of all the other individual photos)? At that point, the 3D model would be nearly complete, and matching new photos would be closer to O(n), as I see it. Additional photos would primarily only increase the detail/resolution of the existing model.
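A toy sketch of the idea in this comment, not a description of how the UW code works. Each photo is reduced to a set of hypothetical feature IDs (a real system would use image descriptors and approximate matching), and the "model" is simply the union of features seen so far. The function names are assumptions made up for illustration; the sketch only counts how many comparisons each strategy performs.

```python
from typing import List, Set


def all_pairs_comparisons(photos: List[Set[str]]) -> int:
    """Count comparisons when every photo is matched against every other photo."""
    comparisons = 0
    for i in range(len(photos)):
        for j in range(i + 1, len(photos)):
            _shared = photos[i] & photos[j]  # stand-in for photo-to-photo matching
            comparisons += 1
    return comparisons


def incremental_comparisons(photos: List[Set[str]]) -> int:
    """Count comparisons when each photo is matched once against a growing model."""
    model: Set[str] = set()
    comparisons = 0
    for features in photos:
        _shared = features & model  # stand-in for photo-to-model matching
        comparisons += 1
        model |= features           # the new photo's features extend the model
    return comparisons


# Four toy photos whose feature sets overlap in a ring.
photos = [{"a", "b"}, {"b", "c"}, {"c", "d"}, {"d", "a"}]
print(all_pairs_comparisons(photos))    # n * (n - 1) / 2 comparisons
print(incremental_comparisons(photos))  # n comparisons
```

The catch is that a single photo-to-model comparison is not free: its cost depends on how the model is indexed, and a usable model has to be bootstrapped from somewhere before new photos can be matched against it.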

  • The research team also announced their next project: Natalie Portman 3D based on the actress of the same name. The team is asking geeks everywhere for their assistance in providing pictures of her, and of course, grits.
    • Seems doubtful. These photos, as varied as they are, are at least all pointing at a relatively stationary object. Obviously, not every photo of any human would be in the exact same pose.

  • UW website (Score:5, Informative)

    by guido1 (108876) on Wednesday September 16, 2009 @05:05PM (#29446281)

    The team's actual site has more pics and videos, including St. Peter's Basilica, Trevi Fountain, and info on Venice.

    http://grail.cs.washington.edu/rome/ [washington.edu]

  • by chord.wav (599850) on Wednesday September 16, 2009 @05:05PM (#29446291) Journal

    The newly developed code works more than a hundred times faster than the previous version. It first establishes likely matches and then concentrates on those parts.

    It would have been even faster if they had started with the edges and left the sky for the end, like in any other puzzle.

  • FTFS:
    It first establishes likely matches and then concentrates on those parts."

    Sounds like when you are putting together a jigsaw puzzle and you find the edge pieces first and work in from there.

  • Video games (Score:5, Interesting)

    by VinylRecords (1292374) on Wednesday September 16, 2009 @05:07PM (#29446315)

    Imagine if the God of War team could instantly recreate entire cities like this. Or the Fallout 3 team could snap a few thousand photos of Las Vegas and then digitize an entire city within a day and then work out the kinks. Or the Grand Theft Auto developers could recreate New Yo...ahem, Liberty City and then build a perfect 3D model and just slap textures on the buildings.

    Sure it's not a perfect system but this has so much potential to help recreate cities or terrain within video games.

    • Re: (Score:3, Interesting)

      Although your idea is very cool, it would be much easier to use something like the Google Vans to do this. The hard part of this project is figuring out where the cameras were pointing when the pictures were taken. With good geolocation information and an electronic compass you can eliminate that difficult part of the process. I'm sure if you paid enough you could just license the high quality originals used in Street View and do the same thing for a fraction of the cost.
      • Re:Video games (Score:4, Interesting)

        by shutdown -p now (807394) on Wednesday September 16, 2009 @07:32PM (#29448161) Journal

        The way you get pics isn't really a big deal, the interesting part is software that takes them and makes a 3D model out of it.

        But yeah, combining Street View with Photosynth is an obvious thing that comes to mind.

        • by loconet (415875)

          The way you get pics isn't really a big deal, the interesting part is software that takes them and makes a 3D model out of it.

          The way you get the pics _is_ a big deal - it makes the second part (making the 3D model) much harder if you don't have the right information. That is one of the biggest obstacles this project attempts to address, according to the paper. As the paper says, this has somewhat been done before in a controlled environment (e.g. Google Earth) with mostly commissioned aerial photographs where the camera is calibrated, shots are taken at predefined intervals, time is known, GPS aids location, etc., etc. When you remov

        • by Timmmm (636430)

          "But yeah, combining Street View with Photosynth is an obvious thing that comes to mind."

          Been done. I think I saw it in a demo video for an 'augmented reality' phone navigation program, possibly for android. The trouble is you only have two or three views of each point. What *would* be cool is to record a video panorama as you travel round a city. That would be better because

          a) You have many more views, and
          b) You know one frame is taken near the previous and next frames, so you can use optical flow algorith

      • by 4D6963 (933028)

        Isn't that where the new iPhones come in? They have GPS and a compass. But you're right that it would probably simplify things to make it more systematic, especially when you already have all the StreetView data readily available, considering that it's full panoramas with sufficient increments of parallax for anything.

    • I know this sounds ridiculous, but this is the current (insane) state of copyright law. If game companies recreate real cities from tourists' pictures and put them in games, they are violating the copyrights of those tourists. I assume putting pictures on Flickr means neither assigning copyright in them nor giving blanket permission to third parties to do whatever they want.

      If game companies like the current "protection" of the copyright laws, they need to be bound by the same rules.

  • by BigBadBus (653823) on Wednesday September 16, 2009 @05:11PM (#29446363) Homepage
    ....but it would have been if the first coat had dried.
  • by Monkeedude1212 (1560403) on Wednesday September 16, 2009 @05:17PM (#29446439) Journal

    Aren't humans just awesome?

    We build amazing structures that last over a thousand years of constant wear and we invent photography to capture the awe inspiring moments that such marvelous creations cast upon ourselves, then create computers to recreate their 3D Dimensions almost perfectly in a virtual environment using nothing but our pictures that we've taken and our impressive ingenuity.

    If you can read this: Pat yourself on the back.

    • by tool462 (677306)

      I just fed all of the photos ever taken of me into Photosynth, made a 3-D model of myself, and then made the model pat itself on the back. I'm WAY too lazy to lift my own arm that far.

    • by lanceran (1575541)

      Aren't humans just awesome?

      We build amazing structures that last over a thousand years of constant wear and we invent photography to capture the awe inspiring moments that such marvelous creations cast upon ourselves, then create computers to recreate their 3D Dimensions almost perfectly in a virtual environment using nothing but our pictures that we've taken and our impressive ingenuity.

      If you can read this: Pat yourself on the back.

      Now imagine that in a hundred years or so we'd be able to neurally interface with a computer and explore said 3D structures in a pseudo-reality. Now THAT would be amazing.

  • by grumbel (592662) <grumbel@gmx.de> on Wednesday September 16, 2009 @05:20PM (#29446497) Homepage

    It is nice to see that they have optimized the algorithm, but what about the presentation? It looks like it is still just a point cloud, just as it was two years ago. Why isn't it a fully textured 3d model? It shouldn't be that hard to do that when you already have the points in 3d.

    • by mrchaotica (681592) * on Thursday September 17, 2009 @02:56AM (#29451093)

      Why isn't it a fully textured 3d model? It shouldn't be that hard to do that when you already have the points in 3d.

      You might have answered your own question: since developing an algorithm like marching cubes is a solved problem, slapping it on as a post-processing step wouldn't really count as research. These academics are trying to make a cool demo to show off their research, not create a finished product. If they waste too much time polishing it, they risk not getting enough real research done and losing their funding.

  • by Anonymous Coward

    The next step would be to use video as the data source, or even panoramic video like the Google Street View cars [autoblog.com] capture. With such a system, simply driving by a building would already provide thousands of frames from a range of viewpoints. Putting all that together would be immensely computationally intensive, but the result would be 3D models of everything the Google cars have ever filmed.

  • by Anonymous Coward

    Can this be used for Pr0n?

  • I hereby declare this It's-Okay-to-Like-Microsoft-For-a-Day Day. This is pretty cool.

  • If it takes a year for 500 computers, does that mean it'd take a month for 6,000 computers, or a day for 182,500 computers, or an hour for 4,380,000 computers?

    Or, in other words, the original version would cost about $438,000 of EC2 [amazon.com] time.

    The new version takes 21 hours on 496 cores -- again, could you do it in an hour on 10,416 cores? And that becomes about $1,042 of EC2 time at the same $0.10 per hour.

    So, it's not 100 times faster, just 100 times cheaper.
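A quick check of the arithmetic in this comment, as a hedged sketch. The rate of $0.10 per machine-hour is an assumption, inferred from the $438,000 figure divided by 4,380,000 machine-hours; it is not quoted anywhere in the thread. The sketch also assumes perfect parallel scaling and treats one "computer" and one "core" as one billable instance.

```python
# Assumption: flat price per machine-hour, implied by $438,000 / 4,380,000 hours.
ASSUMED_RATE_USD_PER_HOUR = 0.10
HOURS_PER_YEAR = 365 * 24


def machine_hours(machines: int, hours_each: int) -> int:
    """Total machine-hours if the work splits perfectly across machines."""
    return machines * hours_each


old_hours = machine_hours(500, HOURS_PER_YEAR)  # "a year for 500 computers"
new_hours = machine_hours(496, 21)              # "21 hours on 496 cores"

old_cost = old_hours * ASSUMED_RATE_USD_PER_HOUR
new_cost = new_hours * ASSUMED_RATE_USD_PER_HOUR

print(f"old: {old_hours:,} machine-hours, about ${old_cost:,.0f}")
print(f"new: {new_hours:,} core-hours, about ${new_cost:,.2f}")
print(f"cost ratio: about {old_cost / new_cost:.0f}x")
```

Under these assumptions the 10,416 core-hours come to roughly $1,042. The two estimates are not directly comparable, though: the year-long figure in the summary is for 250,000 images, while the 21-hour run used 150,000.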

  • by PPH (736903)

    That Rome simulation had some problems working with Nero.

  • "The data set consists of 150,000 images from Flickr.com"

    It was built with Slave labor. We'll just call it "volunteers" in this case.

  • I'm pretty sure Google Street View already does some of this. Browsing around, it seems to know where the sides of buildings are and let you zoom in on them.

    From what I can see they're not blowing their own trumpet as much as these guys, but it can't be far away that Google Earth will have quite comprehensive 3D models of cities (Tokyo is already amazingly complete, although I don't know if that's an automatic system or not).

  • By the way, Google StreetView has been mentioned, but if you wanted to do an entire city, wouldn't it be simpler to use a bunch of high-res shots taken from a helicopter circling around the city?

    Also, could it be used by the military? To transform the photographic data from recon planes of an area into something that could be used in some simulation program? Imagine playing Call of Duty in the village where you'd have a mission in a few hours.

  • I don't think so.
    After a while you have a set of "high hit" images (the ones that are matched most often). Start with that set.
    If you have a location with 5000 images and after 50 of those images you still don't have a hit for that location, move on to the next location.
    It would save a lot of time.
