Rome, Built In a Day
spmallick writes "Researchers at the University of Washington, in collaboration with Microsoft, have recreated the city of Rome in 3D using images obtained from Flickr. The data set consists of 150,000 images from Flickr.com associated with the tags 'Rome' or 'Roma,' and it took 21 hours on 496 compute cores to create a 3D digital model. Unlike Photosynth / Photo Tourism, the goal was to reconstruct an entire city and not just individual landmarks. Previous versions of the Photo Tourism software matched each photo to every other photo in the set. But as the number of photos increases the number of matches explodes, increasing with the square of the number of photos. A set of 250,000 images would take at least a year for 500 computers to process... A million photos would take more than a decade! The newly developed code works more than a hundred times faster than the previous version. It first establishes likely matches and then concentrates on those parts."
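For a rough feel for the "increasing with the square" claim in the summary, here is a small back-of-the-envelope sketch. It assumes that run time is simply proportional to the number of photo pairs and takes the summary's "250,000 images in a year on 500 computers" as the reference point. The function names are made up for illustration and are not from the researchers' code.

```python
def pair_count(n: int) -> int:
    """Unordered photo pairs when every photo is matched against every other."""
    return n * (n - 1) // 2


def scaled_years(n: int, ref_n: int = 250_000, ref_years: float = 1.0) -> float:
    """Extrapolate run time from the reference point, assuming cost ~ pair count."""
    return ref_years * pair_count(n) / pair_count(ref_n)


for n in (150_000, 250_000, 1_000_000):
    print(f"{n:>9,} photos: {pair_count(n):>15,} pairs, "
          f"~{scaled_years(n):.1f} years on 500 computers")
```

Under that assumption, four times as many photos means about sixteen times the work, which squares with the summary's "more than a decade" for a million photos.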
legality (Score:4, Insightful)
As far as I can tell... (Score:5, Interesting)
As far as I can tell, after skimming TFA and watching the little demo video, they weren't actually copying the pictures, but using them to build a 3D model.
It would be kind of like aggregating a bunch of books in the library to come up with a letter distribution chart. You're not violating the copyrights of the authors, just compiling information from raw data.
Re: (Score:3, Interesting)
And I hope that they didn't mess up by getting pictures from this Roma [google.com].
It actually has a ruin of a monastery too [wikimedia.org], so it's easy to get confused.
And this name confusion has actually caused some mail to take the long way around by having a turnaround in Italy.
Re: (Score:1)
And let's not forget this Roma [google.com]. It also has (oldish) stone buildings [flickr.com] and it also results in name confusion. For instance, I've encountered several Australians who thought that the "Roma tomatoes" that they get in the supermarket are called that because they come from Queensland!
Maybe they did mess up by getting pictures from this Roma. That would explain why all their images seem to be full of flies.
Re: (Score:2)
Or, for that matter, this Roma [google.com]. Not to mention this Roma [google.com]. Tony, anyone? [google.com]
Re: (Score:2)
Or the Romany people, sometimes called Roma, who are the largest of the 3 main groups of gypsies.
Re: (Score:1)
2006 (Score:2)
Re: (Score:2, Interesting)
Some interesting questions arise here. Where exactly is the line between aggregate and specific? Would it be OK if they used the photographs to texture the model? Would it depend on how many photographs were averaged to generate the texel?
Re:As far as I can tell... (Score:5, Informative)
Re: (Score:2)
Re:As far as I can tell... (Score:4, Insightful)
Re: (Score:3, Informative)
Re:As far as I can tell... (Score:4, Informative)
Re: (Score:2)
They are using the photos to get relative position information on the things in the photo. That information would not be subject to copyright.
Re: (Score:2)
Re: (Score:2)
Cool, but... (Score:2, Interesting)
I wish this were done more with free software rather than with help from the Beast from Redmond. I'm certain the faculty at UW are completely familiar enough with free software that they could have made this work without MS's help.
Re:Cool, but... (Score:5, Insightful)
I'm certain the faculty at UW are completely familiar enough with free software that they could have made this work without MS's help.
150,000 photos. 21 Hours. 496 Cores. That makes it a labor intensive, computation intensive project. None of that comes "free as in beer."
Sure it does (Score:4, Insightful)
...None of that comes "free as in beer."...
150,000 photos.
From Flickr. It's not like some poor bastard was paid to be out there photographing for weeks.
21 Hours. 496 Cores.
Don't recall folding@home or seti@home paying me anything.
In short - who wouldn't pony up a few days of computing power to have a fully open 3D model of some of Earth's greatest landmarks? We only need someone to write the code to distribute the work; the basic framework for distributed computation is already in place.
Re: (Score:2)
From Flickr. It's not like some poor bastard was paid to be out there photographing for weeks.
No. But Flickr simplifies the problem if you are building a model of a world destination-city like Rome or Venice.
What interests me more is the possibility of building models of cities and landmarks across time. Perhaps using sources other than photographs. Lincoln's Washington. New York City in 1939.
who wouldn't pony up a few days of computing power to have a fully open 3D model of some of earths greatest landmar
Re: (Score:2)
What interests me more is the possibility of building models of cities and landmarks across time. Perhaps using sources other than photographs. Lincoln's Washington. New York City in 1939.
Me too but that's not what Microsoft or these researchers are doing, so it's not really related to my response.
That's still a serious commitment - and we could be talking weeks or months
So what? The results last forever. And any one person doesn't have to be serious, it's not like I run Folding all the time - but in aggr
Re: (Score:2)
Most university research groups do not have funds to buy bits of computing time here and there. For a project like this, the research group more likely has a dedicated computing cluster bought with grant money or sponsor money.
Re:Cool, but... (Score:5, Funny)
Hell yeah! I come to Slashdot for the slightly outdated stories, stay to read the comments of disillusioned Linux fanboys.
Re: (Score:1)
Re: (Score:2)
Possibly, but I'm not sure Hugin or another free equivalent could manage something quite on this scale. Microsoft appears to have customized matching algorithms and provided some pretty staggering amounts of computing power. Short of using something similar to folding@home, I can't imagine the school doing this.
Microsoft gets publicity and some new algorithms for image stitching, the school gets funding for a research project.
Re:Cool, but... (Score:4, Informative)
Re: (Score:2)
And how many of those servers are filled to the brim with grad students parallel processing versions of a glorified infinite loop?
Re: (Score:2)
Possibly, but I'm not sure Hugin or another free equivalent could manage something quite on this scale. Microsoft appears to have customized matching algorithms and provided some pretty staggering amounts of computing power. Short of using something similar to folding@home, I can't imagine the school doing this.
Microsoft gets publicity and some new algorithms for image stitching, the school gets funding for a research project.
Well, one of the authors works for Microsoft Research - so I doubt any FOSS projects were ever considered. Microsoft also donates bucketloads of money to UW's CSE department, and the faculty have lots of Microsoft ties. Bill Gates has been an invited speaker over here many times.
The CSE department is very high caliber, don't get me wrong. But it is widely perceived by the rest of the campus as a Microsoft shop.
Re: (Score:2)
Re: (Score:2, Informative)
Re: (Score:1)
Ain't the future grand?
Re: (Score:3, Informative)
There are 2 opensource projects aiming to do similar 3d reconstructions:
http://code.google.com/p/libmv/ [google.com]
http://insight3d.sourceforge.net/ [sourceforge.net]
So while getting those 496 cores would still be a task for you, opensource software _is_ nearly there too.
Re: (Score:2)
Key parts of this software are available as free software [washington.edu].
Re: (Score:2)
Why the hell does it matter? Seriously, are you so anti-MS that you can't handle them funding some cool research? Things are allowed to exist without being OS, you know.
There have been plenty, plenty, plenty of fantastic benefits to mankind done with the help of private industry. Using Open Source or not does not equate to good vs. evil.
Re: (Score:2)
Done 'more with free software'? It's original code.
If you want to license the algorithms you can contact UW and they'll happily come up with an arrangement for you.
I don't see what bearing Microsoft has or does not have with this project except licensing some of their older technology for Photosynth. Most of the tech used in this project which UW isn't trying to license is open source.
Re:Cool, but... (Score:4, Funny)
Well, now that Microsoft has done it, somebody will try to copy them by driving around Rome in a car that takes pictures of everything around it. Oh wait, http://maps.google.com/maps?f=q&hl=en&g=colosseo,+roma&ie=UTF8&layer=c&cbll=41.891293,12.49059&panoid=haogKvGCLWGZlNYPmGLLPA&cbp=11,130.48,,0,-7.13&ll=41.891294,12.490585&spn=0.002588,0.009645&t=h&z=17 [google.com]
Re: (Score:2)
TED talk with a 2007 version (Score:4, Informative)
It would be nice to have photosynths of monuments, art, or architecture that have been damaged or destroyed (e.g. the Buddhas dynamited in Afghanistan, the churches that collapsed in the 2009 Italy earthquake) from tourist photos that may be floating out in the interwebs.
Re: (Score:2)
Re: (Score:3, Interesting)
http://www.david-laserscanner.com/ [david-laserscanner.com]
Re: (Score:2)
Ugh. Wish they knew what "free software" meant... 199 or 299 eur (or $292 / $440 respectively) for a piece of software which will be forever bound to either a single computer, or a single flash drive.
Cool idea. Would be better free.
Re: (Score:2)
Why O(n squared)? (Score:4, Interesting)
Previous versions of the Photo Tourism software matched each photo to every other photo in the set.
If you're building an entire digital model, wouldn't there be some point at which it would be more efficient to match each new photo to the digital model itself (instead of all the other individual photos)? At that point, the 3D model would be nearly complete, and matching new photos would be closer to O(n), as I see it. Additional photos would primarily only increase the detail/resolution of the existing model.
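The "establish likely matches first" idea from the summary can be pictured with a toy inverted index. This is only a sketch under loud assumptions: each photo is reduced to a set of made-up integer feature IDs, and `candidate_pairs` and `min_shared` are hypothetical names. It is a generic stand-in, not the team's actual method, and it does not match photos against a 3D model as suggested above.

```python
from collections import defaultdict
from itertools import combinations


def candidate_pairs(photos: dict[str, set[int]],
                    min_shared: int = 2) -> set[tuple[str, str]]:
    """Return photo pairs that share at least `min_shared` feature IDs."""
    # Inverted index: feature ID -> names of the photos containing it.
    index: dict[int, list[str]] = defaultdict(list)
    for name, features in photos.items():
        for feature in features:
            index[feature].append(name)

    # Count shared features only for photos that co-occur under some feature.
    shared: dict[tuple[str, str], int] = defaultdict(int)
    for names in index.values():
        for a, b in combinations(sorted(names), 2):
            shared[(a, b)] += 1

    return {pair for pair, count in shared.items() if count >= min_shared}


# Hypothetical feature sets, invented purely for illustration.
photos = {
    "colosseum_1": {1, 2, 3, 4},
    "colosseum_2": {2, 3, 4, 5},
    "trevi_1": {10, 11, 12},
    "trevi_2": {11, 12, 13},
    "pantheon_1": {20, 21, 4},
}
all_pairs = len(photos) * (len(photos) - 1) // 2
likely = candidate_pairs(photos)
print(f"all pairs: {all_pairs}, candidate pairs: {len(likely)}")
```

Only the candidate pairs would then go through the expensive geometric matching. The saving depends on features being reasonably distinctive: a feature that appears in every photo would still generate every pair.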
Re: (Score:2)
O(n) is so vague in your statement. The obvious "what is 'n'?" answer is the number of previous photos added.
Well, yes.
if there are n terms, and each takes n time it's O(n^2).
Yeah, that's what is being said in TFS.
I'm no graphics whiz, but it seems impossible that finding where in the model a certain picture is would be constant time. However, I wonder if comparing against the model would allow for a significant speedup by removing "fluff" - images that contribute nearly the same information. There are probably thousands of pictures of the Coliseum in their sample, yet a subset of that would be sufficient.
Basically, this. By comparing against the model, at a certain point you'd have mostly overlaps, and at that point, the model would no longer grow significantly, and comparing new pictures to it would be closer to a constant, which is why I suggested that it could approach O(n). Of course, this point would be hard to reach... although, if you intentionally took photos to map the entire 3D scene, you'd want to build the model fairly quickly and efficiently, at
Additionally (Score:2, Funny)
Would that work? (Score:2)
Seems doubtful. These photos, as varied as they are, are at least all pointing at a relatively stationary object. Obviously, not every photo of any human would be in the exact same pose.
UW website (Score:5, Informative)
The team's actual site has more pics and videos, including St. Peter's Basilica, Trevi Fountain, and info on Venice.
http://grail.cs.washington.edu/rome/ [washington.edu]
Puzzle solving techniques (Score:5, Funny)
It would have been even faster if they'd started with the edges and left the sky for the end, like in any other puzzle.
Just like a jigsaw puzzle. (Score:2)
FTFS:
It first establishes likely matches and then concentrates on those parts."
Sounds like when you are putting together a jigsaw puzzle and you find the edge pieces first and work in from there.
Video games (Score:5, Interesting)
Imagine if the God of War team could instantly recreate entire cities like this. Or the Fallout 3 team could snap a few thousand photos of Las Vegas and then digitize an entire city within a day and then work out the kinks. Or the Grand Theft Auto developers could recreate New Yo...ahem, Liberty City and then build a perfect 3D model and just slap textures on the buildings.
Sure it's not a perfect system but this has so much potential to help recreate cities or terrain within video games.
Re: (Score:1, Offtopic)
Wow.
Govender said the church would also seek a donation to be used in its work with young people. He did not specify how much the company would be asked to pay.
See, it's really about the money, not whatever "desecration" they claim. Blackmail is right.
"We are concerned about the amount of violence in these games," McKie said Monday. "It's real for us. We are living the reality here. It's not just a game."
Yes, because in reality, you're clearly fighting against Chimera.
Re: (Score:3, Interesting)
Re:Video games (Score:4, Interesting)
The way you get pics isn't really a big deal, the interesting part is software that takes them and makes a 3D model out of it.
But yeah, combining Street View with Photosynth is an obvious thing that comes to mind.
Re: (Score:2)
The way you get pics isn't really a big deal, the interesting part is software that takes them and makes a 3D model out of it.
The way you get the pics _is_ a big deal - it makes the second part (making the 3D model) much harder if you don't have the right information. That is one of the biggest obstacles this project attempts to address, according to the paper. As the paper says, this has somewhat been done before in a controlled environment (e.g. Google Earth) with mostly commissioned aerial photographs where the camera is calibrated, shots are taken at predefined intervals, time is known, GPS aids location, etc., etc. When you remov
Re: (Score:2)
"But yeah, combining Street View with Photosynth is an obvious thing that comes to mind."
Been done. I think I saw it in a demo video for an 'augmented reality' phone navigation program, possibly for android. The trouble is you only have two or three views of each point. What *would* be cool is to record a video panorama as you travel round a city. That would be better because
a) You have many more views, and
b) You know one frame is taken near the previous and next frames, so you can use optical flow algorith
Re: (Score:2)
Isn't that where the new iPhones come in? They have GPS and a compass. But you're right that it would probably simplify things to make it more systematic, especially when you already have all the StreetView data readily available, considering that it's full panoramas with sufficient increments of parallax for anything.
Copyright? (Score:2)
I know this sounds ridiculous, but this is the current (insane) state of the copyright laws we have. If game companies recreate real cities from tourists' pictures and put them in games, they are violating the copyrights of those tourists. I assume putting pictures on Flickr does not mean assigning copyright to them or giving blanket permission to 3rd parties to do whatever they want.
If game companies like the current "protection" of the copyright laws, they need to be bound by the same rules.
Rome WASN'T built in a day... (Score:3, Funny)
I don't know what else to say... (Score:5, Funny)
Aren't humans just awesome?
We build amazing structures that last over a thousand years of constant wear and we invent photography to capture the awe inspiring moments that such marvelous creations cast upon ourselves, then create computers to recreate their 3D Dimensions almost perfectly in a virtual environment using nothing but our pictures that we've taken and our impressive ingenuity.
If you can read this: Pat yourself on the back.
Re: (Score:2)
I just fed all of the photos ever taken of me into Photosynth, made a 3-D model of myself, and then made the model pat itself on the back. I'm WAY too lazy to lift my own arm that far.
Re: (Score:1)
Aren't humans just awesome?
We build amazing structures that last over a thousand years of constant wear and we invent photography to capture the awe inspiring moments that such marvelous creations cast upon ourselves, then create computers to recreate their 3D Dimensions almost perfectly in a virtual environment using nothing but our pictures that we've taken and our impressive ingenuity.
If you can read this: Pat yourself on the back.
Now imagine in a hundred years or so we'd be able to neurally interface with a computer and explore said 3D structures in a pseudo-reality. Now THAT would be amazing.
Still just a point cloud? (Score:5, Insightful)
It is nice to see that they have optimized the algorithm, but what about the presentation? It looks like it is still just a point cloud, just as it was two years ago. Why isn't it a fully textured 3d model? It shouldn't be that hard to do that when you already have the points in 3d.
Re: (Score:1)
they already have colors for dots... I can't imagine it's much more expensive than what they are already doing.
Re: (Score:2)
I would like to know how they decide which color would be more appropriate for a point.
Meh, would you go very wrong if you just settled for the median or mean value?
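As a quick illustration of the mean-versus-median question, here is a minimal sketch. It assumes each 3D point has been seen in several photos and that each sighting yields one RGB sample. The sample values and the `combine_color` helper are invented for the example and say nothing about how the researchers actually color their points.

```python
from statistics import mean, median


def combine_color(samples: list[tuple[int, int, int]],
                  how: str = "median") -> tuple[int, ...]:
    """Reduce several RGB samples of one point to a single color, per channel."""
    reducer = median if how == "median" else mean
    # zip(*samples) regroups the samples into one sequence per channel.
    return tuple(round(reducer(channel)) for channel in zip(*samples))


# Four sightings of a sunlit stone wall, plus one taken in deep shadow.
samples = [
    (200, 180, 150),
    (198, 182, 149),
    (205, 179, 152),
    (202, 181, 151),
    (40, 40, 60),
]
print("median:", combine_color(samples, "median"))
print("mean:  ", combine_color(samples, "mean"))
```

With made-up samples like these, the median shrugs off the single shadowed shot while the mean gets dragged toward it, so "which one" mostly comes down to how many outlier photos you expect.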
Re:Still just a point cloud? (Score:4, Insightful)
You might have answered your own question: since developing an algorithm like marching cubes is a solved problem, slapping it on as a post-processing step wouldn't really count as research. These academics are trying to make a cool demo to show off their research, not create a finished product. If they waste too much time polishing it, they risk not getting enough real research done and losing their funding.
Re: (Score:1)
The impressive part of this isn't the 3D reconstruction (that's been done many times before, though perhaps not on this scale), it's that they've done it with such a disorganized, incomplete data set as Flickr. Using Google Street View data (particularly with the locations already known) would be computationally much easier, but requires paying people to drive around with cameras on the roof.
Re: (Score:2)
You're completely right: the story here is beginning with disordered data. However, if Hans Rosling did an animated graphic of where all the photos in Rome are taken, it would look more organized than most African countries, and might put some ant hills to shame.
Using (panoramic) video as data source? (Score:2, Interesting)
The next step would be to use video as the data source, or even panoramic video like the Google Street View cars [autoblog.com] capture. With such a system, simply driving by a building would provide thousands of frames from a range of viewpoints already. Putting all that together would be immensely computationally intensive, but the result would be 3D models of everything the Google cars have ever filmed.
The obvious question... (Score:1, Funny)
Can this be used for Pr0n?
Microsoft (Score:2)
I hereby declare this It's-Okay-to-Like-Microsoft-For-a-Day Day. This is pretty cool.
Embarrassingly parallel? (Score:2)
If it takes a year for 500 computers, does that mean it'd take a month for 6,000 computers, or a day for 182,500 computers, or an hour for 4,380,000 computers?
Or, in other words, the original version would cost about $438,000 of EC2 [amazon.com] time.
The new version takes 21 hours on 496 cores -- again, could you do it in an hour on 10,416 cores? And that becomes $1,416 of EC2 time.
So, it's not 100 times faster, just 100 times cheaper.
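The arithmetic above can be checked with a few lines. This sketch assumes the job is perfectly parallel and infers the hourly rate from the $438,000 figure rather than from any actual EC2 price list. It also glosses over the fact that the summary's one-year estimate is for 250,000 images while the 21-hour run used 150,000.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours

# Old version: 500 computers for one year (figure from the summary).
old_hours = 500 * HOURS_PER_YEAR
# New version: 496 cores for 21 hours.
new_hours = 496 * 21

# Hourly rate implied by the $438,000 estimate (an inference, not a quoted price).
rate = 438_000 / old_hours

print(f"old: {old_hours:,} computer-hours")
print(f"new: {new_hours:,} core-hours")
print(f"implied rate: ${rate:.2f} per hour")
print(f"new cost at that rate: ${new_hours * rate:,.2f}")
print(f"a month needs {500 * 12:,} computers, a day needs {500 * 365:,}")
```

At the implied rate the new run comes out nearer $1,042 than the $1,416 quoted, so that figure may assume a different price per core. Either way, the conclusion about total compute cost being the thing that shrank still holds.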
Problem (Score:2)
That Rome simulation had some problems working with Nero.
Re: (Score:2)
So like Rome... (Score:1)
It was built with Slave labor. We'll just call it "volunteers" in this case.
Been done? (Score:1)
I'm pretty sure Google Street View already does some of this. Browsing around, it seems to know where the sides of buildings are and let you zoom in on them.
From what I can see they're not blowing their own trumpet as much as these guys, but it can't be far away that Google Earth will have quite comprehensive 3D models of cities (Tokyo is already amazingly complete, although I don't know if that's an automatic system or not).
Do it from heli? (Score:2)
By the way, Google StreetView has been mentioned, but if you wanted to do an entire city, wouldn't it be simpler to use a bunch of high-res shots taken from a helicopter circling around the city?
Also, could it be used by the military? To transform the photographic data from recon planes of an area into something that could be used in some simulation program? Imagine playing Call of Duty in the village where you'd have a mission in a few hours.
a year?? (Score:2)
I don't think so.
After a while you have a set of "high hit" images (the ones that are matched the most). Start with that set.
If you have a location with 5000 images and after 50 of those images you still don't have a hit for that location, move on to the next location.
It would save a lot of time.