Finding Yourself With Photo Recognition
itchyfish writes "You are lost in a foreign city, you don't speak the language and you are late for your meeting. What do you do? Take out your cellphone, photograph the nearest building and press send.
For a small fee, photo recognition software on a remote server works out precisely where you are, and sends back directions that will get you to your destination.
Seems a little far fetched, but amazingly cool if it really works."
GIS technologies (Score:5, Informative)
To me, it would appear that an easier solution might be to use GIS data in combination with the cell phone signal and comparisons of rough morphological features of buildings. The instructions should simply be: Point your camera at a building near you so that you can approximate its outline and then send that image. This would scale much larger than the methods referenced in the article as you would not have to store every detail of the buildings surrounding you including pixel maps of textures and color. This approach could be handled for a large city by a few commodity servers whereas the other approach would require significantly more computational resources.
Imagine how difficult it would be to capture details like that in a major city such as NYC? I don't really need directions to find my way around Cambridge city center as you could almost throw a rock from the center and hit just about every building around, but London, Washington, Houston etc... are another story and the data required from their approach would require massive computational infrastructure.
massive computational infrastructure (Score:4, Funny)
Re:GIS technologies (Score:5, Insightful)
And I fear that there won't be enough "lost tourists" to make this a paying proposition.
But how much would it be worth to professional historians -- or to Hollywood, or just to you personally -- to be able to "walk" through a virtual representation of New York circa 1890?
Or London in Holbein's time?
This is one of those projects -- much like something called ARPANET, which had to rely on government handouts but later made some guy named Steve Case a fortune -- that will never fund itself but will be of literally incalculable value to posterity.
Let's be realistic -- physics and fanaticism not being mutually incompatible, eventually "freedom fighters" -- whether named Atta, McVeigh, or Patrick Magee -- will make bee-lines for our biggest cities, carrying suitcases filled with two precisely machined hemispheres of plutonium. And everything but the maps of those cities will be lost.
Hopefully the Cambridge researchers will by then have completed their -- apparently -- quixotic project, and we will at least have, in redundant storage, a rather precise picture of what will be lost to radioactive ruin, a snapshot of urban life in the twenty-first century.
Re:GIS technologies (Score:2, Redundant)
Re:GIS technologies (Score:4, Interesting)
Because of lens geometry, even if two pictures were taken from the same spot and at the same angle, distortion from the lens could make one image appear magnified, or concave, or simply show a different degree of detail.
Since this system works by identifying geometric shapes and outlines in an image and then comparing them to a database, differences in lens curvature ought to give results that don't reflect the true geometry of a building.
So it'd be interesting to find out whether, and how, they solved this problem.
Re:GIS technologies (Score:3, Insightful)
This is one of those Enigma-style puzzles: you can solve it because you know there is a solution. You know that the image is of a building and that the building contains long straight edges. Edge detection would pull out several long curves instead. The problem is reduced to finding the lens function to apply that produces the most lo
Re:GIS technologies (Score:4, Insightful)
For most people even photos aren't necessary. Using the data from the nearest cells, a mobile can be pinpointed to within 100 m (in urban terrain; it degrades to about 1000 m in rural terrain). It's enough if you are sent a map of the neighbourhood - the biggest problem then is the size of your mobile's screen.
Such services (sending a map of the neighbourhood, with interesting points, like ATMs, marked) already exist in Poland (Europe). I find them quite helpful.
So rather than send out... (Score:4, Insightful)
Sounds like a solution in search of a problem.
It's cheaper than the current solution.... (Score:5, Funny)
GPS? (Score:5, Insightful)
Re:GPS? (Score:5, Interesting)
I believe this could actually be really cool if we get it to work, especially in an urban environment, but not so much out in the desert or anywhere; it's not meant for that. Instead, it's for finding that office building in Portugal when you're about to be late for a business appointment, and yet you've never been to Portugal before.
Re:GPS? (Score:5, Informative)
Anyway most cellphone networks can triangulate your position to within a block.
As for photo recognition being MORE accurate, I can't see how. To get your position to within a few hundred feet you'd need to know the exact parameterization of the lens, the zoom, the angle of the camera... unlikely.
Getting GPRS to work correctly in a foreign country so that you can make such a request is hard enough to begin with.
Re:GPS? (Score:2)
It's not more accurate now, but at least this kind of new technology gives us a renewed reason to MAKE it more accurate, and some reason to keep going in the field of image processing. So just because something isn't perfectly convenient now, doesn't mean with a few years work an
Re:GPS? (Score:5, Interesting)
The system being open-source is also not a big deal. Sure it'd be nice to have an OS library which could find similar images of buildings but the real value would be in the dataset which almost certainly wouldn't be free.
Also this makes no allowance for similar buildings. The company I work for has a campus where 5 of the 7 buildings are cookie cutter - how would it deal with that situation?
Nokia has a street in Helsinki with a whole bunch of identical buildings... same problem.
What about mirror glass buildings?
Sure it might work great if you are lost outside the Transamerica Pyramid, or the Flatiron Building, or maybe the Houses of Parliament, but god help you if you are lost in the latest "homely community for comfortable family living".
My real point... (Score:3, Insightful)
This is cool technology, and research into this kind of thing is cool. But it's just not commercial IN THIS FORM.
The best application i can think of is for publishers to be able to find a crappy image using google and then submit it to corbis or any other pro image library and ask for a high quality shot of the same scene... but i'm not that inventive.
Re:GPS? (Score:2)
GPS is flawed in that it's under governmental control, and it's outside of the world itself, where any number of things could go wrong.
The system being open-source is also not a big deal. Sure it'd be nice to have an OS library which could find similar images of buildings but the real value would be in the dataset which almost certainly wouldn't be free.
I'm spe
Re:GPS? (Score:3, Insightful)
Stellar travel is a fairly well solved problem, plenty software can predict what the stars
Re:GPS? (Score:2)
Re:GPS? (Score:2)
This is more apt to help you in China or Korea or Vietnam.
Re:GPS? (Score:2)
my brain works in weird ways at night...
Re:GPS? (Score:4, Funny)
Re:GPS? (Score:3, Insightful)
And this system can be shut down at any time on any local judge's injunction. Which is a lot more likely than the US Military degrading something large parts of the US has come to depend on.
IF there were a Civilian GPS, then this would
So you believe they'll turn off the military GPS on a whim, but will have absolutely no plan to deal with civilian GPS?
What if there's some other feature about the regio
Re:GPS? (Score:5, Funny)
Then I think you're going to be late for that meeting.
Depends.... (Score:3, Funny)
Why not GPS... (Score:2)
Well why worry about that and just use Cell-triangulation which is already used in many applications of this sort in Europe.
Great concept... but it's already been solved much better. GPS adds accuracy but costs money to put in the phone (some do have it though). Location based elements are already accessible on Symbian devices and will be accessible on all next gen Java devices via the Location APIs.
This is a pointless solution to a problem that has been solved.
Re:GPS? (Score:2)
One of the premises for the solution is that you are in a large city where tall buildings interfere with the signal from a GPS. Furthermore they use basic cellphone positioning using the mobile network to find a "target" to search around.
I still find it a bit hard to believe that you could make this work on a large scale though. Computer vision seems to be one of those stupid hard things to get right. It's really fun though.
Personally I think their suggestion for use is a bit off though.
try it in my neighborhood (Score:4, Funny)
Problems (Score:5, Insightful)
Re:Problems (Score:2)
Re:Problems (Score:3, Insightful)
Re:Problems (Score:3, Insightful)
WTF is this supposed to mean?
Maybe when you finally grow the cojones to step out of your SUV, you'll discover people in New York and Chicago aren't the monsters you seem to think they are.
Re:Problems (Score:2)
Not only that, I've seen people here in my home in Berea being beaten to death with broomsticks *yeah, broomsticks can do a lot of damage when swung by baseball players*, and people do little more than sit and watch, or even just keep walking their ways.
It's not because it's New York, it's because people will be people no matter where you go. A person you can trust, but people are a cluster of chaos, and interaction with them should be intel
Re:Problems (Score:2, Insightful)
Re:Problems (Score:3, Interesting)
Re:Problems (Score:2)
Well, I don't speak Japanese at all, however I can say this: "YUTAKA BILD 1F Yokoyama-cho Hachioji-shi Tokyo -- mappu nanitozo" and give him the map and a pen. He will circle where you are, where you'd better be, and how to get there - all without saying a word. I'm curious, BTW, would a native Japanese understand what I wrote above?
Re:Problems (Score:2)
Re:Problems (Score:2)
My example is based on idea that the other human knows where he is (which is usually true), and you do not need to draw a map for him to know - you give him the map so that he can draw the route for you. So essentially my example can be distilled to the following requests:
Re:Problems (Score:4, Informative)
You would clearly have a library of objects (e.g. buildings) on the servers. When a picture is sent, the service would perform some sort of feature extraction, and calculate the invariants of the objects in the scene. It would then see if these objects nearly matched any in the database. If they did, it would project possible matches onto the image and look for edges around the model. If there was good correlation (accepting the fact that the match would not be perfect because of moveable objects) it would return the name of the building.
Prof. Cipolla lectures me on (surprise, surprise...) Computer Vision. You can find his lecture handouts here [cam.ac.uk]. (The projection handout, page 46 onwards, talks about the process I have just described.)
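The matching step described above can be sketched roughly as follows. This is only an illustration, not the Cambridge system: the feature vectors, the building names, and the `max_dist` and `min_fraction` thresholds are all made up for the example, and the final "project the model onto the image and check the edges" verification step is left out.

```python
import math

def count_matches(query_feats, building_feats, max_dist=0.1):
    """Count query features that have a close counterpart in one building's model."""
    return sum(
        1
        for q in query_feats
        if min(math.dist(q, b) for b in building_feats) <= max_dist
    )

def identify(query_feats, library, min_fraction=0.6):
    """Return the best-matching building name, or None if the match is too weak.

    Features that match nothing (cars, pedestrians, other clutter) only lower
    the matched fraction; they do not rule a building out on their own.
    """
    best_name, best_score = None, 0
    for name, feats in library.items():
        score = count_matches(query_feats, feats)
        if score > best_score:
            best_name, best_score = name, score
    if query_feats and best_score / len(query_feats) >= min_fraction:
        return best_name
    return None

# Hypothetical library: each building is a list of 2-D "invariant" feature vectors.
library = {
    "building_a": [(0.10, 0.90), (0.35, 0.40), (0.80, 0.20), (0.55, 0.75)],
    "building_b": [(0.20, 0.20), (0.60, 0.10), (0.90, 0.90), (0.45, 0.50)],
}

# Three slightly noisy features of building_a plus one from a parked car.
query = [(0.11, 0.91), (0.34, 0.41), (0.79, 0.21), (0.50, 0.05)]
result = identify(query, library)
```

With these made-up numbers, three of the four query features land near `building_a`, so the clutter feature does not prevent a match. A real system would need far richer descriptors and an indexed search rather than a linear scan.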
Ideally (Score:5, Insightful)
I wouldn't invest too much into this technology, as I think it'll be obsoleted before it comes to fruition.
-PHiZ
Re:Ideally (Score:5, Interesting)
Meanwhile, investing in this technology gives us a reason to improve image detection and image processing. It gives us a reason to build the technologies needed to digitally map our world, which could be useful for anything and everything, including finding the best way out of a highrise during a fire, or even police task forces on drug busts... there's really no end to what a Digital Map can do that GPS will just never have the capability of doing.
Re:Ideally (Score:2)
Once again, as we saw in 2002, GPS quality can be degraded at any moment, even taken wholly offline. Not to mention the "Act of God" possibilities to knock it offline *meteorites, solar flares, etc etc*.
Yeah, but the doors of a building can get a new layer of paint. Which is more likely?
Re:Ideally (Score:2)
Updating something as silly as paint wouldn't affect much, but things like adding a new door or shutters might do the trick. But once again, if enough of the points line up, then
Re:Ideally (Score:3, Insightful)
Re:Ideally (Score:2)
*I know I'm going to portugal, im landing in an hour fifteen. Let's go download the digital map of the region I'm going to be in, so that when I'm there, I have what I need. *
Re:Ideally (Score:2)
I would expect it to be cheaper than maintaining something in space, and more reliable once up and running. You've gotta remember that GPS is old technology and has a lot of its bugs worked out, so it's obviously going to be ch
Re:Ideally (Score:2)
Won't work (Score:5, Funny)
Oh good lord. (Score:5, Insightful)
GPS Does This (Score:4, Interesting)
This seems infinitely more useful... (Score:3, Interesting)
Finding Yourself With Photo Recognition (Score:4, Insightful)
what kind of meeting is this, that is held where you do not know the language and have no clue how to get around? did you parachute to the meeting but miss the building?
what happened to phrasebooks?
man i'm bitter...
Re:Finding Yourself With Photo Recognition (Score:2)
Imagine driving up to Quebec and trying to find your way around. Most everyone speaks French. I know enough to say, "help me, I'm lost"... but I doubt I could interpret their directions.
Re:Finding Yourself With Photo Recognition (Score:3, Interesting)
Just show them your non-US passport and it's impressive how good their english suddenly becomes
Well, this trick worked for years in eastern european countries where the only language in common with the people there was German, but they didn't really want to talk to you unless they were sure you weren't from Germany. Seems that they lost that curse now
So, let me get this straight... (Score:4, Insightful)
Does such a database exist? Could it possibly work without bringing up false positives? I mean, I don't have figures, but there are millions of buildings in any large urban area, they all have multiple sides, and they all look radically different at different times of day. We're talking storage space that seems like it would be incredibly difficult to manage, let alone search efficiently and return good results from a cell-phone camera image.
Count me as a skeptic.
Re:So, let me get this straight... (Score:3, Informative)
Re:So, let me get this straight... (Score:3, Interesting)
This may speed up the search, but doesn't really change anything about the sheer amount of data (and the difficulty to collect it)
"Finding Yourself" (Score:2)
Better than GPS . (Score:2)
1. GPS doesn't work well in cities with tall buildings where sky is obscured by large buildings.
2. GPS has only 10m accuracy. This is important when you're giving pedestrians directions (eg cross the street and enter the second door on your right).
3. Unlike GPS or cell-phone base station approaches, this method gives information specific to the direction the user is facing (eg cross the street and enter the second
Re:Better than GPS . (Score:2)
Re:Better than GPS . (Score:3, Informative)
And how will this improve on 10m accuracy? Will you have to submit your camera lens's focal length as well in order to determine the distance from the photographed objects? Humans generally can't tell the difference between a 20mm lens photographing at 40m vs. a 35mm lens at 70m but this software can supposedly get 1m accuracy levels? I very much doubt
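A quick back-of-the-envelope check of the focal length point above, assuming an ideal pinhole camera and a hypothetical flat facade 10 m tall (the facade height is made up for the example): the size of the facade on the sensor is roughly focal length times object height divided by distance, so the two setups mentioned give the same image size.

```python
def projected_size_mm(focal_length_mm, object_size_m, distance_m):
    """Pinhole approximation: image size on the sensor = f * H / d.

    With f in millimetres and H, d both in metres, the result is in millimetres.
    """
    return focal_length_mm * object_size_m / distance_m

FACADE_HEIGHT_M = 10  # hypothetical building height, for illustration only

wide = projected_size_mm(20, FACADE_HEIGHT_M, 40)  # 20mm lens at 40m
tele = projected_size_mm(35, FACADE_HEIGHT_M, 70)  # 35mm lens at 70m

print(wide, tele)  # both 5.0 mm on the sensor
```

So from the size of a single flat facade alone, distance and focal length can't be separated. What does differ between the two shots is perspective: objects at different depths change size relative to one another, which is presumably the kind of cue any such system would have to lean on if the lens is unknown.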
Why? (Score:2, Insightful)
implementation steps... (Score:2)
2) buy 50 world atlases
3) buy 50 multimedia phones
4) ?
5) profit
Little far fetched (Score:4, Interesting)
Hand gestures? (Score:5, Insightful)
Find the nearest native, start talking and gesturing wildly. Point at a map or street sign and say the name of the place you are looking for. They'll figure it out.
Sorry I just don't see this one.
Re:Hand gestures? (Score:5, Funny)
i read a copy of the directions ... (Score:2, Funny)
What if they don't have street signs (Score:2)
Which comes in handy today as a barrier for foreign corporations like FedEx, who need street addresses to operate.
What if... (Score:5, Funny)
Hell, you could be lost for days.
Re:What if... (Score:2)
How do you compile the dataset? (Score:3, Insightful)
Obviously a skilled surveyor could work it out, but that transforms this photographing job into a highly skilled position, making it many times more expensive.
If it weren't for that then you could probably pay students 10c a photo.
A great idea... (Score:2)
Seriously, this feat is practically impossible. I guess, if you try hard, you can cover a few downtown areas. However the resolution of those little cameras is ridiculously bad. Add variable lighting conditions (day/night, sunrise/noon/sunset), add random camera angle and tilt, and seasonal changes, and local construction, and all you end up with is a fuzzy picture of something.
GPS is
It will never be used... (Score:2, Funny)
This honestly seems pretty far-fetched. If you can't take the time to get directions, chances are that you deserve to die. You know, that whole thing of natural selection:-)
What do you do? (Score:5, Insightful)
You start thinking about what the hell is so important that you go to a foreign country to have a meeting where people don't understand your language, and you bet all your chances on the assumption that your cellphone will find a carrier that will allow you data transfer without a subscription plan. If the meeting in the foreign land is so important, I would think that you would do a little more homework than to just depend on a cameraphone!!
Re:What do you do? (Score:5, Insightful)
If I am in a country whose language I do not understand, my secret plan would be to take a taxi cab from the hotel (and back). If the meeting is so important, I cannot trust an inexperienced traveller (myself) to deliver me to the location and back.
Make sure you take a good picture... (Score:5, Funny)
Mod parent up, please. (Score:2, Insightful)
That last bit is what the article is really reporting on--research into intelligent computer vision. The fact that this research is being applied to giving walking directions to stupid humans has far more to do with securi
Overthinking the problem (Score:2)
Re:Overthinking the problem (Score:2)
... and here [pagesjaunes.fr] is the Paris visual streetscape map I mentioned above. It's awesome, and doesn't just cover Paris.
The London one is spread out over several unrelated sites.
Alternate use #1 (Score:3, Funny)
Many problems (Score:2, Insightful)
1) The database would have to be huge -- Not every meeting or event that I attend takes place in the city center.
2) Along the same lines, they need to store every face of these buildings.
3) The image processing better be really good at color correction and noise filtering (weather, blurry photos)
4) Wouldn't people just go buy a map?
5) Wouldn't distortions introduced from a cell-phone lens make the system less accurate?
- rabs
How? (Score:2, Funny)
Sounds like a solution ... (Score:2, Insightful)
... for which we do not yet have an adequate problem
Great if you suffer from sudden memory loss (Score:2)
Alternate uses...? (Score:2, Interesting)
I'd rather use the phone to call whoever it is I have a meeting with and ask them how to get there. If they don't speak my language, what'm I doing meeting with them in the first place?
I'm wondering what the alternate uses of this technology might be, because I just can't see this as being a common problem. Could it actually be designed, say, for a missile to target a landmark by sight?
D'oh (Score:2, Funny)
I may be drunk (Score:2)
There are better solutions (Score:2)
What do you DO?!? (Score:3, Informative)
A: Do just like all of the other PHBs who were stupid enough to get stuck like that, i.e. screw the meeting, find the nearest bar, and start blowing the company expense account on cheap booze and hookers.
Why buildings? (Score:2)
Brilliant Ploy... (Score:2)
Am I the only one who had a few other uses for the tech pop into their head?
Why Bother? (Score:5, Informative)
This seems like an overly complicated solution. At the moment, my phone in Japan has a feature where I log on to Vodafone's website (from the phone) and click through a couple of links and then it tells me where I am. I assume it gets this information by figuring out which cell the phone is dialing from. From the subsequent menus, there are various options like "find the last train to station X", "find the nearest place to catch a taxi", and so on. A few months ago it was only available in Japanese, but now they've introduced a bilingual version - hoochie mamma.
Why bother using the fancy-dancy image recognition software when cellular telephony has a built-in system that basically acts like a constantly-updated "user location" variable?
(Actually, the answer is simple - to make geeks foam at the mouth. Come on now people!! Excess ain't rebellion.)
You take a picture of your friend... (Score:2, Funny)
*** Turns out your buddy was flagged for a terrorist. ***
darn.
party over.
but, it was a computer mismatch.
great.
and, you got a map of your location.
party on!
Sounds like Iraq to me (Score:3, Funny)
What do you do?
You pull out your night-vision goggles and target a nearby heap of rubble that used to be one of Saddam's "palaces." The goggles lock onto the complex geometric shapes and this information is automatically transmitted to a massive cluster of Crays in New Jersey on loan from the NSA. Using state of the art satellite x-ray photography and next-generation neural-net AI (NGNNAI), your precise location is calculated and relayed back to you, all at the price of only 3 million dollars an hour. What, you didn't think it was the energizer bunny keeping all those Crays up and running, did you?
Lee
Same Same but different. you can do this today. (Score:5, Interesting)
Hold the phone up to the radio till it gets disconnected.
Wait.
A text message will arrive with the name of the song.
It costs about 50p. Disclaimer: I do not work for or have any involvement in this venture, except that friends of mine built it.
The inventor's 2 cents worth (Score:5, Informative)
Wow! Thank you very much for all your comments on this mobile phone navigation system. I thought I'd throw in my 2 cents worth since I'm one of the people who invented it! Forgive the lack of structure in what follows, but I'm trying to address several different issues raised throughout this discussion...
Yes, another way of doing this is radio signal triangulation (including GPS). But actually, this method doesn't work too well in cities because of things like multipath effects and satellite visibility (BTW our system isn't designed to work outside urban environments). GPS car navigation systems rely on a combination of GPS and inertial sensors, i.e. they take a sort of average of a large number of inaccurate readings to get a good fix on position. But the simpler positioning strategies are unlikely to give good enough accuracy to establish on which side of the street you are standing (and in any case, they don't tell you what direction you're looking in). GPS is also expensive: most people would not be prepared to pay more for a phone with an in-built GPS receiver - but camera phones are already selling well.
No, we're not going to build a database of every building in the world! But a good place to start would be large city centres. FYI what motivated us to invent this system was the familiar problem of getting lost outside London tube stations. Obviously I know which tube station I'm at but I don't usually know which exit I took or what direction I'm facing. Of course I can retrieve a local map via my mobile phone. But the problem is I'm missing that critical "you are here" dot that tells me where to start. This is where our system comes in: by providing the dot (well, an arrow actually because it tells you which direction you're looking in too).
In practice, building a database is easier than you might think. Probably we could do it with nothing more than a video camera attached to a car. Granted, someone will have to drive down the streets of interest, but only once (and this shouldn't be too difficult in somewhere like New York).
Finally, no, movable objects don't cause too many problems. The system uses a feature based strategy that is robust to 'clutter' in the form of things like cars, pedestrians, changing shop window displays, etc. That being said, there will always be ways of confusing it, e.g. by demolishing a building. But supposing that picture messages will one day cost as little as text messages do now, a system that works almost instantaneously and gets it right 99% of the time sounds as if it might have some commercial potential at least. And what if the hypothetical tourist isn't lost but just interested? For example, the system could return information about the history of any building of interest in the middle of Venice.
Yours,
Duncan Robertson
So what about winter ? (Score:3, Interesting)
Of course, if it is raining on the day you take your picture you are left with a lot of noise, etc. etc.
I saw the field of high-level image recognition up close a few years ago. While the particular paper [utwente.nl] that the person who did the research wrote was about stereographic recognition of (simple) 3D objects, it shows a great deal about the processing power required to correct an occluded part of a scene, or to work under darker or lighter circumstances (p117-). I expect that in a 2D recognition the same problems rear their ugly head and make things a whole lot harder.
A better way... (Score:3, Insightful)
Ha!! I tricked you, it isn't a better way at all!!
Collaborative GPS mapping (Score:3, Interesting)
> However, the system's commercial future is uncertain.
> "The question is: how much are people prepared to pay
> for it, and how often will they use it?" says Rob Morland,
> of technology consultants Scientific Generics near Cambridge.
> "That's a tough one."
I've posted earlier on this...
The solution could be to use cell phones + cameras + GPS to effectively do collaborative cartography. i.e. units could be both consumers and producers of information - both raw picture data and processed maps - like much of the internet today.
A person could take pictures or video, each frame having a GPS timespace-stamp, and load it onto his computer at home, which could then participate with thousands of other computers in feature extraction using freely available mapping sources like TIGER data [census.gov]. Annotations to mapping information could include: GPS timespace stamps, voice or text annotation, accelerometer data (for data on observer orientation and acceleration). The latter could also help improve feature extraction from multiple images in a video (for eg: Intel OpenCV [sourceforge.net] vision library uses stereo cameras for feature detection).
Throw in concepts like local P2P exchanges by mobile units (e.g. my PDA has GPS, your cellphone has a camera & GPRS, both can communicate over bluetooth --> potential for a win-win situation for us both), distributed image storage and feature extraction, novel types of feature recognition (e.g. ATM screens, McDonald's outlets), and multiple freenet-like distributed cartography servers, and the concept can get quite interesting: for users, and potentially also for cartography vendors, even though they will have to improve their value proposition.
Hotels are good for directions (Score:4, Insightful)
You go into the nearest hotel and ask the nice English-speaking person behind the reception desk.
Even on Mars the hotel receptionists speak perfectly-accented English.
2 Simple reasons (Score:3, Interesting)
2. Triangular positioning. The concept has been in various media articles: parents can use this technology, provided by the mobile phone company, to keep track of their child, with 3 towers used to calculate the position of the phone. It wouldn't be hard to implement a service whereby you dial a number and are immediately given your location.
The point is, as cool as the idea is, practically speaking, it's a very long way of solving a problem that's already solved!
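For what it's worth, the three-tower calculation mentioned above is not much code. A minimal sketch, assuming the network already has a distance estimate from each tower (in practice that would come from timing or signal strength and be noisy) and that everything sits on a flat 2-D grid in metres. The tower coordinates and the phone position below are invented for the example:

```python
import math

def trilaterate(towers, distances):
    """Locate a point from three known tower positions and ranges to each.

    Subtracting the first circle equation from the other two cancels the
    squared unknowns, leaving a 2x2 linear system solved by Cramer's rule.
    """
    (x1, y1), (x2, y2), (x3, y3) = towers
    r1, r2, r3 = distances

    a1, b1 = 2 * (x2 - x1), 2 * (y2 - y1)
    c1 = r1**2 - r2**2 + x2**2 - x1**2 + y2**2 - y1**2
    a2, b2 = 2 * (x3 - x1), 2 * (y3 - y1)
    c2 = r1**2 - r3**2 + x3**2 - x1**2 + y3**2 - y1**2

    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        raise ValueError("towers are collinear; position is ambiguous")

    x = (c1 * b2 - c2 * b1) / det
    y = (a1 * c2 - a2 * c1) / det
    return x, y

# Hypothetical layout: three towers on a 1 km grid, phone at (300, 400).
towers = [(0.0, 0.0), (1000.0, 0.0), (0.0, 1000.0)]
phone = (300.0, 400.0)
distances = [math.dist(phone, t) for t in towers]

estimate = trilaterate(towers, distances)
```

With exact distances this recovers the phone position. With real, noisy ranges the three circles won't meet at a point, so you'd want more towers and a least-squares fit, which is where the "within a block" sort of accuracy comes from.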