Google Businesses Graphics Software The Internet

Google VisualRank for Image Search 63

Google researchers are claiming that a newly developed approach to visual search may do for image searching what PageRank did for text search. "The research paper, 'PageRank for Product Image Search,' is focused on a subset of the images that the giant search engine has cataloged because of the tremendous computing costs required to analyze and compare digital images. To do this for all of the images indexed by the search engine would be impractical, the researchers said. Google does not disclose how many images it has cataloged, but it asserts that its Google Image Search is the 'most comprehensive image search on the Web.'"

  • by planckscale ( 579258 ) on Monday April 28, 2008 @05:26PM (#23229996) Journal
    Still no positive results for ["Natalie Portman" and "Hot Grits"]

  • image game data (Score:3, Interesting)

    by BoldAC ( 735721 ) on Monday April 28, 2008 @05:27PM (#23230014)
    It should be noted that a lot of the prelim data for this was gained through human interaction that Google set up as a game. [google.com]

    I am still playing with the filter by date dropdown url manipulation [tech-recipes.com].
    • by BoldAC ( 735721 )
      Oh, yeah. The original paper is here:

      http://www.docstoc.com/docs/529160/PageRank-for-Product-Image-Search [docstoc.com]
      • >>>"'most comprehensive image search on the Web.'"

        Sub-title:

        Making topless Miley Cyrus photos easier to find than ever before!

    • by Ihmhi ( 1206036 )

      It should be noted that a lot of the prelim data for this was gained through human interaction that Google set up as a game. [google.com]

      1) Thanks for introducing me to that game and evaporating what little free time I had left today. d:

      2) It is interesting to see the responses from someone who does not care. The image? A Mercedes dashboard thermometer. The labels?

      Partner's guesses: jacquelyn is the coolest person, jacquelyn, kayla is cool, kayla, ass, butt, butt cheke, butt ox, mom, dad

      • by Ihmhi ( 1206036 )

        Addendum: I have discovered that I can greatly amuse myself by offering commentary on the images themselves rather than actually try to label them, such as [Jessica Alba Picture] "I bet you are wanking now" and [trippy album cover] "I could do better if / I opened up MS Paint / and had a seizure."

        I might be the first person to be banned from this game... but as in the spirit of Watterson's Calvin I like to make someone's day a little more surreal. :3

  • Excellent! (Score:4, Funny)

    by Tree131 ( 643930 ) on Monday April 28, 2008 @05:27PM (#23230018)
    Sweet!!! More exact pr()n searches!!! Wohooo!!!!
  • Does anyone have the full name/DOI of the paper?
  • Talking about other uses... what about putting those techniques and the "enormous computing power" to work on jobs that are useful to society? They could be used to find mineral ores (maybe by correlating aerial images with geological data?) or for medical analysis (skin cancer? tissue identification?). It wouldn't bring in much direct revenue, but it would surely increase Google's "coolness" a lot (and from a shareholder point of view, it can be very very attractive)
    • from a shareholder point of view, it can be very very attractive
      What's attractive about a lower dividend and a less profitable company? Investors invest to make money, and if they use their money for charity later that's great, but I doubt many would like investment and donation merged without their consent.
      • GSOC?

        I think the problems with what the GP suggests are twofold:
        1) I imagine, analysing aerial images is much harder than your typical photo
        2) Medical analysis would require access to a lot of data, and people already have enough googlefoil hats
  • Low-hanging fruit (Score:1, Interesting)

    by Anonymous Coward

    Although image search has become popular on commercial search engines, results are usually generated today by using cues from the text that is associated with each image.

    Which is a good point. Sometimes you don't want the text associated with the image, you want the image itself.

    The canonical example would be image macros and comic strips. When you're looking for a particular LOLcat or demotivational poster, or even a specific comic strip based on a remembered punchline, the text in the image is what

    • by boyter ( 964910 )
      I actually did this as part of my graduate thesis. I even managed to get a high rate of recognition of text inside images (I could recognise about 80% of the words in my sample of 50 images). While you are right that standard OCR techniques work very well, the bigger problem is extracting the text so you can recognize it. I don't know of any technique that can extract this text that does not also massively scale the problem. I used multiple techniques to do this such as multivalued image decomposi
    • by boyter ( 964910 )
      Sorry about the dual post, I forgot to add I was looking at launching a website which indexes web comic strip text since as you point out it is very easy to extract and identify, but even then you need a targeted approach for each strip.
    • by Ihmhi ( 1206036 )

      Oh No Robot [ohnorobot.com] has been doing this for years in the webcomics world - allowing users to assign text labels to comics. It's basically writing the script for a comic that already exists.

    • When you're looking for a particular LOLcat or demotivational poster, or even a specific comic strip based on a remembered punchline, the text in the image is what you want to be able to search for.
      Aim in ur image, gugling.
  • Is this the same technology that google recently indicated will help "Crack down" on child porn? Or is this yet another different form of doing the search? And if it is different does anyone know if they have plans to put these two technologies+existing methods together to make the engine even more robust?

    I don't expect an answer... but who knows maybe one of the goog guys that are in the know are reading.

  • Here's some background info on the guy: http://en.wikipedia.org/wiki/Mr._Magoo [wikipedia.org]
  • Image search this and that, sure, but why the hell is it still next to impossible to find product reviews using Google? Every time I try I only get product pages in online shops and not a single "real" review.
    • Image search this and that, sure, but why the hell is it still next to impossible to find product reviews using Google? Every time I try I only get product pages in online shops and not a single "real" review.
      Maybe you aren't good at writing your searches...
    • I have to agree, there are much more important things that Google could improve in its product search. Would it be that hard to remove accessories unless the user is clearly looking for one?

      e.g. "Mp3 player" sorted by price [google.co.uk] doesn't show anything but deliberately mistagged headphones and iPod cases.
    • by Whiteox ( 919863 )
      Must agree with you. But much of the problem is the empty 'user reviews' that these shopbot pages encode, so Google thinks that they're worthy of inclusion.
      Frankly, there are a few other annoying bugs. Hopefully they'll be fixed one day. Annoyingly, a lot of other search engines are 'google powered' and have the same faults.
  • Maybe Google does something like this already, but I was thinking...

    Can't they tune their image search by tracking which results get clicked for particular terms? Presumably, the images people click on are more apt to be accurately described by the search terms originally entered, so that's like a 'free' image classification going on constantly.

    For instance, if I put in "green field", I might get a bunch of images, and click on one that shows a grassy prairie. That image could be tagged with
  • The company said that in its research it had concentrated on the 2000 most popular product queries on Google's product search, words such as iPod, Xbox and Zune.

    iPod: look for lots of shiny white

    Zune: look for lots of brown

    Xbox 360: look for red dots in a ring

  • by Danny Rathjens ( 8471 ) <slashdot2@@@rathjens...org> on Monday April 28, 2008 @07:46PM (#23231390)
    I noticed this nifty little program in debian called findimagedupes. The algorithm for fingerprinting the files for comparing similarity is neat. From the man page:

    To calculate an image fingerprint:
    1) Read image.
    2) Resample to 160x160 to standardize size.
    3) Grayscale by reducing saturation.
    4) Blur a lot to get rid of noise.
    5) Normalize to spread out intensity as much as possible.
    6) Equalize to make image as contrasty as possible.
    7) Resample again down to 16x16.
    8) Reduce to 1bpp.
    9) The fingerprint is this raw image data.
    To compare two images for similarity:
    1) Take fingerprint pairs and xor them.
    2) Compute the percentage of 1 bits in the result.
    3) If percentage exceeds threshold, declare files to be similar.
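
The steps quoted above can be sketched in a few lines of Python. This is a simplified illustration, not findimagedupes' actual code: it assumes the image is already decoded into rows of 0-255 grayscale values, and it stands in for the blur, normalize and equalize steps with block averaging and a min/max stretch. The names `fingerprint`, `differing_fraction` and `similar`, and the 0.90 threshold, are hypothetical. Since xor marks the bits that differ, the sketch calls two images similar when the share of matching bits reaches the threshold.

```python
def fingerprint(pixels, size=16):
    """Reduce a grayscale image (list of rows of 0-255 ints) to size*size bits.

    Assumes the image is at least size x size pixels.
    """
    height, width = len(pixels), len(pixels[0])
    cells = []
    for gy in range(size):
        y0, y1 = gy * height // size, (gy + 1) * height // size
        for gx in range(size):
            x0, x1 = gx * width // size, (gx + 1) * width // size
            # Averaging each block downsamples and smooths at the same time.
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    # Stretch intensities to the full range, then keep one bit per cell.
    lo, hi = min(cells), max(cells)
    span = (hi - lo) or 1
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if (value - lo) / span >= 0.5 else 0)
    return bits


def differing_fraction(fp_a, fp_b, size=16):
    """Fraction of 1 bits in the xor of two fingerprints (bits that differ)."""
    return bin(fp_a ^ fp_b).count("1") / (size * size)


def similar(fp_a, fp_b, threshold=0.90, size=16):
    """Declare two fingerprints similar when enough bits match."""
    return 1.0 - differing_fraction(fp_a, fp_b, size) >= threshold
```

With this sketch, a darker, lower-contrast copy of an image should produce the same fingerprint as the original, because the min/max stretch cancels out brightness and contrast changes before the 1-bit reduction.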
    • Great, so any two images have a 1 in 256 chance of matching exactly, and an even higher chance of exceeding the threshold. I like those odds.
      • 1 false positive out of 256 is another way of saying 99.6% accurate. ;) Although in practice it works out to about 98% accurate according to the author; and my own tests don't dispute that. Also bear in mind that this tool is for looking through your own files for dupes, not comparing all images on the internet. :) There are obvious ways to expand the algorithm for larger datasets - and use of more processing power.
      • They have a (1/2)^256 chance of matching,
        so 1 in 115792089237316195423570985008687907853269984665640564039457584007913129639936, which is about 8.6e-78.
        ofc the initial steps will make the odds better than that, but still far longer than 1 in 256.
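
The arithmetic above is easy to check. A minimal sketch, assuming (unrealistically for real photos) that each of the 256 fingerprint bits is an independent fair coin flip; `prob_at_most_differing` is a hypothetical helper that also shows how the odds change once a threshold allows some bits to differ.

```python
from fractions import Fraction
from math import comb  # available in Python 3.8+

BITS = 16 * 16  # a 16x16 fingerprint at 1 bit per pixel

# Chance that two random fingerprints match on every single bit.
exact_match = Fraction(1, 2 ** BITS)


def prob_at_most_differing(k, bits=BITS):
    """Chance that two random fingerprints differ in at most k bits."""
    favourable = sum(comb(bits, i) for i in range(k + 1))
    return Fraction(favourable, 2 ** bits)


print(2 ** BITS)           # the 78-digit denominator quoted above
print(float(exact_match))  # roughly 8.6e-78
```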
    • by sootman ( 158191 )
      Clever. I wonder if it could be adapted to find duplicate text [slashdot.org] files [slashdot.org]? :-)
    • Cool! I didn't know anyone had already done this.

      I had basically the same idea, but I was going to keep the colour information, blur, include a global colour/contrast value (obtained by resampling to "1x1"), use that to colour-correct the image, and then resample to maybe 5x5.

      I figured that for web searches, that should probably be good enough to find lots of alternative images from the same photoshoot or photoset as a sample picture, pictures taken by other photographers of the same scene, or still ima

  • Idea for Google (Score:2, Interesting)

    by Cantus ( 582758 )
    Here's an idea for Google that's been on my mind for several months. Yes, I'm giving it out for free.

    Let me upload an image from my hard drive to Google and have them check it against the zillion images in their catalog. Then give me a page with all the similar copies it could find, with a thumbnail and the URL from where it originates.

    One practical use I can think of: Someone you meet on the web sends you a photo claiming to be of him/herself. With this Google utility, you could upload that same image and ha
    • You just want Google to search for more porn for your collection. Lazy wanker...
    • by boyter ( 964910 )
      I seriously doubt google or anyone else has enough computing power to pull this off in real time. My guess is it would take several hours at least to run through any amount of images to make it worthwhile. It would be useful, but the scale of the problem is too large to be practical.
    • This would be really easy too just by taking an MD5 hash of the image file, then you could search for duplicates of images anywhere on the web. In fact it would be awesome to have this capability for ANY file; just type in your MD5 hash and get a list of links to different places hosting the file. Great for finding lost MP3s, remembering the source of where you downloaded that image, etc...
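
A minimal sketch of the MD5 idea, using Python's standard `hashlib`. The `build_index` helper and the example URLs are hypothetical; a real service would hash files as it crawled them. Note that a cryptographic hash only matches byte-identical files, so a resized or re-encoded copy of the same picture gets a completely different digest.

```python
import hashlib
from collections import defaultdict


def md5_hex(data: bytes) -> str:
    """Hex MD5 digest of a file's raw bytes."""
    return hashlib.md5(data).hexdigest()


def build_index(files):
    """Map each digest to every URL hosting a byte-identical file.

    `files` is an iterable of (url, bytes) pairs.
    """
    index = defaultdict(list)
    for url, data in files:
        index[md5_hex(data)].append(url)
    return index


# Hypothetical crawl results: two identical files and one that differs.
crawl = [
    ("http://example.com/a.jpg", b"same bytes"),
    ("http://example.org/copy.jpg", b"same bytes"),
    ("http://example.net/other.jpg", b"different bytes"),
]
index = build_index(crawl)
print(index[md5_hex(b"same bytes")])
```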
    • by dargaud ( 518470 )
      I wrote google with that exact same request something like a week after they unveiled 'image search' in, what, 2000 ? Fat good it did.
  • Hail Google! Self-proclaimed King of Everything! Go ye forth and Do No Evil (except in Russia and China -- oh, yeah, and that bit about caving to the Feds on your users' personal data)!
  • (NO WAI [google.com])
  • Is it just me, or has anybody else noticed that Google doesn't make much effort to catalog the photos on Flickr, which is incidentally owned by Yahoo?

    Or is it that Yahoo is blocking Google?????

    Needless to say, if you search for a restricted set in Yahoo image search, you will pull up all of the Flickr photos. The same search in Google will often yield nothing from Flickr.
