Google AI Technology

Google's Second Brain: How the Knowledge Graph Changes Search

Posted by samzenpus
from the is-this-what-you-meant? dept.
waderoush writes "Last spring Google introduced its English-speaking users to the Knowledge Graph, a vast semantic graph of real-world entities and properties born from the Freebase project at Metaweb Technologies (which Google acquired in 2010). This month Google began showing Knowledge Graph results to speakers of seven other languages. Though the project has received little coverage, the consequences could be as far-reaching as previous overhauls to Google's infrastructure, such as the introduction of universal search back in 2007. That's because the Knowledge Graph plugs a big hole in Google's technology: the lack of a common-sense understanding of the things in its Web index. Despite all the statistical magic that made Google's keyword-based retrieval techniques so effective, 'We didn't ever represent the real world properly in the computer,' says Google senior vice president of engineering Amit Singhal. He says the Knowledge Graph represents a 'baby step' toward future computer systems that can intuit what humans are searching for and respond with exact answers, rather than the classic ten blue links. 'Now, when you encounter the letters T-A-J-M-A-H-A-L on any Web page, the computers suddenly start understanding that this document is about the monument, and this one is about the musician, and this one is about a restaurant,' Singhal says. 'That 'aboutness' is foundational to building the search of tomorrow.'"

  • by Press2ToContinue (2424598) * on Wednesday December 12, 2012 @09:19PM (#42268059)

    Sorry, but I fail to see how this is so different from all those other messy "graphing" methodologies and so-called analytical tools that have laboriously forced themselves into my workspace only to writhe around awhile and die because they have overly-specialized utility, and waste more screen space than Outlook 2013 [arstechnica.net] i.e. mindmaps [google.com], flowcharts, music maps [google.com], radar graphs, bubble diagrams [google.com], et al, not to mention the hundreds of failed graphical programming languages [google.com].

    Call me skeptical, but I think it will end up in the Google Graveyard Of Flops [wordstream.com].

  • by ArcadeNut (85398) on Wednesday December 12, 2012 @09:27PM (#42268139) Homepage

    While Watson was somewhat specific to Jeopardy, I'm sure the same principles could be applied to Google Searches.

  • by bessie (212155) on Wednesday December 12, 2012 @09:31PM (#42268165)

    ... and what appears to be its associated features.

    Eg. When I search on my Android phone, there is *no way* to force it to do unweighted searches for keyword prevalence, or even a reasonable approximation thereof (while trying to avoid SOE-seeding keyword-heavy websites, for example).

    I always get stuff that Google "thinks I want", and I get that little nicely-formatted shorthand result set up-top as the first result (a map, a fact or figure, a schedule), and it waits around for awhile before returning the rest of the results.

    I don't want Google to give me what it thinks I want, or SHOULD want, or even what "most people" want - I want a pure result set based on simple pattern matching in the dataset.

    I know it's a lot more complex than that under the hood, and subject to all kinds of definitions of what a "match" is. But now I am inundated with "apps you may be interested in" and other items for sale or marketing tie-ins or "latest and greatest", and not very often what I'm actually searching for.

    I wish Google would let you turn off all those pre-guessing "features" for folks like me who just want to search for particular, unweighted things.

    - Tim

    • by bessie (212155)

      'scuse me - SEO, not SOE. :-)

      - Tim

    • by gr8_phk (621180)
      You know you can quote things right? There are other ways to help google find what you want.
      • by islisis (589694) on Wednesday December 12, 2012 @10:04PM (#42268467) Homepage

        Currently I have to quote almost every keyword due to the issues drawn in parent and compounded by the change from + syntax in the old system.

        Search is not what it used to be. These days sites are more interested in telling you what to search for than in asking.

      • by bessie (212155)

        You know you can quote things right? There are other ways to help google find what you want.

        Yes indeed. It helps, but still not as much as it used to before they changed the searching methodology a year or two ago. I liked the older syntax as well (using the plus instead of quotes) - not as bad on the desktop, but quoting things on a phone is a pain.

        - Tim

      • by Inda (580031)
        You know there's a "verbatim" option?

        http://www.google.co.uk/search?q=verbatim&tbs=li:1

        "&tbs=li:1" in the query string does it (I think).

        There must be a custom search for Firefox's search bar - if there isn't, it's a 5 minute job creating one.
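A custom search of the kind described here is essentially a URL template. The following is a minimal sketch, assuming the `tbs=li:1` parameter behaves as the comment above suggests (the commenter was not certain, and Google may have changed it since). The function name `verbatim_url` is hypothetical.

```python
from urllib.parse import urlencode

# Base URL taken from the comment above.
BASE = "http://www.google.co.uk/search"

def verbatim_url(query):
    """Build a search URL that requests the 'verbatim' option.

    Assumption: tbs=li:1 turns verbatim mode on, as the comment claims.
    """
    # safe=":" keeps the colon in "li:1" readable instead of encoding it as %3A.
    params = urlencode({"q": query, "tbs": "li:1"}, safe=":")
    return BASE + "?" + params

print(verbatim_url("wpf metafile"))
```

A browser search-bar entry would use the same string with the query portion replaced by the browser's own placeholder.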

        I don't understand all this Google hate. They gave us so much in the search arena. They still cater for the nerds. They continue to offer me more than I offer them. Why all the hate?
        • by Inda (580031)
          No, I'm not sorry for replying to my own post.

          There are add-ons, option changes, defaults, search toolbars, custom searches, plug-ins and my aunty Fanny available for all the browsers if you want vanilla search results.

          Google just told me so.

          I'm done with the internet for today.
    • So use another search engine. Bing and Yahoo! still exist. Heck, AltaVista still exists. So do Metacrawler and Dogpile. Go back in time, my friend, until you are happy.
      • So use another search engine. Bing and Yahoo! still exist. Heck, AltaVista still exists. So do Metacrawler and Dogpile. Go back in time, my friend, until you are happy.

        Thanks to Google, the majority of results on most of those search engines is a steaming pile of fake linkfarm websites. That's not to say you shouldn't go try other search engines - Google is the main target of the SEO that leads to the linkfarms and it does a pretty poor job of avoiding them. But they were better back before Google, when well planned (often boolean heavy) searches were more likely to lead to relevant results.

        • As another commenter pointed out, all of the search engines I listed actually use a combination of Google, Bing, and Yandex results. All three search engines penalize and ban people for linkfarming, and at least Bing uses metrics based on something similar to Google's PageRank methodology. In fact there have been popular images in circulation that criticize Bing for doing exactly what GGP requested: not attempting a deeper semantic analysis of the input query (specifically regarding the query 'movie where n

      • by TheLink (130905)
        Yahoo uses Bing.
        Altavista uses Yahoo.
        Metacrawler uses Yahoo, Google and Yandex.
        Dogpile uses Yahoo, Google and Yandex.

        DuckDuckGo uses Yahoo but allegedly also does some other stuff.
    • by jenningsthecat (1525947) on Wednesday December 12, 2012 @11:08PM (#42268907)

      I don't want Google to give me what it thinks I want, or SHOULD want, or even what "most people" want - I want a pure result set based on simple pattern matching in the dataset.

      This, exactly. For my purposes, Google has become significantly more inconvenient to use, and its results much less useful, over the past 5 years or more. I now have to use an 'allintext' operator for almost every search, and often the directive is simply ignored. And increasingly I have to put double quotes around every search term, because otherwise I get results that contain Google's idea of synonyms, (and not-so-synonyms), of my search terms; the 'synonyms' almost universally represent irrelevant junk.

      From the sound of it, with this new initiative Google is about to become entirely useless for 90+ percent of my searches. I really have no objection to them coming up with new filtering and relational algorithms - just let me turn off all of the preemptive, predictive, and utterly wrong crap-filled processing so I can get the info I need without trying to figure out how to game their intrusive filters and pointless predictions.

      FWIW, the fact that Google DOESN'T differentiate between 'Taj Mahal' the palace and 'Taj Mahal' the artist without deliberate user prompting is a GOOD thing. Such ambiguities are the stuff of diversity, the pathway to new knowledge, and the breeding ground for new associations, ideas, and connections. Google's latest bit of shiny is a path to sterility, and a dumbing-down of the Web.

      Google has always been good at mining the Web for raw data, but they have always totally sucked at divining and predicting needs and intentions, and I really wish they'd just stop trying.
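The workaround described in this comment (an `allintext` operator plus double quotes around every term) can be sketched as a small query builder. This is only an illustration: the helper name `strict_query` is hypothetical, and, as the comment notes, the engine may still ignore the directive.

```python
def strict_query(terms, allintext=True):
    """Quote every term and optionally prefix the allintext: operator."""
    # Strip embedded double quotes so each term stays one quoted unit.
    quoted = " ".join('"{}"'.format(t.replace('"', "")) for t in terms)
    return "allintext: " + quoted if allintext else quoted

print(strict_query(["wpf", "metafile"]))
```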

      • This, exactly. For my purposes, Google has become significantly more inconvenient to use, and its results much less useful, over the past 5 years or more. I now have to use an 'allintext' operator for almost every search, and often the directive is simply ignored. And increasingly I have to put double quotes around every search term, because otherwise I get results that contain Google's idea of synonyms, (and not-so-synonyms), of my search terms; the 'synonyms' almost universally represent irrelevant junk.

        I often start from the advanced search page when I don't feel like typing out behavior-modifying operators and I too have problems with things like allintext just being ignored in my results or some of my words being replaced with a list of supposed synonyms that don't make any sense for me.

      • by Anonymous Coward

        And the search should be repeatable - at least within a suitably short time frame e.g. the next day.

      • The problem is that the processing-filled system appeals to the masses. Niche search - in this case defined as searches by people who know how to exactly specify what they mean with great precision - will never be as common as poorly specified searches that benefit from correction and prediction, even if it's done badly. Turning off such features would be wonderful but the demand for them to do so is probably pretty limited.

    • by skids (119237)

      I wish Google would let you turn off all those pre-guessing "features" for folks like me who just want to search for particular, unweighted things.

      Really I'm not at all bothered by Google putting in new methodologies and features. What bothers me is, as you mention, they don't give you much ability to tweak the site's behavior. That in combination with their tendency to just discontinue things on a whim really has me searching for alternatives more often. I need reliable tools, not this-month's surprise package.

      You might try DuckDuckGo. I find their coverage to be a bit thin still, but those that were using it before me say it's improving.

    • I don't want Google to give me what it thinks I want, or SHOULD want, or even what "most people" want - I want a pure result set based on simple pattern matching in the dataset.

      Then you shouldn't use Google, as they have cared since their early beginnings about the things you do not want. Their secret page ranking and search heuristics are the main reason why they became the most popular search engine. (Not that I ever had any problems finding things with HotBot, wonder what happened to it BTW.)

    • I don't want Google to give me what it thinks I want, or SHOULD want, or even what "most people" want - I want a pure result set based on simple pattern matching in the dataset.

      Well, Google dwarfed AltaVista by giving back the results people usually wanted, and NOT the results of simple pattern matching, because even back in those days the results of pattern matching were always random junk, created specifically to match common patterns. It's called search engine spam.

      Creating a useful metric for "relevance" is what makes a good search engine.

    • by Sockatume (732728)

      The worst thing is when pressure from relevance starts to generate genuinely useless results. For example, Google considers "opens" and "closes" to be synonyms in certain contexts. (You can do a search with "closes" as a term and find "opens" boldened in search results as a match.) Bafflingly.

  • by 140Mandak262Jamuna (970587) on Wednesday December 12, 2012 @09:38PM (#42268233) Journal
    mm. Knowledge Graph eh? Google, Well played. well played.

    Now I counter your Artificial Intelligence with my natural stupidity. Check. Mate.

    Game Over. Boing!

  • by Anonymous Coward

    There's some intelligence on the side that formulates the query, too. Sometimes we know what we're searching for, and don't need the system second guessing us all the damned time.

    • by bessie (212155)

      I agree. I felt Google results were the best and most accurate (for what I was searching for) about 2 years ago or so.
      I could be very specific and get back exactly the results I wanted.

      Now that's pretty impossible for more obscure searches.

      • by Kalten (20368)
        That would be why I have the "Google Searches Exactly What You Type" and "gooverbatim" Greasemonkey scripts. They mitigate a lot of the general crappiness of Google search these days. (I started using them after I tried searching for a way to convert from a WPF Visual to a Windows Metafile, and Google kept insisting that I must mean to be searching for 'wmf' and 'metafile' instead of 'wpf' and 'metafile'.)
        • by SnowZero (92219)

          There seems to be very little misunderstanding if I just type your actual question:
          https://www.google.com/search?q=convert+from+a+WPF+Visual+to+a+Windows+Metafile [google.com]

          One thing that I think trips up people who used web search for a long time is that you drop words you don't think are important for keyword searches, but that actually hurts now that search engines use more than keywords. Keyword spam killed keyword search a decade ago, and regular people could not use pure keyword search anyway; so

      • Yeah, in the last year or so especially they seem to have gone to shit.

        I have to quote half the words I type in, and then it still sometimes decides to only give me three or four results for what I wanted, then a second section full of crap that has nothing to do with it.

        The only new thing I like is the typo/misspelling detection, and that's only because it's actually helpful and very easy and straightforward to bypass entirely.

        I think they're trying to make it easier for people who don't know how to search

    • by Noughmad (1044096)

      First rule of software design: Never assume intelligence between keyboard and chair.

  • by wvmarle (1070040) on Wednesday December 12, 2012 @09:53PM (#42268375)

    A few years ago, I got my hands on a vegetable that I didn't know. And I was curious what it was, but how to search for something you don't know the name of? That's something that's really tricky.

    So I grabbed the vegetable, put it next to my computer, opened google.com, and typed in "what vegetable is this?", for not having any better ideas.

    Lo and behold, the search results came back, including some image results, and the first images were of a fennel - exactly the vegetable that I had on the table next to me. Perfect result, couldn't be better.

    • by LordLucless (582312) on Thursday December 13, 2012 @12:25AM (#42269293)

      That's a cool story, but it really has nothing to do with the article. It's basically a fortuitous coincidence that other people don't know what fennel looks like, and have blogged about it associated with the phrase "what vegetable is this?"

      This looks like it's primarily interested with homonyms - words with different meanings, but the same arrangement of letters. Like, say, "Prince". Prince could refer either to the title, a particular holder of that title, a brandname, or a bunch of other things. Think wikipedia's disambiguation page. This technology is basically giving google the ability to determine which particular meaning a given instance of the word is talking about, given context.

      For instance, if a page contained the phrase "Taj Mahal menu", Google would know internally that that page referred to the Taj Mahal restaurant because it has a sufficient knowledge of semantics and context to understand that monuments and musicians don't have menus, but restaurants do, and that the phrase "Taj Mahal" could refer to any of those things.
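The "Taj Mahal menu" example above can be sketched as a toy word-overlap scorer. This is not Google's method, only an illustration of using context to pick a sense. The sense labels and the hand-written vocabularies are assumptions invented for the example.

```python
import re

# Hypothetical context vocabularies; a real system would learn these from data.
SENSES = {
    "Taj Mahal (monument)": {"agra", "mausoleum", "marble", "mughal", "tourist"},
    "Taj Mahal (musician)": {"blues", "album", "guitar", "concert", "tour"},
    "Taj Mahal (restaurant)": {"menu", "curry", "reservation", "dinner", "takeaway"},
}

def guess_sense(page_text):
    """Return the sense whose vocabulary overlaps the page most, or None."""
    words = set(re.findall(r"[a-z]+", page_text.lower()))
    scores = {sense: len(words & vocab) for sense, vocab in SENSES.items()}
    best = max(scores, key=scores.get)
    # With no contextual evidence at all, refuse to guess.
    return best if scores[best] > 0 else None

print(guess_sense("Taj Mahal menu"))
```

A bare mention of "Taj Mahal" with no surrounding context returns `None` here, which mirrors the point that the disambiguation depends on the rest of the page.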

      • by wvmarle (1070040)

        That's a cool story, but it really has nothing to do with the article.

        You deserve "insightful" points for that much more than I deserve to be modded "informative" because of course it's a coincidence. Yet it's one of those jaw-dropping, how-could-this-be kind of coincidences, that actually are pretty funny if they happen.

        • by Anonymous Coward
          Last thanksgiving, at the dinner table, someone asked, "What is this vegetable?" It was fennel. Personally, I felt sad my 65+ year old relatives had no idea, while a 20+ year old recognized it. BTW, it seems the most popular image on Google's, "What is this vegetable?" is kohlrabi. In my personal experience in the deep south, the most checkout confusion is caused by buying rhubarb.
      • by dkf (304284)

        This looks like it's primarily interested with homonyms - words with different meanings, but the same arrangement of letters. Like, say, "Prince". Prince could refer either to the title, a particular holder of that title, a brandname, or a bunch of other things. Think wikipedia's disambiguation page. This technology is basically giving google the ability to determine which particular meaning a given instance of the word is talking about, given context.

        That's not a bad explanation, but the real magic is that it's ascribing a set of meanings to a word or phrase according to the nature of the clustering of web pages that mention it. Then, they can split up things like search results according to the potential significant meaning sets as one of the first things, without particular regard for just how popular the particular uses of the term are with respect to each other. In effect, it's automatically ascribing the meaning according to the potential context g

      • by Khalid (31037)

        No, this is not a coincidence. It means that fennel is sufficiently rare in English-speaking countries that at least a certain number of people will try to figure out what vegetable it is, exactly as the author did. So his situation was not unique. I have encountered many situations where many people were asking the same questions as I did and looking for the answer on the web.

  • by Ralph Spoilsport (673134) on Wednesday December 12, 2012 @09:53PM (#42268379) Journal
    'Now, when you encounter the letters T-A-J-M-A-H-A-L on any Web page, the computers suddenly start understanding that this document is about the monument, and this one is about the musician, and this one is about a restaurant,' Singhal says. 'That 'aboutness' is foundational to building the search of tomorrow.'"

    Or, type:
    taj mahal
    and then follow that with:

    "monument" or
    "musician" or
    "restaurant"

    Depending on what you're fucking looking for.

    • 'Now, when you encounter the letters T-A-J-M-A-H-A-L on any Web page, the computers suddenly start understanding that this document is about the monument, and this one is about the musician, and this one is about a restaurant,' Singhal says. 'That 'aboutness' is foundational to building the search of tomorrow.'"

      Or, type:

      taj mahal

      and then follow that with:

      "monument" or

      "musician" or

      "restaurant"

      Depending on what you're fucking looking for.

      Well, yes... I think you missed the point though. The knowledge graph doesn't interpret what you type, it interprets the pages that are searched for. So if you search for "taj mahal restaurant" it will know that a page that only contains "taj mahal" but never actually mentions "restaurant" is actually about the restaurant, based on the rest of the page's context.
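The distinction drawn here, matching on what a page is about rather than on the words it contains, can be sketched as a lookup keyed by entity instead of by keyword. Everything in this sketch is assumed for illustration: the entity IDs, the example URLs, and the premise that some earlier step has already labelled each page.

```python
from collections import defaultdict

# Assumed output of a page-level annotator (hypothetical IDs and URLs).
PAGE_ENTITIES = {
    "example.com/curry-house": "taj_mahal.restaurant",  # text never says "restaurant"
    "example.com/agra-guide": "taj_mahal.monument",
    "example.com/blues-tour": "taj_mahal.musician",
}

# Hypothetical mapping from the words of a query to an entity ID.
QUERY_ENTITIES = {
    frozenset({"taj", "mahal", "restaurant"}): "taj_mahal.restaurant",
    frozenset({"taj", "mahal", "monument"}): "taj_mahal.monument",
    frozenset({"taj", "mahal", "musician"}): "taj_mahal.musician",
}

def build_entity_index(page_entities):
    """Group page URLs under the entity each page was labelled with."""
    index = defaultdict(list)
    for url, entity in page_entities.items():
        index[entity].append(url)
    return index

def search(query, index):
    """Resolve the query to an entity, then return pages about that entity."""
    entity = QUERY_ENTITIES.get(frozenset(query.lower().split()))
    return index.get(entity, [])

index = build_entity_index(PAGE_ENTITIES)
print(search("Taj Mahal restaurant", index))
```

The curry-house page is returned even though its text never contains the word "restaurant", because the match happens on the entity label rather than on the keyword.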

      • by Tim12s (209786)

        Yes, and the concept could be in any language and it should be language insensitive.

      • Exactly, if I type "Taj Mahal Restaurant" I'm not particularly interested in a blogger telling me that he saw Taj Mahal last night at a restaurant.

        Sometimes I feel that I need to make my search terms too specific but when I really want to find specific information I like the new way a whole lot more.

  • I could've sworn that Google rolled something like this out, and then canned it (or at least removed it from the UI). If you searched for something ambiguous, like java, Google used to cluster the results, with one cluster being about Java-the-language, another about Java-the-island, and a third about java-the-coffee. So they clearly had some sort of ontology of terms and way of contextually attributing them.

    • Author here. Yes, from 2009-2011 or so they had a Google Labs project called Google Squared that presented results in tabular form (http://en.wikipedia.org/wiki/Google_Squared). I asked Shashi Thakur about this and he said they killed it because it wasn't deep enough to be useful. He told me there were actually pockets of structured, graph-like data popping up all over Google (in verticals like travel search and product search) but every team was doing it differently and it became clear the "the pockets wer
      • by Trepidity (597)

        That's pretty interesting too; somehow I missed that one, since I guess I wasn't paying attention to Google Labs. The one I was thinking of was mid-2000s, though. There's a blog post from 2006 about it here [dancohen.org]. I believe it was on by default, but did anything on a minority of terms, and at some point was removed again.

  • When I first learned to program the Taj Mahal was a public urinal in Wellington, New Zealand. That building has since been turned into a Welsh bar (sounds interesting, I've never been though, never went back at all) but I do note that one brave couple have opened the Taj Mahal restaurant not so far away.

    But jokes aside, I've often wondered if first the telephone system, and then later the Internet once it was opened to the public and grew like a mad thing, were not the first artificial life forms. A few decad
    • Some Chilean biologist/philosophers called that idea autopoiesis [wikipedia.org].
      I can't find the book anymore in my library :-( but the example I remember best was of the humble earth worm.

      Earth worms eat the soil they live in and digest the humus in it (organic detritus). Soil with sharp bits of sand is of course not nice for the beast's stomach. But as everybody with a garden knows, if you have worms they make the soil looser and better aerated. They can also transport humus throughout the soil (from in front of the
  • Yeah, yeah, yeah. Correlation is not causation and most of it is mere coincidence and the entire statistical wizardry displayed by google is merely calculating correlation coefficients. But don't diss it or dismiss it. It gets to be amazingly powerful.

    Once I was looking for the lyrics of a song in the language Malayalam (BTW Malayalam is the longest one-word palindrome in English). I don't speak Malayalam. I typed in Google, my best impression of the opening line of that song. Transliterated into ASCII Englis
