Businesses The Internet

"Understanding" Search Engine Enters Public Beta 192

religious freak sends word of the public beta of Powerset, a closely watched San Francisco startup that promises an "understanding engine" to revolutionize Web search. An article in SearchEngineLand points out that Powerset is reaching higher than mere "natural language." TechCrunch has more details and analysis. For the beta, Powerset makes available all of Wikipedia to search, not all the Web. It's said that their understanding engine required a month to grok Wikipedia's 2.5M articles. The Web is currently at least 8,000 times as large.
This discussion has been archived. No new comments can be posted.
  • by Anonymous Coward on Monday May 12, 2008 @10:54PM (#23387180)
    "No results found for naked pictures of Natalie Portman. How does that make you feel?"
  • I'm Unimpressed (Score:5, Interesting)

    by eldavojohn ( 898314 ) * <eldavojohn@g[ ]l.com ['mai' in gap]> on Monday May 12, 2008 @10:55PM (#23387190) Journal
    Ok, so I like these new search engine ideas but I am grossly underwhelmed here. I tried the input:

    Who is David Bowie?
    Which it handled quite nicely. Biography, additional links and all that Wikipedia jazz.

    But come on, that's a simple question. Let's talk stuff I get into arguments over with my coworkers:

    Who played the villain in the first Die Hard?
    Which at least put Alan Rickman at #8 [powerset.com]. But let's try mutating that to make it harder but still understood by you and me:

    Who played the bad guy in the first Die Hard?
    Which resulted in very little but drivel [powerset.com] with no mention of the great Alan Rickman whatsoever ... although it did put Billie Jean King and Madonna in there for some hilarious reason.

    So maybe it can't understand 'bad guy.' Well onto another question:

    Who was the organist for The Beatles on Abbey Road?
    Which resulted in at least the first 20 having no mention of the great & oft forgotten Billy Preston [powerset.com].

    So you want to know what the kicker is? I put those same inputs into Google and found the name in the first or second result. Granted PowerSet doesn't do the whole web, I'm pretty sure that if it did, it wouldn't have the pretty results that it gave when I did what one of the articles told me to--ask it when earthquakes hit Tokyo. Just imagine the dates it would come up with if it hit a site with an html table of any seismic activity whatsoever in Tokyo!

    I think it's a novel idea to mine Wikipedia for a search engine so long as it isn't just plain old token matching like PowerSet seems to be up to. Be inventive, try a natural language parser written in Prolog that digests all of Wikipedia into a huge network/ontology of concepts ... no matter how flawed it might be.

    I find them talking about this in the articles:

    Powerset is different. It says that its technology reads and comprehends each word on a page. It looks at each sentence. It understand the words in each sentence and how they related to each other. It works out what that sentence really means, all the facts that are being presented. This means it knows what any page is really about.
    Yet, I'm not impressed. You can try to personify your software and convince me that Baby Alive really defecates like a human being all over so it feels like I have a real baby. But I know it's just software. You don't have to dumb it down if you're going to blog about it. What is this? A pattern matching implementation? A depth first search tree parsing implementation? An ontology builder? Could you at least drop one of the buzzwords of the natural language parsing field for me here?

    So does this story actually have more than a startup looking for a sugar daddy to buy it out?
    • Re:I'm Unimpressed (Score:5, Informative)

      by bluefoxlucid ( 723572 ) on Monday May 12, 2008 @11:05PM (#23387258) Homepage Journal
      Use site:en.wikipedia.org to have Google ask all of Wikipedia (English)
      • Re: (Score:3, Insightful)

        by iMaple ( 769378 ) *
        And the results are not too different. In the earthquakes question(when did earthquakes hit tokyo), where powerset seems to work like magic, google shows the same answer on the first page (though as the sixth link) ("Tokyo was hit by powerful earthquakes in 1703, 1782, 1812, 1855 and 1923").

        So even for the tailor made, best-case examples, google seems to be quite on par.
    • Re:I'm Unimpressed (Score:4, Interesting)

      by WaltBusterkeys ( 1156557 ) * on Monday May 12, 2008 @11:10PM (#23387308)

      Yet, I'm not impressed.
      Powerset is not an instant solution, it's a step in the right direction. Early Google wasn't perfect, but it got a lot better over time as the Pagerank algorithm was refined. Hopefully Powerset will show similar improvement over time.

      Heck, if Powerset is just watching what links people click on more often (Google does) then even that can help provide a training set for its algorithm. Using that kind of training set would make it vastly easier to figure out whether a change in the algorithm would be an improvement or not. That's priceless data and I hope they'll use it wisely.
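
      One hedged way to picture that: a minimal Python sketch that scores two versions of a ranking algorithm against logged clicks using mean reciprocal rank. The query strings, document IDs, and the choice of metric are all assumptions made up for illustration; nothing here reflects what Powerset or Google actually measure.

```python
def mean_reciprocal_rank(rankings, clicks):
    """Average of 1/position of the clicked result, over all logged queries.

    rankings: dict mapping a query to the ordered list of results a ranker returns.
    clicks:   dict mapping a query to the result users actually clicked.
    A query whose clicked result is missing from the ranking contributes 0.
    """
    if not clicks:
        return 0.0
    total = 0.0
    for query, clicked in clicks.items():
        results = rankings.get(query, [])
        if clicked in results:
            total += 1.0 / (results.index(clicked) + 1)
    return total / len(clicks)


# Hypothetical click log and two hypothetical ranker outputs.
click_log = {"q1": "doc_a", "q2": "doc_b"}
old_ranker = {"q1": ["doc_x", "doc_a"], "q2": ["doc_b"]}
new_ranker = {"q1": ["doc_a", "doc_x"], "q2": ["doc_y", "doc_z"]}

old_score = mean_reciprocal_rank(old_ranker, click_log)
new_score = mean_reciprocal_rank(new_ranker, click_log)
print(old_score, new_score)
```

      With a log like this, a change to the algorithm can be judged by whether the score goes up or down, rather than by eyeballing a handful of queries.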

      But, really, just remember that this is the first in a new breed of search engines. It won't be the last, by any means:

      -Search 0.9 was using the meta and description tags on a page to index (see Altavista). It broke when spammers figured out the algorithms.

      -Search 1.0 was using the text of inbound links to index (see Google). It doesn't know what the text means, it just knows that it has a bunch of keywords. It's breaking as people start to game their Google search results [reputationdefender.com].

      -Search 2.0 will try to find meaning in the web and understand what a page is really saying (see Powerset).

      I don't know yet what Search 3.0 will be, but we're still a long way from getting Search 2.0 to work right. But we're still making progress. Just because Powerset isn't perfect doesn't mean we should give up on the whole venture.
      • by spoco2 ( 322835 ) on Monday May 12, 2008 @11:47PM (#23387488)
        I asked 'Where do babies come from' and it just gave me back a bunch of articles with that string somewhere in their text.

        Pathetic, and you'd hope it's got a long way to go really because at the moment it does NOTHING of merit that I can see.
      • by Bill, Shooter of Bul ( 629286 ) on Tuesday May 13, 2008 @12:06AM (#23387596) Journal
        Which is why everyone started using it. It wasn't perfect, just better than anything else. Powerset isn't better than lycos.
        • Re: (Score:3, Informative)

          by Kamokazi ( 1080091 )
          Amen....I remember researching something usually meant using several different search engines (Yahoo was more concise but lacking, Altavista had EVERYTHING but took a while to find the good results, etc), and if you wanted something useful, you better know how to use your +,-, and ""s.

          Then Google comes around. You search for something and you find a good result (or three) on the first page, which was rare on Yahoo etc. unless you were looking for something really basic.
      • I don't know yet what Search 3.0 will be, but we're still a long way from getting Search 2.0 to work right. But we're still making progress.

        Actually, we aren't making progress -- *at all*. What these guys are trying to do is a subset of artificial intelligence. A subject people have been banging their heads against since the 1940s, and we've made *zero* progress since then. We simply don't know how humans process information. We don't even have reasonable theories. We're at the equivalent of the "four elements make up the world" version of physics.

        AI researchers always get defensive when I say this, but it's simply true. All we have are better brute-force algorithms that sort-of simulate some of the things that humans do (i.e., voice recognition, character recognition, and other yawner tricks). There is no science of AI. Any sort of human-level understanding of information is far, far away in the future.

        • Re:I'm Unimpressed (Score:5, Interesting)

          by WaltBusterkeys ( 1156557 ) * on Tuesday May 13, 2008 @12:24AM (#23387692)
          Wait, you're saying that the MIT summer vision project [mit.edu] wasn't as easy as people thought?

          (Background: In 1966, some MIT computer science faculty thought AI was so easy that computer vision could be solved in one summer worth of work; it probably took 35 years to reach the milestones identified in the research abstract).
        • Re: (Score:3, Insightful)

          by Rakishi ( 759894 )
          That depends on what you mean by AI, we have a lot of algorithms that do interesting things. Doing something exactly like a human does them is not exactly . I can for example code a program that will beat almost any human in Othello or Checkers while using up a fraction of the computing power.

          Human brains have the computing power of a modern supercomputer and possibly a lot more of it, optimized for some specific applications such as data parsing/pattern matching. AI has had to for the past 40 years create
          • That depends on what you mean by AI, we have a lot of algorithms that do interesting things.

            What I mean by AI is Artificial *Intelligence*.

            I can for example code a program that will beat almost any human in Othello or Checkers while using up a fraction of the computing power.

            It's always been wrong to consider a game a demonstration of intelligence. It's the ability to *learn* any game that is a sign of intelligence. When they make a machine that I can feed in the rules, and by simple practice it c

          • I like the quote from Edsger Dijkstra which always puts it in perspective:

            The question of whether Machines Can Think ... is about as relevant as the question of whether Submarines Can Swim.
        • Re: (Score:3, Interesting)

          by fsterman ( 519061 )
          Uhh, 1940 and no progress? Are you nuts? Cognitive scientists didn't theorize basic semantic networks until 1966, let alone artificial neurons. And no, that isn't just more brute forcing, yeah it is a *lot* more computation, but it's a completely different angle of attack than parsing sentence structure and swapping out words.
          • Uhh, 1940 and no progress? Are you nuts? Cognitive scientists didn't theorize basic semantic networks until 1966, let alone artificial neurons. And no, that isn't just more brute forcing, yeah it is a *lot* more computation, but it's a completely different angle of attack than parsing sentence structure and swapping out words.

            Yes, we've gone from banging stones together to theorizing that the world consists of four elements. It's progress in the sense that we have some ideas that we know are totally wro

        • Re: (Score:2, Interesting)

          by Anonymous Coward
          When I was but a mere lad, just starting on Computer Science, I really believed in the "Hard AI" position, viz, all we need is enough computing power and sufficiently clever algorithms, and we'd have AI. Ah, the arrogance of youth (or my youth, anyway). Since then, I've come to the conclusion that the "Hard AI" position is a total non-starter. As the parent poster says, thus far we have got nowhere in AI (AI research may have led to useful stuff, but that stuff isn't really AI!). Personally, I am impressed
      • Yeah but what makes you think Google isn't doing the exact same thing?

        Other people have shown that Google already handles natural language questions exceptionally well.
        • This is always what I think when a new Google Killer comes along: why can't Google, with its vast team of PhDs, replicate the competition?
      • by nguy ( 1207026 )
        Powerset is not an instant solution, it's a step in the right direction. Early Google wasn't perfect,

        No, it wasn't. But it was sufficiently better/easier to use than the alternatives to make using it worthwhile.

        I don't see that yet with Powerset.
    • by MillionthMonkey ( 240664 ) on Monday May 12, 2008 @11:12PM (#23387312)
      I asked it "who won the election in 2004?" and it understood the question, in a way:

      The current mayor is Jardir Silva Vidal who won the election in 2004 against Reino Martins de Oliveira
    • Re: (Score:3, Informative)

      by gadzook33 ( 740455 )
      I agree. I tried something that would betray understanding, such as "Why did Germany attack Russia?". Same result, barely any mention of WWII. All top google results, however, were relevant.
      • by onion2k ( 203094 )
        That doesn't betray understanding, that betrays a wider knowledge of history. If Powerset is merely understanding the words in the search question it should figure out that:

        "Why did" - Looking for a reason for something
        "Germany" - first comparator
        "attack" - that's the thing
        "Russia" - second comparator

        There's no way to know you meant WWII from the question ... you need a large data set before you can start to see which topics are most important given the terms. Powerset doesn't actually have a large data set
    • by ScentCone ( 795499 ) on Monday May 12, 2008 @11:29PM (#23387408)
      Your tests are interesting, but you're not really parsing the responses in the right context. They're problematic. Keep in mind this understanding engine understands the world in a way that was hatched out in San Francisco.

      Who is David Bowie? I trust that it came back with, "aka Ziggy Stardust, normal family guy"

      Who played the villain in the first Die Hard? Well, obviously, the villain is "capitalism."

      Billie Jean King and Madonna ... like I said, it's San Francisco

      Who was the organist for The Beatles on Abbey Road?

      You had it at "organ," and it got distracted. What they need is some dev guys from Toledo to collaborate, and provide a little cognitive counterweight to the understanding engine. OK, maybe not Toledo. Maybe Atlanta.
      • ...understands the world in a way that was hatched out in San Francisco
        Is that why Powerset failed to grasp the question "Why won't my wife go to a football game at Lambeau Field in December?"
        whoops, something went wrong!

        Google was smart enough (#9) to provide this wonderful woman's answer [ivillage.com].
    • Re:I'm Unimpressed (Score:4, Interesting)

      by martin-boundary ( 547041 ) on Monday May 12, 2008 @11:31PM (#23387418)
      Since you didn't give the facts on your Google search, here they are, as of this comment's posting time:

      who is david bowie?

      en.wikipedia.org/wiki/David_Bowie
      en.wikipedia.org/wiki/David_Bowie_(album)
      www.bowiewonderworld.com/

      Result in the first three. Well done.

      Who played the villain in the first Die Hard?

      www.imdb.com/title/tt0095016/
      www.emanuellevy.com/article.php?articleID=6136
      wrestlingclassics.com/.ubb/ ultimatebb.php?ubb=get_topic;f=1;t=085316

      Result in the preview of the second only. Why they include a wrestling site though is beyond me.

      Who played the bad guy in the first Die Hard?

      www.imdb.com/title/tt0095016/
      www.imdb.com/title/tt0337978/usercomments
      www.empiremovies.com/movie/live-free-or-die-hard-/13109/review/01

      A lot of drivel, no name in the previews.

      Who was the organist for The Beatles on Abbey Road?

      paulmcgarry.com/cdcatalogue/details/5808.html
      www.beatles.ws/1969.htm
      www.sonicstate.com/news/shownews.cfm?newsid=4860

      First two, well done.

      It's interesting that Google and PowerSet are completely equivalent when your test data is available in Wikipedia. Now of course PowerSet is only searching Wikipedia, while Google has 8000(?) times more data, so it's not clear what is being tested.

      But what's strange is that Wikipedia and IMDB are returned so often. With all the hype about their huge index, I'd expect Wikipedia or IMDB to be rarely the best source in most cases, since more authoritative data is bound to be available to Google, kind of like the Abbey road example.

      • Re: (Score:2, Informative)

        It's because of Pagerank: both Wikipedia and IMDB are linked to from many thousands of sites, and as such they have an insanely huge pagerank, virtually guaranteeing their spot at the top of any listing. So although you may not agree with it, they are at the top because many other people do use them as a reference.
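
        For anyone who wants to see the effect, here is a minimal Python sketch of the simplified, textbook PageRank iteration (not Google's production system). The five-page toy web, the 0.85 damping factor, and the iteration count are assumptions picked for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified power-iteration PageRank over a dict of page -> outbound links."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outbound links spreads its rank over all pages.
                share = damping * rank[page] / n
                for t in pages:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Hypothetical toy web: three small sites that all link to the two big ones.
toy_web = {
    "blog_a": ["wikipedia", "imdb"],
    "blog_b": ["wikipedia"],
    "fan_site": ["wikipedia", "imdb"],
    "wikipedia": ["imdb"],
    "imdb": ["wikipedia"],
}
ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{page}: {score:.3f}")
```

        The small sites get no inbound links, so they keep only the baseline share, while the two heavily linked pages soak up nearly all of the rank.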
    • Re: (Score:3, Interesting)

      And if Powerset really did parse and "comprehend" the content of each page (which it doesn't, judging by your trial searches), how would it deal with the significant number of error-ridden and unintelligible articles in Wikipedia?

      Not to mention non-English Wikipedias, which contain a good deal of information not available in the English one.

    • I searched for "why do people surf the internet?" and the second result was about ocean surfing, not Internet surfing. Google provided better results.
    • This might be more useful on Semantic Web pages. I mean, the hardest part is to figure out what the question is trying to ask for. Then it's a simple lookup of the web (or wikipedia) to pull up that item. The problem they are talking about (and don't appear to solve) is to translate your question into the best way to ask for what you're looking for. The problem is, there's no structure to a standard one line search. Maybe they could have you enter some more information as helpful hints. Say you're loo
    • by wass ( 72082 )
      Wished I could mod you up strictly for your mention of the late Billy Preston, one of the greatest, and IMHO underestimated, keyboardists. Here's a cool clip [youtube.com] of him in a live performance playing one of the funkiest songs ever : Outa Space. Sound quality not so great, but damn, that's some pure raw energetic soul funk straight from the source. (And just to tie this way off-topic comment back to something remotely related to 'news for nerds', you might recognize this tune from the Intel bunny-suit cleanr
    • Re:I'm Unimpressed (Score:5, Informative)

      by Threnody ( 35193 ) on Tuesday May 13, 2008 @02:04AM (#23388178) Homepage
      Thanks for testing us out with some real queries -- it's the best way to get the Powerset experience. But, if you only ask NL questions then you don't get to see all of Powerset's features.

      Powerset is not token matching. In fact, we read every sentence from every page in Wikipedia that we index. For examples of how we understand syntax, check out queries like "who did texaco acquire" vs. "who acquired texaco". Note that Powerset understands the difference between being acquired by and acquiring, that "buying" is equivalent to "acquiring", and that we are often able to highlight the actual answer to your question. Traditional search engines can do none of these things. Powerset is trying to match the meaning of your query to the meaning of a sentence in Wikipedia.
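
      To make the acquiring vs. being-acquired distinction concrete, here is a toy Python sketch. It is not Powerset's implementation, which isn't described in this thread. It assumes facts are stored as (subject, relation, object) triples and that verbs are normalized through a small hand-written synonym table; the `Fact`, `SYNONYMS`, and `ask` names are invented for the example.

```python
from typing import List, NamedTuple, Optional


class Fact(NamedTuple):
    subject: str
    relation: str
    obj: str


# Hypothetical synonym table: several surface verbs map to one relation.
SYNONYMS = {
    "acquire": "acquire",
    "acquired": "acquire",
    "buy": "acquire",
    "bought": "acquire",
    "purchased": "acquire",
}

# Illustrative hand-entered triples, not extracted from any real index.
FACTS = [
    Fact("chevron", "acquire", "texaco"),
    Fact("texaco", "acquire", "getty oil"),
]


def ask(*, verb: str, subject: Optional[str] = None, obj: Optional[str] = None) -> List[str]:
    """Fill in whichever role (subject or object) the question leaves open."""
    relation = SYNONYMS.get(verb.lower(), verb.lower())
    answers = []
    for fact in FACTS:
        if fact.relation != relation:
            continue
        if subject is None and obj is not None and fact.obj == obj.lower():
            answers.append(fact.subject)   # "who acquired X?"
        elif obj is None and subject is not None and fact.subject == subject.lower():
            answers.append(fact.obj)       # "who did X acquire?"
    return answers


print(ask(verb="acquired", obj="Texaco"))     # who acquired texaco
print(ask(verb="bought", subject="Texaco"))   # who did texaco buy
```

      A plain bag-of-words match would treat both questions as the same set of tokens; keeping the roles separate is what lets the two queries return different answers.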

      However, Powerset is very aware that: 1) Users shouldn't be expected to use natural language and 2) We only search Wikipedia and 3) Our algorithms aren't perfect yet. Powerset's release isn't intended to replace your regular keyword search engine. But, we do hope that you come back to Powerset when you have a question that might be answered in Wikipedia.

      So, try some topical queries in Powerset, like "kurt godel." In the Factz section, Powerset knows that Kurt Godel proved theorems. If you click on "theorems," you'll see all the sentences in Wikipedia from which we derived that fact (be sure to click on "more"). Note that none of these Factz come from the Kurt Godel page. Powerset's ability to aggregate Factz from across Wikipedia is unique to our technology.

      Now try searching for the Presidency of Bill Clinton and click through to the enhanced Wikipedia page (http://www.powerset.com/explore/semhtml/Presidency_of_Bill_Clinton?query=presidency+of+bill+clinton). Note that we also have Factz in the article outline, which helps to summarize long articles. Check out the second term during the Lewinsky affair: the Factz are an amazingly accurate description of the situation.

      Sorry to be a bit lengthy, but I wanted to make it clear that Powerset isn't just about asking questions. We've got a video that identifies all of the features: http://vimeo.com/994819

      {mark} powerset product manager
      • Re: (Score:2, Interesting)

        by Kugrian ( 886993 )
        First of all, I congratulate you for making attempts to improve the world's searching (and also on the look of your website - I love that blue!). How is this different from ask.com [ask.com] though? (Powerset's search didn't give me an answer to that.)
      • Just a quick observation: you've written 'Factz' where you meant to write 'Facts'. No need to thank me.
      • by mgiuca ( 1040724 )
        I got a good result for this one:
        How many points do you get for a goal in "australian rules football" [powerset.com].

        The second result had a snippet of text clearly highlighted "six points given for a goal and one point for a behind". (And the first had a nice picture so I can't complain).

        I do have a complaint though: Pleaze pleaze pleaze ztop zpelling thingz with a 'z'!!!
    • how many genomes were sequenced in 2007, not bad

      what is translating dna into mrna, excellent, first hit

      who is creator of lost, first hit

      who discovered penicillin, second hit (google: who discovered penicillin site:en.wikipedia.org - first hit)

      how many different amino acids are there, fourth hit (google: how many different amino acids are there site:en.wikipedia.org - first hit)

      who is the most famous software developer from finland: not even the first page (google: same, poor Linus)

      who is the creator of file
    • Please tell me that out of you and your co-workers, you're the one who thinks it's Alan Rickman.
    • He's right [google.com].

      However, I still like the site, I think their interface is pretty cool.
  • by Anonymous Coward on Monday May 12, 2008 @11:00PM (#23387218)
    Since Powerset can only search Wikipedia, the logical next step is to put the entire web on Wikipedia. Who's up for the job?
  • by nog_lorp ( 896553 ) on Monday May 12, 2008 @11:00PM (#23387222)
    Any day now, Wikipedia will surpass The Web's growth rate, and set a course for the day when Wikipedia will be BIGGER THAN THE WEB.
  • by KGIII ( 973947 ) <uninvolved@outlook.com> on Monday May 12, 2008 @11:03PM (#23387236) Journal
    If I hear the word "grok" one more time I'm gonna have to kill someone...
  • by Sanity ( 1431 ) on Monday May 12, 2008 @11:04PM (#23387244) Homepage Journal
    True Knowledge [trueknowledge.com] actually interprets your question using Natural Language Processing, and then looks through a massive database of user-contributed facts, combining them using sophisticated inference rules, to give you the answer you need. Even the inference rules are user-editable.
  • 2 out of 10 (Score:5, Informative)

    by KNicolson ( 147698 ) on Monday May 12, 2008 @11:24PM (#23387384) Homepage
    I tried just "Osaka", where I am right now.

    First match was an obscure album, then a few "factz" that made no sense.

    Let's try again, "What is the largest city in Japan?"

    Tokyo doesn't feature at all on the first page! It fares just as badly with other countries.

    It now seems to be slashdotted, so I better quit now.

  • by rindeee ( 530084 ) on Monday May 12, 2008 @11:36PM (#23387434)
    They're faster, more efficient and more accurate. Yes, they require learning yet there's a valid reason and a payoff to doing so. Do we really want to dumb things down any further? If you can't figure out Google, perhaps you should get off the Net.
    • Re: (Score:2, Interesting)

      by erikina ( 1112587 )
      They're very different. It's not expected that this natural language parsing will replace SQL (anytime in the foreseeable future).

      Every so often, I find myself wanting to use natural language in Google. Like today I wanted to find out about the symptoms of a codeine histamine reaction. Sure, I could search for 'codeine', read about it and follow links (on, no doubt, Wikipedia) until I find what I want - but being able to search with "What are the symptoms of codeine histamine reactions?" is quite pow
      • by Firehed ( 942385 )
        In my experience, Google fares quite well with natural language queries. Granted, it seems to happen by knocking out those words that every page has so your effective query becomes "symptoms codeine histamine reactions" which seems to return the same result set. I don't know enough about the subject to say whether it's what you want, but you're getting what you've paid for here.
    • by EmbeddedJanitor ( 597831 ) on Tuesday May 13, 2008 @12:00AM (#23387560)
      There is a fallacy that putting a natural language on something will make it easy. There are many specialised languages that people use every day.

      1 + 1 = 2 is a special notation/language that is both more concise and easier than writing "add one and one to make two". So is a music score, which is far easier than reading "make a high note for a bit then wait a bit and make a low note". Same with C, C++, SQL or Python: the hard bit in programming is algorithm design, not understanding the actual language itself.

      Is natural language really a barrier to entry in using Google? I doubt it. My untechy wife and her friends find everything they need. Plugging natural language into Google gives reasonable results most of the time.

    • Re: (Score:2, Interesting)

      I totally agree! What is the benefit of asking a computer questions using natural language? It is just going to be making an educated guess as to what you really mean. I am thinking of the stupid little dog in MS Office or the computer on the ship the Heart of Gold in Hitchhiker's Guide. "Perhaps you would like some tea." "Share and enjoy!" Those aren't the type of conversations we want to have with computers. That's what people are for. But really I don't think natural language works with people. I think we
  • Yeah right (Score:5, Insightful)

    What a marketing pile-of-poop. All it does is pull out phrases from Wikipedia; there is no attempt to understand the information at all. When I can type in a yes/no question ("Did they have looms in the 1400s?"), I'll be impressed. When it can make a calculation ("How old was Columbus when the first colony was founded?"), I'll be impressed. When it can make comparisons ("when did the earth's population match the current population of the united states?"), I'll be impressed.

    In other words, when it even attempts to answer a question that isn't already in Wikipedia as a phrase, I'll be impressed.

  • Needs some work. (Score:3, Interesting)

    by MrCrassic ( 994046 ) <deprecated@@@ema...il> on Tuesday May 13, 2008 @12:17AM (#23387644) Journal

    So I tried to search for the person who quoted, "What doesn't kill you only makes you stronger.". The search text was "Who said, "What doesn't kill you makes you stronger?"

    Google returned the closest match, who was Friedrich Nietzsche, with several websites pointing to him. However, Powerset returned only instances of people who randomly said that quote. Google returned what I was looking for, while Powerset returned instances of the phrase (including one reference to Nietzsche).

    I can't really say which one is better. Google has the entire web to its advantage, while Powerset is just growing. It seems that the search engine has a lot of potential to grow, which is great as Google and company could use another competitor in the mix.

  • by Animats ( 122034 ) on Tuesday May 13, 2008 @12:19AM (#23387650) Homepage

    I've been trying various queries, and Google is doing better than Powerset even when I type in some actual question, like "How many Japanese died in WWII?".

    Question: "What is the planet closest to the sun?". First answer from Powerset: "Pluto".

    I think I see how this works. It takes the question and breaks it at noise words, ("closed class words" in linguistic terminology) constructing a query with both words and phrases. So "What is the planet closest to the sun" becomes "planet closest" sun. In fact, if you rewrite a natural language question in that form and use Google, it does better on question-answering than Powerset does.
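
    A minimal Python sketch of the splitting guessed at above, for anyone who wants to try it against a keyword engine. This is only a reading of the guess, not Powerset's actual pipeline; the stop-word list is a small hand-picked assumption and `to_keyword_query` is a made-up name.

```python
import re

# Hypothetical, hand-picked list of closed-class ("noise") words.
STOP_WORDS = {
    "a", "an", "the", "is", "are", "was", "were", "do", "does", "did",
    "what", "who", "when", "where", "why", "how", "to", "of", "in", "on", "for",
}


def to_keyword_query(question: str) -> str:
    """Break a question at noise words; quote multi-word runs as phrases."""
    tokens = re.findall(r"[a-z0-9']+", question.lower())
    runs, current = [], []
    for tok in tokens:
        if tok in STOP_WORDS:
            # A noise word ends the current run of content words.
            if current:
                runs.append(current)
                current = []
        else:
            current.append(tok)
    if current:
        runs.append(current)
    return " ".join(
        '"' + " ".join(run) + '"' if len(run) > 1 else run[0]
        for run in runs
    )


print(to_keyword_query("What is the planet closest to the sun?"))
```

    The printed string can be pasted into any keyword search box to compare results with the natural-language version of the same question.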

    Remember Ask Jeeves? It worked like that. No technical breakthrough here, move along.

    • Re: (Score:2, Informative)

      by Threnody ( 35193 )
      Note that Powerset gets an exact semantic match in the second result. And, Powerset reads every sentence from every (English) page in Wikipedia.

      {mark} powerset product manager
  • by Ihmhi ( 1206036 ) <i_have_mental_health_issues@yahoo.com> on Tuesday May 13, 2008 @12:34AM (#23387750)

    ...it will take Google to buy out the company for an obscene amount and incorporate anything even slightly better than PageRank into their system.

  • by redtuxrising ( 1258534 ) on Tuesday May 13, 2008 @12:36AM (#23387762)
    Anybody got Google cache for this new search engine?
  • by Anonymous Coward on Tuesday May 13, 2008 @12:58AM (#23387846)

    Search and information retrieval is art and science. I work in the field and let me tell you that if I had a cent for every "make it work like Google" statement, I would retire somewhere in Malibu. Users, in my case they are not end users but integrators, always want to put responsibility on something else but themselves. Until we get people who can actually say "yes, we are responsible for this," we won't get too far with any search engine no matter how complex and cool it is.

    People are constantly asking questions about why it takes some time to insert a record into an engine that has 50 million documents and why a query *1*2*3* does not bring back any meaningful results (Google treats it like an arithmetic expression and gives you a '6' while many users expect '*' to be a wildcard). Then we have people who are not able to understand a precise query language that has a grammar and a set of rules you can't really fuck up. Now you give them an engine that can understand natural language and everybody in R&D and QA will soon go ape shit from all of the questions like, "I do know not to speak Inglish and engine is working but not corectly. Fix?" I am dead serious about this. Give people something genius and watch a handful of fools cause heart attacks across the search engine team.

    If you want to do something for you and your end users, learn how to ask correct questions in order to get correct answers. In the 21st century skills like keyboarding and being able to use a search engine are almost essential to one's survival. While I encourage all academic research possible in the field of information retrieval, I highly suggest people with extra money to put their ideas toward usability. Make things simple, make things precise and let users figure out the rest. Once we get to the point where everybody can make a semi-decent query, we'll move to natural language processing.

    • and why a query *1*2*3* does not bring back any meaningful results (Google treats it like an arithmetic expression and gives you a '6' while many users expect '*' to be a wildcard).

      No? [google.co.uk] Google does treat it as a wildcard expression, though it's not much use. If you mean 1*2*3 [google.co.uk], then there's a link on the page for "Search for documents containing the terms 1*2*3 [google.co.uk]." for when you don't want calculator interfering.

  • by hereschenes ( 813329 ) on Tuesday May 13, 2008 @01:58AM (#23388132)
    Who shot first? [powerset.com]
  • Your screen goes black; "Don't Panic"
  • Thoughtpuckey (Score:5, Insightful)

    by DynaSoar ( 714234 ) on Tuesday May 13, 2008 @02:22AM (#23388284) Journal
    The variance in quality of search results is noted elsewhere. I'm more interested in the fallacy of the claim of "understanding". That, as well as its synonym "comprehension", requires metacognition, that is, knowing that you know. It is the basis of self-awareness. This program doesn't even pretend to give evidence of this; it simply returns search results. Pretending to be self-aware was accomplished by CYC when it claimed to grasp the fact that it was a computer program. For anyone interested in seeing the arguments about understanding and self-awareness, see Searle's "Chinese Room" http://en.wikipedia.org/wiki/Chinese_room [wikipedia.org]. As far as I can see, only the hype from the company, including the restatements of same in the referenced articles, makes any claims as to "understanding". If there were any evidence of that beyond the hype, I have no doubt those in the field of consciousness studies would tear it apart, if they even bothered to waste their attention on it. If in being bashed it then produced a statement equivalent to "I can feel it, Dave" without being programmed to respond in that way, then I'll give it a look see. Until then it's simply a semantic parser (something already done) attached to a search engine.
  • It takes Powerset less than 3 days to index all of the English pages in Wikipedia. And we're getting faster and faster.

    {mark} powerset product manager
  • "It says that its technology reads and comprehends each word on a page."

    First let's get this straight - It doesn't comprehend anything. That's wishful thinking and marketing. It looks at verbs or certain keywords, flags them as important, references through synonyms, then proceeds to lump them under one category.

    It's a smart way to do things, but it's not comprehension. Comprehension would imply artificial intelligence whereas this system follows a set pattern of rules and doesn't 'think' on its own.

    I
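The "keywords plus synonyms lumped under one category" description above can be sketched in a few lines. This is only an illustration of that description, not Powerset's actual method. The `SYNONYMS` table and `normalize` function are hypothetical, and the two test queries are borrowed from the "villain" / "bad guy" experiment earlier in the thread:

```python
import re

# Hypothetical synonym table: surface phrase -> canonical category.
SYNONYMS = {
    "villain": "antagonist",
    "bad guy": "antagonist",
    "played": "portray",
    "portrayed": "portray",
}

def normalize(query):
    """Lowercase the query, map known phrases to their category, return the term set."""
    text = query.lower().rstrip("?")
    # Longest phrases first, so multi-word entries are replaced as a unit.
    for phrase in sorted(SYNONYMS, key=len, reverse=True):
        pattern = r"\b" + re.escape(phrase) + r"\b"
        text = re.sub(pattern, SYNONYMS[phrase], text)
    return set(text.split())

a = normalize("Who played the villain in the first Die Hard?")
b = normalize("Who played the bad guy in the first Die Hard?")
print(a == b)  # True
```

Both questions collapse to the same set of terms. That is rule-following over a lookup table, which is the commenter's point: it can look like comprehension without being any.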

  • Impressive (Score:4, Funny)

    by mrrudge ( 1120279 ) on Tuesday May 13, 2008 @05:44AM (#23389148) Homepage
    Q: What the hell is a 'factz'
    A: Did you mean 'What the hell is a fact?'

    Quite
  • Powerset is reaching higher than for mere "natural language."


    Supernatural language!
  • Getz thez factz (Score:2, Insightful)

    by Anonymous Coward
    Powersetz havez thez greatestz tipz.

    How seriously are we supposed to take a search engine that manages to misspell facts with a 'z' on its front page?

    Why don't they go the whole hog and replace the explore button with "OMFGZ SEARCHEZ".
  • I mean really, using Wikipedia as your data set? Its signal-to-noise ratio is so high that it'll make all their search results look informative. Let's see how it does on the open internet, full of spammers and google-bombs.
  • It's said that their understanding engine required a month to grok Wikipedia's 2.5M articles. The Web is currently at least 8,000 times as large.
    Expect the alpha release in 2675.
  • ... every result is followed by Woody Allen's voice going "I know, I know..."

  • Not the pioneer (Score:2, Informative)

    The media seems to be focusing a lot of attention on Powerset. But they seem to forget another startup which started innovating in the area of semantic search well before Powerset even arrived on the scene: Hakia. Read the following article, which does a decent job of comparing the two startups. http://www.centernetworks.com/powerset-hakia [centernetworks.com]
  • by kcdoodle ( 754976 ) on Tuesday May 13, 2008 @10:32AM (#23390896)
    Google just indexes words, word fragments, and groups of words.

    This is an effort (like many others) to create a semantic web.
    This means they are trying to discover the MEANING of words and sentences.
    Very edgy, dangerous stuff. The MEANING, once extracted, is expressed in still other words.
    So SOMEONE determines what a word or group of words means.

    This leads to classifying, identifying, sorting, drawing relations between ideas, concepts, events, animals, machines, planets, science, art, religion, basically everything you can express with words.

    This is what the human brain does. And every human brain does it a little bit differently. It is not the things we perceive that define our world and our place in it. It is the interrelations between things.

    I have been involved with several search engines, and the TAXONOMY OF KNOWLEDGE is exactly what is wanted/needed.

    Is it possible to create one? Sure.

    Is it hard? Yep, really, really, really, really hard.

    If you created one would it be correct? NO!
    It would only be ONE PERSON's vision of the relationships of knowledge, but NO ONE PERSON can speak for us all.

    Now all I have to say (after this rant) to creators of smart search engines is "GOOD LUCK"!!
