AI Software

Inside the 2012 Loebner Prize

An anonymous reader writes "Not a single judge was fooled by the chatbots in the 2012 Loebner Prize, which was won by the bot Chip Vivant. According to a journalist who was a human decoy in this year's Turing Test, interactions with the humans were a tad robotic, while the bots went off on crazy tangents, talking about being a cat and offering condolences for the death of a pet dragon."

  • by Serious Callers Only ( 1022605 ) on Thursday May 17, 2012 @09:11AM (#40027435)

    Of course, all the *real* chatbots are too busy with their day jobs: posting spam to Twitter and pumping out mass emails.

  • by infonography ( 566403 ) on Thursday May 17, 2012 @09:20AM (#40027537) Homepage

    This is funny, since Slashdot has been the main testing ground for chatbots. We have all had to read posts from them here. Do you really think all Anonymous trolls are really people? BTW, most of my enemies list are chatbots. Real people would not be nearly as stupid as these clowns.

  • by Lucas123 ( 935744 ) on Thursday May 17, 2012 @09:25AM (#40027575) Homepage
    At first glance, I read it as "Inside the 2012 Lobster Pie".
  • Siri? (Score:5, Funny)

    by dutchwhizzman ( 817898 ) on Thursday May 17, 2012 @09:26AM (#40027585)
    Did they ask the bots what the best smartphone was? We all know it's a bot if it didn't answer the N900.
  • went off on crazy tangents talking about being a cat and offering condolences for the death of a pet dragon

    Seriously, if a remote chat started talking to me like that, I'd say "Oh, hi Kim, I didn't know you were online".

  • by NicknamesAreStupid ( 1040118 ) on Thursday May 17, 2012 @10:09AM (#40028031)
    A friend of mine, who long ago worked for Thinking Machines, explained the weakness: "It is all about maintaining state." A stateless AI is far easier than a stateful one. Once the machine has to retain state, the algorithms become much more complex. Therefore, the way to test a bot is to say something like, "Remember this phrase, 'pink elephant'. I'm going to ask you after we have talked a while." Then have several exchanges and ask, "What was that animal I told you to remember?" Most humans (except Alzheimer's patients) will have no trouble with it, but the machine will fail. If they add a piece of logic to catch obvious clues like this, then a slight mod such as "have you ever seen a pink elephant? . . . what animal was I talking about?" will usually defeat it.

    Humans are actually very poor at remembering. Try to recall the color of the last Volkswagen you passed on the street. However, we have developed a natural ability to prioritize our memories based on context and our personal & social needs. We tend to remember most of what turns out to be relevant. Until AI develops a means to judge context, it will suffer the weakness of being out of touch with our reality.
    • While I am sure that your friend is mostly right, it ought to be easier for the bot to remember stuff than that.

      These are programs, right? Just allocate some data. Without trying to pun Facebook, just keep a file on every person from the bot's perspective, so if it understood "remember this animal" at all, then it just sets "Judge1 Likes Pink Elephants".

      I know "Devil is in the details" but I often feel I could design a chatbot that would never make certain kinds of mistakes. Getting totally lost, sure, that
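
The "keep a file on every person" idea from this exchange can be sketched in a few lines. This is only an illustration under stated assumptions: the `JudgeMemory` class, its method names, and the regex trigger are hypothetical and do not come from any actual Loebner entry. It shows that storing the phrase is the easy part; the brittle part is noticing that something worth storing was said at all.

```python
import re
from collections import defaultdict


class JudgeMemory:
    """Hypothetical per-judge note store (an assumption, not a real bot's design)."""

    def __init__(self):
        # Maps a judge name to the list of phrases that judge asked us to keep.
        self._notes = defaultdict(list)

    def observe(self, judge, utterance):
        """Store a phrase if the utterance matches one hard-coded request form."""
        # Naive trigger: only catches the literal "remember this phrase, '...'" wording.
        match = re.search(
            r"remember this phrase,?\s*['\"]([^'\"]+)['\"]",
            utterance,
            re.IGNORECASE,
        )
        if match:
            self._notes[judge].append(match.group(1))
            return True
        return False

    def recall(self, judge):
        """Return the most recently stored phrase for this judge, or None."""
        notes = self._notes.get(judge)
        return notes[-1] if notes else None


memory = JudgeMemory()
memory.observe("Judge1", "Remember this phrase, 'pink elephant'. I'm going to ask you later.")
# The reworded probe is not recognized, so nothing new is stored for it.
memory.observe("Judge1", "Have you ever seen a pink elephant?")
print(memory.recall("Judge1"))  # prints: pink elephant
```

The second call illustrates the parent comment's point: a slightly reworded probe slips past the hard-coded trigger, so a bot that had only heard that version would have nothing to recall.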

  • by TedTschopp ( 244839 ) on Thursday May 17, 2012 @10:23AM (#40028195) Homepage

    Can someone please explain to me how to read the chat logs? I am confused as to the actual exchange that is going on. Which transcript is the Bot, which is the human and how am I to sync the two parts of the conversation up?

    • by Fwipp ( 1473271 )

      Agreed - I am having no success piecing together these conversations.

      • They are a pain to synchronize, since the judges and humans both try to trick each other, so they are not much more coherent than the bot convos. But the bots just eventually end up flailing about. They weren't kidding about the "I am a cat" answer. I think the question was "What is your name?" I had to stop reading after the first 2 because people started looking at me funny, wondering why I was laughing so much.

    • Contestants on the top, judge on the bottom. Take 3 seconds and you can guess whether the robots are on the left or right. If you take more than that, you are a bot yourself :P

  • /. has had users like this for years!
  • by deksza ( 663232 ) on Thursday May 17, 2012 @11:52AM (#40029515)
    I'm happy for all the bots that got to compete this year, but I was a little unhappy with the preliminary round of this year's competition compared to other years I entered. Only 4 entries can make it to the final round of the competition. There were 12 entries this year, but 7 were disqualified due to contest management (Hugh Loebner) not having enough technical knowledge to get the entries working. Some well-known bots based on ALICE AIML were disqualified, Cleverbot was disqualified, and my own Ultra Hal was disqualified ( http://www.zabaware.com/webhal [zabaware.com] ). Internet communication is prohibited, so we all have to send the bots as self-installing programs that can utilize the contest's LPP protocol. My own bot is Linux based, which is a big hurdle for the preliminary round. I sent it as a VirtualBox image to simplify things for contest management, but he didn't know how to deal with it.

    But luckily there will be another competition this year, held at Bletchley Park on June 23rd as part of Alan Turing's centennial and recognized by the Olympics: http://www.reading.ac.uk/news-and-events/releases/PR445524.aspx [reading.ac.uk]. Some of the disqualified bots, including my own, will be competing there.
    • my own Ultra Hal was disqualified

      Since I'm assuming it doesn't outdo Hal in terms of actual sentient intelligence, is it safe to guess that the "Ultra" comes from being less likely to try to murder you?

      Or is it that it is ultra likely to try to murder you?

      • by deksza ( 663232 )
        I'm pretty sure I didn't write any murder code, although I heard of customers writing plug-ins to control X10 devices based on conversational triggers, so you never know...
    • by dbjh ( 980477 )
      Did he provide that as *the* reason? I just tried your Ultra Hal because of what you wrote, but it failed rather spectacularly. The first two tries it failed in its first response. The third time I accepted some of the weirder responses, but Ultra Hal definitely cannot keep a conversation going (without clearly showing it is a chat bot). However, I was rather disappointed by the low quality of the contestants of the Loebner prize, so maybe it's just me.
  • by SmallFurryCreature ( 593017 ) on Thursday May 17, 2012 @11:53AM (#40029525) Journal

    This is NOT about AI, this is a bunch of whankers whanking.

    READ the chat logs: it is not a human trying to see if an AI can hold a realistic conversation, but rather trying to see if they can trip a human/AI up. This is like trying to prove cats can't see color by poking their eyes out and then saying "AHA! See, it can't see color". Or me proving you are lousy at playing catch by shooting you in the head, then mocking your mother for bearing such a clumsy child.

    Hell, read the chat logs and try to tell who the humans are. It is sad to say that AI chat programs are still not that much more advanced than Lisa or whatever the name was, but what's the point in trying to test if chat AI can respond to insane questioning? Congrats, you are now the winner of the bot best capable of holding a conversation with an insane person. WHOO!

    Don't let these people near the flying car concept: their test track would include surface-to-air missiles because, you know, that is a good test of a flying car.

    • And I reply that a perfectly valid branch of AI is designing defensive routines against trick questions. I feel that is an area the contestants don't pay enough attention to.

      You know, like the one in one of the logs: "time flies like an arrow, fruit flies like a banana, which is the simile?" The bot should kick that right back to the judge with "what's a simile?" and follow it with "nah, I never liked that English class crap".

      Or an even crazier example, something like "would Richard Stallman fit in a breadbox?" it should
