Follow Slashdot stories on Twitter

Inside the 2012 Loebner Prize 68

Posted by timothy on Thursday May 17, 2012 @10:06AM from the actually-they're-just-toying-with-us dept.

An anonymous reader writes "Not a single judge was fooled by the chatbots in the 2012 Loebner Prize, which was won by the bot Chip Vivant. According to a journalist who was a human decoy in this year's Turing Test, interactions with the humans was a tad robotic while the bots went off on crazy tangents talking about being a cat and offering condolences for the death of a pet dragon."

This discussion has been archived. No new comments can be posted.

Inside the 2012 Loebner Prize

Load All Comments

Search 68 Comments Log In/Create an Account

Comments Filter:

- Re: (Score:1)
  
  by Anonymous Coward writes:
  
  From a programming perspective, algo trading is simple math, not AI. The challenges lie in A) Processing the ungodly amount of data necessary in a short enough time period to be relevant and B) Coming up with a formula that accurately identifies good trades. There is literally no connection to AI, you don't want the program to "pick" anything, only to quickly identify predefined situations.
  - Re: (Score:2)
    
    by crazyjj ( 2598719 ) * writes:
    
    From a programming perspective, algo trading is simple math, not AI.
    
    I don't think you appreciate just how complex trading software has become. The big players have moved WAY beyond "If stock X dips below Y price, then sell."
    - Re: (Score:3)
      
      by Dr_Barnowl ( 709838 ) writes:
      
      They make financial instruments complex on purpose as well. Can't understand it? Can't tax it!
      Then your system has to model trading these things... you either end up with fractal complexity, or dangerous assumptions...
    - - Re: (Score:3)
        
        by Sarten-X ( 1102295 ) writes:
        
        I currently work at a financial company, and I agree. One of our algorithms for "buy or sell" currently relies on a few hundred data points for a few hundred securities. The algorithm boils down to "did this particular statistic recently rise or fall more than a certain amount based on its history?". Sure, the code is a little hairy in places, but not too bad with adequate commenting. There is nothing I'd consider AI, as the program doesn't attempt to learn. When the investment folks see something that shou
        
        Re: (Score:1)
        
        by omitsura ( 1503131 ) writes:
        
        IV surface modeller? Vol below, buy; vol in 90th percentile sell?
    - Re: (Score:2)
      
      by TheLink ( 130905 ) writes:
      
      Meh, from what I see if you're one of the favoured companies when you screw up big time they'd roll back the trades for you. I bet even someone like me can make a lot of money if my trades got rolled back whenever I screwed up big. And if you screwed up really big you get a bailout and keep your bonus.
      There was also that infamous case when humans outsmarted the algo but they got prosecuted, convicted and lost their profits: http://www.ft.com/cms/s/0/f9d1a74a-d6f3-11df-aaab-00144feabdc0.html [ft.com]
      Why should they g
  - Re: (Score:2)
    
    by jeffmeden ( 135043 ) writes:
    
    From a programming perspective, algo trading is simple math, not AI. The challenges lie in A) Processing the ungodly amount of data necessary in a short enough time period to be relevant and B) Coming up with a formula that accurately identifies good trades. There is literally no connection to AI, you don't want the program to "pick" anything, only to quickly identify predefined situations.
    What do you exactly think is the difference between current AI and "an algorithm that quickly identifies predefined situations"? Aside from the obvious example of natural language interaction that was hilariously pointed out in the OP. AI is nothing more than an (occasionally very complex) attempt at hacking English into a variety of numbers, and then performing math on them in interesting ways (what words are statistically likely to show up near what other words, and in what order, for example.) Algorit
    - Re: (Score:3)
      
      by Samantha Wright ( 1324923 ) writes:
      
      You and the GP keep using the word "AI". I do not think it means what either of you think it means.
      Algorithmic trading does indeed involve a great deal of classical machine learning, which is one area of focus within the very broad field of artificial intelligence. What you're ascribing to "AI" is the specific goal of artificial human intelligence, regularly granted the misnomer of general artificial intelligence. The example you gave would be of more interest to translators of unknown languages than to peo
I've seen things you wouldn't believe (Score:5, Funny)

by Serious Callers Only ( 1022605 ) writes: on Thursday May 17, 2012 @10:11AM (#40027435)

Of course, all the *real* chatbots are too busy with their day job - posting spam to twitter and pumping out mass emails.

Share
twitter facebook
- Closer to home (Score:4, Funny)
  
  by Theophany ( 2519296 ) writes: on Thursday May 17, 2012 @10:26AM (#40027589)
  
  Hello sir, have you heard of CleanMyPC...?
  
  Parent Share
  twitter facebook
  - Re: (Score:3)
    
    by tepples ( 727027 ) writes:
    
    I haven't heard of CleanMyPC, but I've heard of MyCleanPC [xubuntu.org]. If you back up your files, wipe everything off your PC, and install MyCleanPC [xubuntu.org], your PC will be clean. Then you can install thousands of free applications [ubuntu.com] and reinstall many applications that are already your favorites [winehq.org]. The last time I cleaned a relative's PC that had about three fake antiviruses [wikipedia.org] on it and embedded deeply, I did just that: wiped it, installed MyCleanPC [xubuntu.org], and moved a couple things around to make it look like it used to, except cleaner
    - Re: (Score:2)
      
      by Coren22 ( 1625475 ) writes:
      
      Offtopic? Just used my points up or that would have gotten insightful. That was a great post.
      - Re: (Score:2)
        
        by mcgrew ( 92797 ) * writes:
        
        I woosh the moderators would have paid attention to where the link led.
        
        Re: (Score:2)
        
        by tepples ( 727027 ) writes:
        
        I woosh the moderators would have paid attention to where the link led.
        So I guess the "precise pangolin" part wasn't enough. I wonder what I should do to make it clearer before I install the last line of the post as my new sig sometime in June.
    - Re: (Score:2)
      
      by Hognoxious ( 631665 ) writes:
      
      Why do you cower?
      MyCleanPC = stagnated.
  - Re: (Score:2)
    
    by mcgrew ( 92797 ) * writes:
    
    That was the best satire of that spam I've seen yet, congrats! It actually made me laugh, unlike most of them.
    - Re: (Score:2)
      
      by Theophany ( 2519296 ) writes:
      
      I was tempted to do a LucasArts reference... "I'm selling this fine antivirus software..." but couldn't make it work...
slashdot should charge for the chatbot testing. (Score:3, Funny)

by infonography ( 566403 ) writes: on Thursday May 17, 2012 @10:20AM (#40027537) Homepage

This is funny since Slashdot has been the main testing ground for chatbots. We have all had to read posts from them here, do you really think all Anonymous trolls are really people? BTW. most of my enemies list are chatbots Real people would not be nearly as stupid as these clowns.

Share
twitter facebook
- Re: (Score:3)
  
  by Jeng ( 926980 ) writes:
  
  If you are being paid to post you post what you are paid to post.
  Money can buy stupidity.
- Re: (Score:2)
  
  by Coren22 ( 1625475 ) writes:
  
  How does that make you feel?
I must be hungry (Score:5, Funny)

by Lucas123 ( 935744 ) writes: on Thursday May 17, 2012 @10:25AM (#40027575) Homepage

At first glance, I read it as "Inside the 2012 Lobster Pie".

Share
twitter facebook
- Re: (Score:1)
  
  by Anne_Nonymous ( 313852 ) writes:
  
  Dang, now I'm jonesing some crawfish pie [cajuncrawfishpie.com], and don't be stingy with the butter.
- The Hungry Hungry Games (Score:1)
  
  by tepples ( 727027 ) writes:
  
  But are you as hungry [youtube.com] as the hippos [woot.com] in The Hunger Games [huffingtonpost.com]?
  - Re: (Score:2)
    
    by Hognoxious ( 631665 ) writes:
    
    I've eaten hippo. Not a hippo, obviously.
- Re: (Score:2)
  
  by Kadagan AU ( 638260 ) writes:
  
  I read "Inside the 2012 Lobster Prize" multiple times today. I thought it was connected to "Deadliest Catch" somehow..
Siri? (Score:5, Funny)

by dutchwhizzman ( 817898 ) writes: on Thursday May 17, 2012 @10:26AM (#40027585)

Did they ask the bots what was the best smartphone? We all know it's a bot if they didn't answer the N900

Share
twitter facebook
- Re:This FP for gNAA (Score:5, Funny)
  
  by makomk ( 752139 ) writes: on Thursday May 17, 2012 @10:48AM (#40027837) Journal
  
  I'm kinda conflicted - is this off-topic or on-topic?
  
  Parent Share
  twitter facebook
  - Re: (Score:2)
    
    by rvw ( 755107 ) writes:
    
    I'm kinda conflicted - is this off-topic or on-topic?
    I think you would be an excellent judge for next years competition.
Sounds like a friend of mine (Score:2)

by PhilHibbs ( 4537 ) writes:

went off on crazy tangents talking about being a cat and offering condolences for the death of a pet dragon
Seriously, if a remote chat started talking to me like that, I'd say "Oh, hi Kim, I didn't know you were online".
The Achilles' Heel of AI (Score:5, Interesting)

by NicknamesAreStupid ( 1040118 ) writes: on Thursday May 17, 2012 @11:09AM (#40028031)

A friend of mine, who long ago worked for Thinking Machines, explained the weakness, "It is all about maintaining state." A stateless AI is far easier than a stateful one. Once the machine has to retain state, the algorithms become logarithmically more complex. Therefore, the way to test a bot is to say something like, "Remember this phrase, 'pink elephant'. I'm going to ask you after we have talked a while.." Then have several exchanges and ask, "What was that animal I told you to remember?" Most humans (except Alzheimer patients) will have no trouble with it, but the machine will fail. It they add a piece of logic to catch obvious clues like this, then a slight mod such as "have you ever seen a pink elephant? . . . what animal was I talking about?" will usually defeat it.

Humans are actually very poor at remembering. Try to recall the color of the last Volkswagen you passed on the street. However, we have developed a natural ability to prioritize our memories based on context and our personal & social needs. We tend to remember most of what turns out to be relevant. Until AI develops a means to judge context, it will suffer the weakness of being out of touch with our reality.

Share
twitter facebook
- Re:Maintaining State (Score:2)
  
  by TaoPhoenix ( 980487 ) writes:
  
  While I am sure that your friend is mostly right, it ought to be easier for the bot to remember stuff than that.
  These are programs, right? Just allocate some data. Without trying to pun Facebook, just keep a file on every person from the bot's perspective, so if it understood "remember this animal" at all, then it just sets "Judge1 Likes Pink Elephants".
  I know "Devil is in the details" but I often feel I could design a chatbot that would never make certain kinds of mistakes. Getting totally lost, sure, that
Serious Question (Score:3)

by TedTschopp ( 244839 ) writes: on Thursday May 17, 2012 @11:23AM (#40028195) Homepage

Can someone please explain to me how to read the chat logs? I am confused as to the actual exchange that is going on. Which transcript is the Bot, which is the human and how am I to sync the two parts of the conversation up?

Share
twitter facebook
- Re: (Score:2)
  
  by Fwipp ( 1473271 ) writes:
  
  Agreed - I am having no success piecing together these conversations.
  - Re: (Score:2)
    
    by bryan1945 ( 301828 ) writes:
    
    They are a pain trying to synchronize, since the judges and humans both try to trick each other, so they are not much more coherent than the bot convos. But the bots just eventually end up flailing about. They weren't kidding about "I am a cat" answer. I think the question was "What is your name?" I had to stop reading after the first 2 because people started looking at me funny wondering why I was laughing so much.
- Re: (Score:2)
  
  by Carewolf ( 581105 ) writes:
  
  Contestants on the top, judge on the bottom. Take 3 second and you can guess whether the robots are on the left or right. If you take more than that you are a bot yourself :P
  - Re: (Score:2)
    
    by TedTschopp ( 244839 ) writes:
    
    Well, I definitely failed that test, I guess I'm a bot. What do you do now that you wake up and know that you are an AI?
    - Re: (Score:2)
      
      by Beardo the Bearded ( 321478 ) writes:
      
      You fulfil your programming objectives, same as the meatbots.
      - Re: (Score:2)
        
        by Coren22 ( 1625475 ) writes:
        
        MY programming objectives include getting laid, unfortunately I am failing at that...
Bots that can carry conversations? (Score:1)

by otaku244 ( 1804244 ) writes:

/. has had users like this for years!
- Re: (Score:2)
  
  by Coren22 ( 1625475 ) writes:
  
  Your mum's face has users like this.
  - Re: (Score:2)
    
    by Hognoxious ( 631665 ) writes:
    
    In Soviet Russia, users have faces like your mum!
Disqualified bots/Alan Turing 100 competition (Score:5, Informative)

by deksza ( 663232 ) writes: on Thursday May 17, 2012 @12:52PM (#40029515)

I'm happy for all the bots that got to compete this year, but I was a little unhappy on the preliminary round of this years competition compared to other years I entered. Only 4 entries can make it to the final round of the competition. There were 12 entries this year but 7 were disqualified due to contest management (Hugh Loebner) not having enough technical knowledge to get the entries working. Some well known bots based on ALICE AIML were disqualified, Cleverbot was disqualified, and my own Ultra Hal was disqualified ( http://www.zabaware.com/webhal [zabaware.com] ) Internet communication is prohibited so we all have to send the bots as self installing programs that can utilize the contests LPP protocol. My own bot is Linux based, which is a big hurdle for the preliminary round, but I sent it as a virtual box image to simplify it for contest management, but he didn't know how to deal with it.

But luckily there will be another competition this year as part of Alan Turing's 100 year centennial at Bletchley Park on June 23rd and recognized by the Olympics http://www.reading.ac.uk/news-and-events/releases/PR445524.aspx [reading.ac.uk] Some of the disqualified bots including my own will be competing there.

Share
twitter facebook
- Re: (Score:2)
  
  by Chris Burke ( 6130 ) writes:
  
  my own Ultra Hal was disqualified
  
  Since I'm assuming it doesn't outdo Hal in terms of actual sentient intelligence, is it safe to guess that the "Ultra" comes from being less likely to try to murder you?
  Or is it that it is ultra likely to try to murder you?
  - Re: (Score:1)
    
    by deksza ( 663232 ) writes:
    
    I'm pretty sure I didn't write any murder code, although I heard of customers writing plug-ins to control X10 devices based on conversational triggers, so you never know...
- Re: (Score:2)
  
  by dbjh ( 980477 ) writes:
  
  Did he provide that as *the* reason? I just tried your Ultra Hal because of what you wrote, but it failed rather spectacularly. The first two tries it failed in its first response. The third time I accepted some of the weirder responses, but Ultra Hal definitely cannot keep a conversation going (without clearly showing it is a chat bot). However, I was rather disappointed by the low quality of the contestants of the Loebner prize, so maybe it's just me.
What's the point? Nobody cares (Score:4, Insightful)

by SmallFurryCreature ( 593017 ) writes: on Thursday May 17, 2012 @12:53PM (#40029525) Journal

This is NOT about AI, this is a bunch of whankers whanking.
READ the chat logs, it is not a human trying to see if an AI can hold a realistic conversation but rather to see if it can trip a human/ai up. This is like trying to proof cats can't see color by poking its eyes out and then saying "AHA! See it can't see color". Or me proving you are a lousy at playing catch by shooting you in the head then mocking your mother for bearing such a clumsy child.
Hell, read the chat logs and try to tell who the humans are. It is sad to say that AI chat programs are still not that much more advanced then Lisa or whatever the name was but whats the point in trying to test if chat AI can respond to insane questioning? Congrats, you are now the winner of the bot best capable of holding a conversation with an insane person. WHOO!
Don't let these people near the flying car concept, their test track would include surface to air missles because you know, that is a good test of a flying car.

Share
twitter facebook
- Re:AI (Score:2)
  
  by TaoPhoenix ( 980487 ) writes:
  
  And I reply that a perfectly valid branch of AI is designing defensive routines against trick questions. I feel that is an area the contestants don't pay enough attention to.
  You know, like the one in one of the logs "time flies like an arrow, fruit flies like a banana, which is the simile?" should kick right back to the judge with "what's a simile?" and follow it with "nah, I never liked that english class crap".
  Or an even crazier example, something like "would Richard Stallman fit in a breadbox?" it should
  - - Re:progress (Score:2)
      
      by TaoPhoenix ( 980487 ) writes:
      
      Naw, why one or the other? Just do both. We could have the manpower is we "wanted to". You know, instead of artificially limiting the field to 3 man operations.
      Say that 15 people in the 100 person team design the defensive routines, which are in some ways simpler because all trick questions have low legit semantic content. Plus speaking of humans it's what siblings and college students do to each other all the time.
      The other 85 members of the team can go back to regular language processing.

There may be more comments in this discussion. Without JavaScript enabled, you might want to turn on Classic Discussion System in your preferences instead.

Re: (Score:1)

Re: (Score:2)

Re: (Score:3)

Re: (Score:3)

Re: (Score:1)

Re: (Score:2)

Re: (Score:2)

Re: (Score:3)

I've seen things you wouldn't believe (Score:5, Funny)

Closer to home (Score:4, Funny)

Re: (Score:3)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

slashdot should charge for the chatbot testing. (Score:3, Funny)

Re: (Score:3)

Re: (Score:2)

I must be hungry (Score:5, Funny)

Re: (Score:1)

The Hungry Hungry Games (Score:1)

Re: (Score:2)

Re: (Score:2)

Siri? (Score:5, Funny)

Re:This FP for gNAA (Score:5, Funny)

Re: (Score:2)

Sounds like a friend of mine (Score:2)

The Achilles' Heel of AI (Score:5, Interesting)

Re:Maintaining State (Score:2)

Serious Question (Score:3)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Bots that can carry conversations? (Score:1)

Re: (Score:2)

Re: (Score:2)

Disqualified bots/Alan Turing 100 competition (Score:5, Informative)

Re: (Score:2)

Re: (Score:1)

Re: (Score:2)

What's the point? Nobody cares (Score:4, Insightful)

Re:AI (Score:2)

Re:progress (Score:2)

Related Links Top of the: day, week, month.

Slashdot Top Deals