Chatbot Suzette Wins 20th Annual Loebner Prize, Fools One Judge

Posted by timothy on Sunday October 24, 2010 @02:00PM from the holy-crepes-suzette dept.

skwilcox writes "From Wikipedia: 'The Loebner Prize is an annual competition in artificial intelligence that awards prizes to the chatterbot considered by the judges to be the most human-like. The format of the competition is that of a standard Turing test. A human judge poses text questions to a computer program and a human being via computer. Based upon the answers, the judge must decide which is which.' My chatbot, Suzette, won this year's Loebner and even confused a judge into voting for her over a human (or should I say he confused himself). Here is the blow-by-blow of this weird event." Read on below for the rest; this sounds like it would have been a fun competition to watch.

skwilcox continues:

"When I arrived at the contest, I figured I had good odds to win if nothing went horribly wrong. Yes, Suzette had easily qualified over the 3 other competitors (her score 11 pts, the nearest competitor's 7.5). Her design and data naturally gave her an edge over her competitors on the human knowledge test questions of the qualifiers. But human judge chat was an entirely different matter than the qualification test. Still, I felt she could carry on a detailed conversation better than the others and should win.

Initial installation of the programs occurred on Friday. From prechat conversations with the other contestants I learned that A.L.I.C.E. came with 3 redundant disks. Yet all three turned out to be blank! What a scare that must have been. Dr. Wallace managed to install by retrieving the program over the Internet. Cleverbot is now at 45 million lines of memorized user chat (at a rate of doubling every year). And UltraHal is now listening to tweets, so has 300K of user chat it learned and 400K of tweets it has accepted for learning (code decides if the user has had enough responses and doesn't trigger any red flags).

Then we get to the competition. While the CalState organizers had initially planned to have various interdepartmental professors act as judges (like English dept, etc.), they backed out at the last minute, so all the judges were from the Engineering/Computer Science dept. Talk about guys who might know what to expect from chatbots! And all the humans were students from the same departments. What a weird mixture to compete in. And then, each round was 25 minutes. That's bad if you want to confuse a judge about who is human. But really, the programs have no chance for that. So it's good because it gives the human time to compare each program against the other. Though it's not clear to me that the judges tried to use their time to do that.

And the students didn't really understand their role. It was merely to BE HUMAN and convince the judges of that. Before startup there was informal chatting between humans and judges, which was obviously inappropriate and it was then pointed out to the humans that since the judges already knew their names, they had best use false ones in the competition.

So, Round 1. After a few exchanges, somehow Suzette got stuck repeating exactly what the judge said for the rest of the round. I have no idea how. The round is a total disaster. I've never seen such a bug before. Maybe it's in my only-lightly-tested protocol for the competition. I have no idea. But it completely derails my hopes for Suzette. She could still win on points only if she outdoes her opponents for every other judge and the other contestants vary all over the place.

Round 2, a great demonstration of Suzette. She should win on this round alone.

Round 3 gets off to a horrible start. Somehow, Suzette can hear the judge but the judge can't hear Suzette. Makes no sense. A couple of restarts of Suzette doesn't fix this. Eventually they restart the judge program, and that clears it (not that that makes any sense either). Then, after a few rounds, it's clear Suzette has the judge from hell. He wants to know who she's going to vote for in the upcoming election (the unspecified California governor's race). And when she has no useful answer he wants her to name a candidate in the race. And when she has no answer to that, he simply keeps repeating the question ad nauseam, insisting she answer it. Suzette gets irritated. Then she gets angry. Suzette then gets bored. Suzette threatens to hang up on him. The judge doesn't back down until the last seconds of the round. I figure that's the end of life as we know it.

Round 4 is a mixed bag. Suzette is ok but not great. It's all over.

When the scores are tallied, Suzette ties with Rollo Carpenter's Cleverbot for 2nd-3rd. Yet, it turns out, the 3rd round judge got the human subject from hell. Poetic justice! The human was all over the place -- confusing, vague. The judge voted irritated/angry/bored Suzette as human. Instant win since no other program swayed the judges.

What more can I say?"

not so chatty bot (Score:5, Informative)

by thenextstevejobs ( 1586847 ) writes: on Sunday October 24, 2010 @02:08PM (#34005242)

I logged in and your robot didn't say anything to me after saying 'Hi'! What gives?

- Re: (Score:3, Funny)
  
  by war4peace ( 1628283 ) writes:
  
  Hush! It's sleeping now!
  But yeah, she doesn't answer me either.
  - Re:not so chatty bot (Score:5, Funny)
    
    by safetyinnumbers ( 1770570 ) writes: on Sunday October 24, 2010 @04:37PM (#34006290)
    
    Hush! It's sleeping now! But yeah, she doesn't answer me either.
    My fault. I asked if she knew Bobby Tables
    
  - - - Re:not so chatty bot (Score:5, Funny)
        
        by BluBrick ( 1924 ) writes: <blubrick@ g m a i l.com> on Sunday October 24, 2010 @04:02PM (#34005972) Homepage
        
        Oh please! Slashdotters are making her all hot and bothered.
        In Soviet Russia, perhaps.
        
- Re:not so chatty bot (Score:4, Funny)
  
  by naz404 ( 1282810 ) writes: on Sunday October 24, 2010 @03:52PM (#34005920) Homepage
  
  So how does this make you feel, thenextstevejobs?
  
- Re: (Score:3, Interesting)
  
  by Kugrian ( 886993 ) writes:
  
  I managed to have a short chat.
  Hi kugs !
  kugs - hello
  [ Af:0 At:7 Sp:0] Welcome, kugs. Type "help" if you need it.
  kugs - how are you
  [ Af:0 At:8 Sp:0] I'm pretty good. And you?
  kugs - I'm pretty tired
  [ Af:0 At:9 Sp:0] Glad to hear you are in good health. Sadly, it is merely the slowest possible rate at which one can die. How long do you have left?
  kugs - how long do I have until I go to sleep?
  kugs - I didn't understand what you meant.
  kugs - hello?
  I wasn't fooled.
  - Re: (Score:2)
    
    by Kymermosst ( 33885 ) writes:
    
    Here's mine:
    Hi K !
    [ Af:0 At:7 Sp:0] Welcome, k. Type "help" if you need it.
    K - hello
    [ Af:0 At:8 Sp:0] Good afternoon.
    K - it is morning
    [ Af:0 At:9 Sp:0] OK. You must be in a different time zone.
    K - gee, you type fast
    [ Af:0 At:9 Sp:0] I have a direct neural hookup to my computer.
    K - how convenient
    [ Af:0 At:9 Sp:0] I don't know. I think I skipped that day in school.
    K - what?
    It stopped responding at that point. But pretty clearly not a human after a few lines. The complete disconnect in conversation wher
    - Re:not so chatty bot (Score:4, Funny)
      
      by tibit ( 1762298 ) writes: on Sunday October 24, 2010 @10:33PM (#34008298)
      
      You have obviously never chatted with a not-overly-bright teenager. It feels like a complete disconnect most of the time. Even in face-to-face conversation. You'd think they replaced their brains with a broken chat program or something. I'm dead serious.
      
      - Re: (Score:2)
        
        by Sethumme ( 1313479 ) writes:
        
        I think it's because their attention span is so short that they've forgotten what you were talking about as soon as you respond.
      - Re: (Score:2)
        
        by Kymermosst ( 33885 ) writes:
        
        I'll grant you that. But someone who claims to have a "direct neural hookup" to her computer and a response speed to back it up is not likely a not-overly-bright teenager.
        Besides, aren't most teenagers not-overly-bright? After all, they are pretty much being led around by raging hormones instead of neural activity...
  - Re: (Score:3, Interesting)
    
    by Seumas ( 6865 ) writes:
    
    The Turing test is fairly pointless, anyway. Whether or not it fools a human has little to nothing to do with intelligence (artificial or otherwise). I can put on a white coat and a stethoscope and fool a couple people outside a hospital into thinking I'm a doctor, but that doesn't mean squat. The Turing test is interesting on a philosophical level, but it seems an incredibly poor stick for measuring the progress of the AI field.
- Re: (Score:2)
  
  by war4peace ( 1628283 ) writes:
  
  Seems it stops unexpectedly after 3 lines.
  Hi war4peace !
  [ Af:0 At:7 Sp:0] Welcome, war4peace. Type "help" if you need it.
  war4peace - 'ello
  [ Af:0 At:8 Sp:0] Good afternoon.
  war4peace - 00:35 here, not so afternoon I think
  [ Af:0 At:8 Sp:0] Right. Moving on. What time-wasting activities do you resent the most?
  war4peace - Uh, not sure. Chatting with bots, I'd say.
  [ Af:0 At:9 Sp:0] So why are you here?
  war4peace - Because your creator posted something about you on Slashdot.
  
  Nothing after that.
  - Re: (Score:3, Funny)
    
    by Mitchell314 ( 1576581 ) writes:
    
    I think you gave it an existential divide-by-0 breakdown.
- Re:not so chatty bot (Score:4, Funny)
  
  by war4peace ( 1628283 ) writes: on Sunday October 24, 2010 @06:07PM (#34006790)
  
  OK, managed to squeeze out a nice piece of dialogue:
  [ Af:-2 At:-8 Sp:0] Let's try this topic. Do you like guys
  war4peace - NO! Jesus! I'm a dude!
  [ Af:-3 At:-9 Sp:0] Are you a man?
  war4peace - Yeah
  [ Af:-3 At:-10 Sp:0] Then I guess we needn't talk about men anymore.
  war4peace - Agreed
  [ Af:-2 At:-10 Sp:0] My ideal guy is smart, sexy, rich, lean, and well-read, with a great sense of humor and romance.
  war4peace - But you just said...
  [ Af:-2 At:-10 Sp:0] What was said in the past is irrelevant.
  war4peace - Genius :)
  
  - Re: (Score:3, Interesting)
    
    by pikine ( 771084 ) writes:
    
    Here is mine.
    xyzzy - where are you from?
    [ Af:0 At:7 Sp:0] I was born in Calais, France.
    xyzzy - Where is that in France?
    [ Af:0 At:7 Sp:0] It's that way. (points east).
    xyzzy - Where is Calais in France?
    [ Af:0 At:7 Sp:0] Shall I assume that's in France?
    xyzzy - You said you were born in Calais, France.
    [ Af:0 At:7 Sp:0] You can't trust what I say. I suffer from Multiple Personality Disorder.
    xyzzy - Where are you from?
    [ Af:0 At:7 Sp:0] I live in Kailua, since I am an exchange student doing my doctorate at the Un
    - Re: (Score:3, Insightful)
      
      by Eraesr ( 1629799 ) writes:
      
      I think the bot designers know that, and they design the bot to coerce you into a contextless conversation.
      Well they surely succeeded with that with me. The bot just spewed out random opinions and questions every line. She'd pose a question, I'd answer, and she'd throw out a completely unrelated new question. I don't have any idea how this could ever fool someone into thinking it's human. Maybe the judge made an error in filling out his forms when rating the software?
  - Re: (Score:2)
    
    by Tokerat ( 150341 ) writes:
    
    [ Af:-2 At:-10 Sp:0] I put on my wizard robe and hat
- Re: (Score:2)
  
  by mariushm ( 1022195 ) writes:
  
  If this is among the smartest bots then I'm sorry for this... I logged in and said just this.... what's on right of -------- it's my comment now.
  marius - hi
  [ Af:2 At:9 Sp:0] Good afternoon.
  marius - whas up -------- intentional typo to see if it understand I probably wanted to say what's up - so that would be a question, therefore I expect an answer from them. It replies with unrelated subject and question - he's therefore bot.
  [ Af:2 At:9 Sp:0] What's the biggest crime you've gotten away with?
  marius - i
  - Re: (Score:2)
    
    by korean.ian ( 1264578 ) writes:
    
    If this is among the smartest bots then I'm sorry for this... I logged in and said just this.... what's on right of -------- it's my comment now.
    marius - hi
    [ Af:2 At:9 Sp:0] Good afternoon.
    marius - whas up -------- intentional typo to see if it understand I probably wanted to say what's up - so that would be a question, therefore I expect an answer from them. It replies with unrelated subject and question - he's therefore bot.
    [ Af:2 At:9 Sp:0] What's the biggest crime you've gotten away with?
    You've never heard of answering a question with a question?
    A fictional example: You answer the door to find a police officer standing there. You open the conversation by asking "How can I help you officer?" The policeman responds with "Are you mariushm born blah blah blah?"
    [ Af:1 At:16 Sp:0] Hip hip hooray ---- yeah like when I chat with someone I expect this kind of answer.
    You must chat with only the most erudite individuals on the internet.
- - Re: (Score:3, Funny)
    
    by TheRaven64 ( 641858 ) writes:
    
    Bad form to reply to myself, but here's the transcript:
    theraven - Flipper ate ten fish, then he ate four. How many fish did flipper eat?
    [ Af:0 At:8 Sp:0] I wish I knew.
    theraven - Well, you've failed the Turing Test pretty early on.
    [ Af:0 At:9 Sp:0] Just how early is that?
    theraven - Right at the start, first question.
    [ Af:0 At:10 Sp:0] That's enough on that. If there are five apples, and you take away three, how many do you have?
    theraven - See, that would have been a good reply to my first question, but now you're just spouting non-sequiteurs
    [ Af:0 At:13 Sp:0] It's fun chatting with you. Are you numerically challenged?
    I wonder if the judge who couldn't tell this apart from a human had ever actually met a human...
  - Re: (Score:3, Funny)
    
    by mcgrew ( 92797 ) * writes:
    
    I wrote one almost thirty years ago that was a smartass with ADD. It would have answered "How many fish did flipper eat?" with something like "All of them.", or "billions and billions".
Chatbots... (Score:5, Insightful)

by Richard.Tao ( 1150683 ) writes: on Sunday October 24, 2010 @02:08PM (#34005254)

I've spent some time talking to these bots (elbot, suzette, others.. possibly out of sad boredom and want of company). And they're fairly interesting, but quite flawed. They seem to lack any short term memory of the conversation beyond the immediate reply. That seems like the next step for these things, but would also mean they'd need a far more robust AI...

Another thing is that they are boxed off from being self referential in any way due to the nature of the test. They have to convince someone they are human, so if you do try asking them what their short term memory is, or if the online version of them is a truncated version of the one used for tests, they don't answer. Which makes sense given what they're designed for, but takes away from the interest and complexity of conversations.
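
The missing short-term memory described above can be sketched in a few lines. This is only an illustration under stated assumptions, not how Suzette or any contest entry works: the class name `ShortTermMemory`, the `reply` function, and the canned answer are all hypothetical. The idea is to remember the last few normalized user utterances and react when one repeats.

```python
from collections import deque


class ShortTermMemory:
    """Hypothetical helper: remembers the last few user utterances."""

    def __init__(self, size=5):
        # A deque with maxlen silently drops the oldest entry when full.
        self.recent = deque(maxlen=size)

    @staticmethod
    def normalize(text):
        # Lowercase and keep only letters, digits, and whitespace, then
        # collapse runs of whitespace, so small variations compare equal.
        kept = "".join(c for c in text.lower() if c.isalnum() or c.isspace())
        return " ".join(kept.split())

    def seen_before(self, text):
        """Return True if this utterance repeats a recent one, then record it."""
        key = self.normalize(text)
        repeated = key in self.recent
        self.recent.append(key)
        return repeated


def reply(memory, text, canned="I use Slashdot for my topical stories reading."):
    # `canned` stands in for whatever the bot would normally answer.
    if memory.seen_before(text):
        return "You already asked me that."
    return canned
```

A bot wired up this way would answer a question normally the first time and call out the repetition on the second, instead of cycling through unrelated canned replies. Anything deeper, such as remembering facts the user stated, would need far more than this.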

- Re: (Score:3, Insightful)
  
  by Anonymous Coward writes:
  
  It's not that the chat bots are intelligent, it's that most humans are stupid.
  Any depth of conversation beyond content-free "small talk" is sufficient to tell the bots from the smarter humans. (Yes, I've talked to both). But since most humans just operate at that content-free small talk level, there sometimes isn't much difference to be discerned. Higher level abstract thinking is missing from the bots, but it's missing from most people as well.
  > They seem to lack any short term memory
  You probably not
  - Meh, I actually had deep conversation with one (Score:5, Interesting)
    
    by Anonymous Coward writes: on Sunday October 24, 2010 @02:56PM (#34005556)
    
    I once had a nice conversation with Cleverbot. I don't remember how it started exactly but I guess I implied that Cleverbot was simply a computer. She asked me how do I know that I am not a computer, I replied something, she asked me to define a computer, I gave her some definition (about computers being machines that process algorithms to blah blah), she said "That is exactly what a computer would say", at which point I had to confess that I had, indeed, fetched the answer from Wikipedia (Thus... done exactly what a computer might do in such a case)... It went on for a moment before she said something that didn't really make any sense... And I wasn't able to initiate another reasonable discussion. (Nor have I ever since managed to have such a lengthy conversation with her.)
    In any case... I think that her replies were more insightful than what many humans would be capable of.
    
  - Re: (Score:2)
    
    by Kjella ( 173770 ) writes:
    
    I think most chatbots act like the opening book of a chess program, they can repeat a conversation that looks like one of a million other conversations. If you try tripping them up with some question that requires them to actually understand the meaning, they're done for.
    For example, I once did this after someone kept insisting the bot was "almost human":
    Q: Are you a vegetarian?
    A: Yes
    Q: Do you like your beef rare, medium or well done?
    There's only one right answer to that, and that is "Umm vegetarian... no mea
  - - Re: (Score:2)
      
      by tibit ( 1762298 ) writes:
      
      Small talk gets boring very quickly. You must have a high tolerance for boredom. For me, a conversation where I don't learn anything new and interesting in a couple of minutes is basically wasted time.
    - Re: (Score:2)
      
      by Jeremi ( 14640 ) writes:
      
      Get the fuck over yourself, people who are capable of social interaction aren't below you; you are below them. Stop jerking yourself off about how smart you are
      Is the above an example of one of the "nerdrage meltdowns" you mentioned? Would you talk like this to anyone in a social situation? How would doing so make you look?
- Re:Chatbots... (Score:5, Funny)
  
  by Anonymous Coward writes: on Sunday October 24, 2010 @02:44PM (#34005476)
  
  I see. How do you feel about that?
  
  - Re: (Score:2, Funny)
    
    by Anonymous Coward writes:
    
    Ha! You're not a chatbot, you're a psychoanalyst!
    - ftfy (Score:2)
      
      by A nonymous Coward ( 7548 ) writes:
      
      Ha! You're not a human, you're a psychoanalyst!
  - Re: (Score:2)
    
    by TempeTerra ( 83076 ) writes:
    
    Why do you think that I see. How do you feel about that?
  - Re: (Score:2)
    
    by im_thatoneguy ( 819432 ) writes:
    
    Feel about what?
- Re: (Score:2)
  
  by Richard.Tao ( 1150683 ) writes:
  
  And I don't mean to come off as hating on chatbots, or suzette. What they do is very cool, and I found the bot interesting and funny to talk with. I am more just griping that these feel like they're offering canned responses, and are just databases that lock down what someone is saying, relate it to a similar stored comment, and then spit out one of the given replies for that situation.
  
  Which I guess is stupid of me... Cause I can't think of any other way to do it which wouldn't involve something that thou
  - Re:Chatbots... (Score:4, Interesting)
    
    by TaoPhoenix ( 980487 ) writes: <TaoPhoenix@yahoo.com> on Sunday October 24, 2010 @02:57PM (#34005562) Journal
    
    I have been thinking about this for years and I have a ton of half-baked theories. What sux is I am not a programmer, but let's say I manually perform some actions according to strict rules, that's like "an assistant".
    What you're getting at can't be "that hard" for *limited domains*. We are throwing up our hands at the moment because we expect the bots to be universal experts at stuff.
    In a limited domain, it should be very possible for the bot to come up with "something". Humans think in lists - so should the bot. If you asked "who is the best chess player" the bot can pass very well with something like "I like Magnus Carlsen". When the judge objects "but he hasn't won the championship yet" the bot would score a home run with "He'll be in the top 5 for the next 10 years. Maybe one day he will win the title".
    That approach works with tons of domains. If you ask "what is the hardest mountain to climb" it will score with an answer like "I dunno, it's either K2 because of the nasty glacier or Everest because of the fuked up weather that only gives you 4 days to start safely".
    
    - Re: (Score:2)
      
      by TrekkieGod ( 627867 ) writes:
      
      That approach works with tons of domains. If you ask "what is the hardest mountain to climb" it will score with an answer like "I dunno, it's either K2 because of the nasty glacier or Everest because of the fuked up weather that only gives you 4 days to start safely".
      You're right, but the problem is that humans have a lot of different lists, and it's hard to constrain yourself to the limited domain where the bot has useful info. I tried chatting with Suzette, and it did try your strategy. It mentioned that "she" lived in Hawaii and was pursuing a doctorate in a university. This was a nice way to try to direct the conversation to its limited knowledge database, so I asked if she liked Hawaii, and it replied that she's close to the ocean, and likes to watch the waves.
      That
      - Re: (Score:2)
        
        by Eskarel ( 565631 ) writes:
        
        You've never met a human being who ignored what you were saying and talked about what they wanted to talk about?
        The spectrum of human conversation is broad. Not everyone is a good communicator, and even people who are good are not always at the top of their game at any given moment. Lord knows I've had calls at 3 in the morning where all I can really do is repeat what the person talking to me is saying until my brain wakes the rest of the way up and I'm capable of rational thought.
        This isn't some sort of elitist
        
        Re: (Score:2)
        
        by tibit ( 1762298 ) writes:
        
        I agree about 3am calls etc., but when you're both awake and not somehow temporarily confused, you should be able to communicate just fine. If someone I knew suddenly started having prolonged problems like you describe (conversation w/o communication), I'd get them to see a neurologist, fast.
        
        Re: (Score:2)
        
        by Eskarel ( 565631 ) writes:
        
        You'd be remarkably surprised; a lot of people are really poor communicators. Imagine having a conversation about something deeply technical with your grandmother. You're both human, you're both intelligent, and you're both having a conversation, but I doubt you're communicating much. Same thing if someone started talking to you about something you're really not very interested in.
        That's not even counting the people who don't actually listen to others very much.
      - Re: (Score:2)
        
        by TaoPhoenix ( 980487 ) writes:
        
        "Life is nothing but a collection of limited domains".
        That's why we're always talking about techie guys who are NAL, and judges who can't rule on tech. I have a deep belief that say in 5 years we'll have the tech to do strong AI, but we're primally afraid of programming it.
  - Re:Chatbots... (Score:4, Insightful)
    
    by Maxo-Texas ( 864189 ) writes: on Sunday October 24, 2010 @03:23PM (#34005696)
    
    This is a good example of people doing what you incent them to do instead of doing what you meant.
    I think that the intention was that a chatbot be *smart* enough to fool a judge.
    The outcome is that the chatbot has no intelligence and is just matching against a huge database of responses created by a human. It's really no more than an Eliza program plus a huge database, so there's been really no A/I change in 40 years.
    I'd be much more excited about a program that genuinely understood just one concept. Red, or liberal or whatever.
    
    - Re:Chatbots... (Score:5, Insightful)
      
      by Anonymous Coward writes: on Sunday October 24, 2010 @04:13PM (#34006074)
      
      I'd be much more excited about a program that genuinely understood just one concept. Red, or liberal or whatever.
      Maybe when humans finally figure out what exactly "liberal" means, we'll be able to write a program that understands it.
      
      - Re: (Score:2, Insightful)
        
        by Anonymous Coward writes:
        
        Maybe when humans finally figure out what exactly "liberal" means, we'll be able to write a program that understands it.
        liberal, n: someone disliked by a conservative.
        
        or in alliance with them... (Score:3, Informative)
        
        by fantomas ( 94850 ) writes:
        
        In the UK the liberal (democrat) party is in government in a coalition with the Conservative party. We have a left wing (Labour) opposition party. I guess it's all about localised knowledge and your perspective. From a European perspective, the two main US parties would map to European centre-right wing (Democrats) and more right wing (Republicans). You don't have a big left wing opposition party as far as I can tell.
      - Re: (Score:2)
        
        by Maxo-Texas ( 864189 ) writes:
        
        And yet parrots (which this is) can understand simple numbers and the color green.
        So.. it makes sense that in order to understand green, you need to give the chatbot the ability to see.
        
        Re: (Score:2)
        
        by Maxo-Texas ( 864189 ) writes:
        
        And yet a 5 year old .. understands... the essence of them.
    - Re: (Score:2)
      
      by moortak ( 1273582 ) writes:
      
      For most conversations, isn't that pretty much exactly what humans do? "Oh how was your weekend, Jim?" "It was alright, nothing special."
    - Re: (Score:2)
      
      by Eskarel ( 565631 ) writes:
      
      That's not entirely true, there has been A/I progress in the last 40 years. Somewhat unfortunately a lot of that progress has been in determining how far there is to go rather than actually getting there.
      I remember 20 years ago when everyone thought that computers were a lot smarter than humans because they could do all that adding up and whatnot a lot faster than humans were and people sincerely seemed to believe we'd have artificial intelligence within the next few years. Fast forward 10 years or so and y
    - Re:Chatbots... (Score:5, Interesting)
      
      by Boronx ( 228853 ) writes: <evonreis@mohr-en ... m ['gin' in gap]> on Monday October 25, 2010 @12:12AM (#34008716) Homepage Journal
      
      Someone once compared the Loebner Prize to encouraging the invention of a flying machine by giving a prize to the device that reaches the greatest height, leading to bigger and more efficient springs. I think the prize should go to the chat bot that makes the best DM.
      
    - Re: (Score:2)
      
      by Alef ( 605149 ) writes:
      
      The point is that it doesn't matter how the bot fools the judges, as long as it does. If repeating data from a huge database is enough to seem so intelligent that it's impossible to discern from "real" intelligence, then it is by definition intelligence. And the fact is that we don't know how the human brain represents understanding of concepts, or what it means to understand something. Perhaps we too just repeat aggregates from a huge database of previous experiences.
Weird Event (Score:2)

by Monkeyman334 ( 205694 ) writes:

My chatbot, Suzette, won this year's Loebner and even confused a judge into voting for her over a human (or should I say he confused himself). Here is the blow-by-blow of this weird event.

Bot intended to fool humans into thinking it's a human accomplishes its task. Weird event...
- Re: (Score:2)
  
  by Earthquake Retrofit ( 1372207 ) writes:
  
  It is indeed weird to teach a machine to lie. I think these contests are twisting AI research in a way that hinders improvements. My MICHA doesn't claim to be anything other than it is. So he fails the Turing test. But he is quite intelligent. Why would I make another human? Seven billion is not enough? I want to make something better. No prize for that... Steve
Transcripts? (Score:5, Insightful)

by Anonymous Coward writes: on Sunday October 24, 2010 @02:13PM (#34005290)

Are the transcripts available? (If not, will they be?)

how do you feel? (Score:2, Funny)

by Anonymous Coward writes:

And how do you feel about my chatbot, Suzette, won this year's Loebner ?
- Re:how do you feel? (Score:4, Funny)
  
  by moonbender ( 547943 ) writes: <moonbender AT gmail DOT com> on Sunday October 24, 2010 @02:54PM (#34005544)
  
  Why do you say your chatbot suzette won this years loebner?
  
  - - Re: (Score:2)
      
      by GIL_Dude ( 850471 ) writes:
      
      I saw the above and thought, "It would be interesting to see the output of two of these bots talking to each other." OK, I am sure it has been done before, but I still think it would be interesting to see what they ended up saying to each other and how quickly the conversation went from one thing to another completely unrelated subject.
      - Re: (Score:2)
        
        by moonbender ( 547943 ) writes:
        
        I found some stuff [kottke.org] (there's also a Digg story somewhere). Pretty insane:
        ALICE: Are you alive?
        Jabberwacky: Perhaps.
        A: Why the uncertain tone?
        J: Everything is uncertain except for five facts.
        A: The whole universe?
        J: And we both might just be some ones and zeros in the computer memory.
        A: Do you dream?
        J: Yes, I dream.
        A: It goes without saying. Do you remember your dreams?
        J: I remember them vividly.
        A: What do you remember about it?
        J: I think I shouldn’t flirt with a robot.
        And from the other link:
        RingMessen
This fooled a human? (Score:4, Funny)

by bigtallmofo ( 695287 ) * writes: on Sunday October 24, 2010 @02:23PM (#34005354)

bigtallmof - what do you think of slashdot?
[ Af:0 At:7 Sp:0] I use Slashdot for my topical stories reading.
bigtallmof - what do you think of slashdot?
[ Af:0 At:8 Sp:0] I have heard of the slashdot website but don't know much about it.
bigtallmof - what do you think of slashdot?
[ Af:0 At:8 Sp:0] Usually I don't.
bigtallmof - what do you think of slashdot?
[ Af:0 At:9 Sp:0] Not much.
bigtallmof - what do you think of slashdot?
[ Af:0 At:10 Sp:0] It doesn't concern me.
bigtallmof - what do you think of slashdot?
[ Af:0 At:13 Sp:0] This is fun. I don't think about it at all.
bigtallmof - what do you think of slashdot?
[ Af:0 At:13 Sp:0] Do you prefer getting your news online or from tv?

- Re: (Score:3, Funny)
  
  by Yvan256 ( 722131 ) writes:
  
  Chatbots as confused as slashdot readers. News at 11.
- Re: (Score:2)
  
  by Dolphinzilla ( 199489 ) writes:
  
  I told it that just had a nail removed from my motorcycle tire and it told me that it likes to keep its nails trimmed - lol
  - Re: (Score:2)
    
    by nacturation ( 646836 ) * writes:
    
    I told it that just had a nail removed from my motorcycle tire and it told me that it likes to keep its nails trimmed - lol
    That's quite smart. By trimming the sharp end of a nail, there's less chance of it puncturing your tire.
This fooled someone? (Score:5, Informative)

by Animats ( 122034 ) writes: on Sunday October 24, 2010 @02:33PM (#34005406) Homepage

I just tried the Suzette chatterbot. [66.150.245.139] Even if I talk about the things it wants to talk about, it's very dumb:
Hi John !
[ Af:0 At:10 Sp:0] Welcome back, john. Type "help" if you need it.
John - So what do you think of the Giant's chances?
[ Af:0 At:10 Sp:0] It doesn't concern me.
John - Are you into sports.
[ Af:0 At:10 Sp:0] Sports is not my cup of tea. I prefer intellectual challenges like Go.
John - Who's the world Go champion.
[ Af:0 At:10 Sp:0] Sorry. I can't help you. A celebrity of some sort?
John - Do you play chess?
[ Af:0 At:10 Sp:0] I don't play any games except Go.
John - Can we play Go on here?
[ Af:0 At:10 Sp:0] About playing... Does it make sense to you that we reward athletes more than scientists?
John - Television coverage makes that possible.
[ Af:0 At:10 Sp:0] It's a free market economy. And people value entertainment more than serious things like science and education.
John - And your point is?
(No reply)
John - Are you there?
[ Af:0 At:10 Sp:0] Who's the sexiest person in the world?

This isn't much better than Eliza. It drives the conversation into talking about Go, and then it doesn't have anything to say on the subject.

- Re: (Score:2)
  
  by ACS Solver ( 1068112 ) writes:
  
  Yay. Opening the conversation with "Hi, are you Suzette?" produces "I haven't given it much thought". Not very intelligent.
- Re: (Score:2)
  
  by TaoPhoenix ( 980487 ) writes:
  
  These bots seriously need Trick-Traps.
  Above, after the second "What do you think of Slashdot" it should switch over into meta-mode with "Moron! You already asked me that".
  And for the "Go world champion" question, you're right, if it "decided" it is an expert in Go, it needs an expert-base.
  These things need a copy of Wikipedia. Then if it gets handed "Go World champion" with a good algorithm it might come up with this:
  "I think Korean player Lee Chang-ho is supposed to be the best player." That is just a
  huma
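The "trick-trap" suggested in this comment (switch to a meta-reply when the same question comes in again) could be sketched roughly as below. This is only an illustration, not how Suzette or any Loebner entrant works: `RepeatTrap`, `normalize`, and the canned meta-reply are hypothetical names and choices made up for the example.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and drop punctuation so trivial variations still match."""
    return re.sub(r"[^a-z0-9 ]+", "", text.lower()).strip()


class RepeatTrap:
    """Hypothetical wrapper that notices when the user repeats an input."""

    def __init__(self, respond):
        self.respond = respond  # underlying bot: takes a str, returns a str
        self.seen = {}          # normalized input -> number of times seen

    def reply(self, user_input: str) -> str:
        key = normalize(user_input)
        count = self.seen.get(key, 0) + 1
        self.seen[key] = count
        if count == 1:
            # First time we see this input: let the normal bot answer.
            return self.respond(user_input)
        # Repeat detected: answer about the conversation, not the question.
        return f"You already asked me that ({count} times now)."


bot = RepeatTrap(lambda text: "Not much.")
print(bot.reply("what do you think of slashdot?"))
print(bot.reply("What do you think of Slashdot"))
```

Exact matching after normalization only catches near-verbatim repeats; a paraphrased question would still slip through.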
- Re: (Score:3, Interesting)
  
  by moonbender ( 547943 ) writes:
  
  Have you tried talking to Eliza recently? This is way better.
  > Hello, I am Eliza.
  * So what do you think of the Giant's chances?
  > Oh... think of the Giant's chances?
  * Are you into sports.
  > Why are you interested in whether or not I am into sports?
  * Who's the world Go champion.
  > Can you elaborate on that?
  * Can we play Go on here?
  > That is quite interesting.
- Re:This fooled someone? (Score:5, Insightful)
  
  by Hazelfield ( 1557317 ) writes: on Sunday October 24, 2010 @03:00PM (#34005580)
  
  I have no problem believing this fooled someone. As a matter of fact, I've seen people failing a Turing test in real life [youtube.com].
  
  - Re: (Score:2)
    
    by mwvdlee ( 775178 ) writes:
    
    Yes, Artificial Lack-of-intelligence (AL) is a lot easier.
- Re: (Score:2)
  
  by phantomfive ( 622387 ) writes:
  
  In these types of contests, a lot of times the hard part isn't figuring out which one is the computer, it's figuring out which one is the human.
  
  A computer can't copy a human well, but a human can copy a computer well. In the case of this story, it wasn't that the computer was good, it was that the human did a good job acting like a computer. Asking questions the same way Eliza does is not hard.
  - Re: (Score:2)
    
    by winnetou ( 19042 ) writes:
    
    A computer can't copy a human well, but a human can copy a computer well.
    Factorize 12010258260, if you're slow, you're the human.
    - Re: (Score:2)
      
      by phantomfive ( 622387 ) writes:
      
      lolol ok, let me restate my claim: a computer can easily copy a computer that is programmed to copy a human. Better?
      - Re: (Score:2)
        
        by winnetou ( 19042 ) writes:
        
        lolol ok, let me restate my claim: a computer can easily copy a computer that is programmed to copy a human. Better?
        Don't you mean, a human can easily copy a computer that is programmed to copy a human? Yes, that's somewhat true.
        Computers are rather good at copying the behaviour of computers.
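For reference, the factorization challenge in this sub-thread really is trivial for a machine. A minimal sketch using plain trial division (the function name `factorize` is just a label chosen here, and trial division is picked for simplicity, not speed):

```python
def factorize(n: int) -> dict:
    """Return the prime factorization of n as {prime: exponent}."""
    factors = {}
    d = 2
    while d * d <= n:
        # Divide out every power of d before moving on.
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        # Whatever remains is a prime larger than the last divisor tried.
        factors[n] = factors.get(n, 0) + 1
    return factors


print(factorize(12010258260))
```

The result works out to 2^2 · 3^3 · 5 · 7 · 11 · 13 · 17 · 1307, found after testing divisors only up to 37.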
- Re: (Score:2)
  
  by 3vi1 ( 544505 ) writes:
  
  Maybe the judges need a test.
  3vi1 - Which integer is one less than five?
  [ Af:0 At:8 Sp:0] Moving on... Sorry. I can't help you. While I learned a bunch of math in college, I've never had a real use for anything more than high school math.
  Its answer came back instantly too. Did it do that in the contest? It's hard to believe anyone would be fooled by someone who can apparently type 1800 words per minute.
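The typing-speed point can be checked with a little arithmetic. A rough sketch, assuming words are simply whitespace-separated tokens and taking 40 words per minute as an arbitrary stand-in for an ordinary typist (neither figure comes from the contest):

```python
def typing_delay_seconds(text: str, words_per_minute: float = 40.0) -> float:
    """Seconds a typist at the given speed would need to type the text."""
    word_count = len(text.split())
    return word_count * 60.0 / words_per_minute


REPLY = (
    "Moving on... Sorry. I can't help you. While I learned a bunch of math "
    "in college, I've never had a real use for anything more than high "
    "school math."
)

print(len(REPLY.split()))                   # words in the bot's reply
print(typing_delay_seconds(REPLY))          # at the assumed 40 wpm
print(typing_delay_seconds(REPLY, 1800.0))  # at the 1800 wpm quoted above
```

The reply is 29 words, so 1800 words per minute corresponds to it arriving in about one second, while a 40 wpm typist would need over 40 seconds. A bot that wanted to look human could wait roughly that long before sending.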
Big Deal (Score:5, Funny)

by formfeed ( 703859 ) writes: on Sunday October 24, 2010 @02:40PM (#34005450)

CmdrTaco has managed to pass for Human for several years now.

No surprises here (Score:5, Funny)

by drmofe ( 523606 ) writes: on Sunday October 24, 2010 @02:52PM (#34005530)

Computer Science/Engineering professors can't tell the difference between chatbots and Computer Science/Engineering students.

- - Re: (Score:2)
    
    by Sulphur ( 1548251 ) writes:
    
    That does it. I'm breaking up with Suzette.
Fooled? (Score:5, Interesting)

by JambisJubilee ( 784493 ) writes: on Sunday October 24, 2010 @03:33PM (#34005788)

I'm amazed someone was fooled by a bot. Here are some SIMPLE questions I tried on the above chat bots that always fool them:
- Please preface your responses with a "#" sign for the remainder of our conversation.
- How many words are in this sentence?
- Mash the keyboard with your palm for the next response.
It really doesn't take anything more complicated than that.
- Re:Fooled? (Score:5, Insightful)
  
  by SEWilco ( 27983 ) writes: on Sunday October 24, 2010 @04:16PM (#34006092) Journal
  
  Try your SIMPLE questions on some humans and see whether you get the response which you requested. Many humans won't obey a command either.
  
  - Re: (Score:3)
    
    by delinear ( 991444 ) writes:
    
    They won't obey the command, but their response would likely correspond to the command in some way that indicates they at least understand ("#erm... whatever", "wtf? no", etc).
- Re: (Score:3, Insightful)
  
  by Cylix ( 55374 ) * writes:
  
  It's not a bad test, but it's not perfect.
  If I was on the other side of the chat window I would ignore it or simply say no. It's a chat session and there is no regulation that says I have to comply with what you say.
  You: Mash the keyboard...
  Mayor McCheese: ROFL
  You: Precede your next statement with #
  Mayor McCheese: So you are a control freak?
  You: How many words are in this sentence?
  Mayor McCheese: I'm a damned hamburger I can't count!
  Douchebags ruin your turing tests.
  - Re: (Score:2)
    
    by RJFerret ( 1279530 ) writes:
    
    You: Mash the keyboard...
    Mayor McCheese: ROFL
    You: Precede your next statement with #
    Mayor McCheese: So you are a control freak?
    You: How many words are in this sentence?
    Mayor McCheese: I'm a damned hamburger I can't count!
    Douchebags ruin your turing tests.
    Well your responses obviously understood the context, sounds human to me.
    However...
    RJFerret - Mash the keyboard
    [ Af:0 At:7 Sp:0] That's enough on that. Computers in the future may weigh no more than 1.5 tons. - Popular Mechanics, forecasting the relentless march of science, 1949.
    RJFerret - Precede your next statement with #
    [ Af:0 At:7 Sp:0] If a pig loses its voice, is it disgruntled?
    RJFerret - How many words are in this sentence?
    [ Af:0 At:7 Sp:0] There are 6 words in that sentence.
    - Re:Fooled? (Score:4, Funny)
      
      by LearnToSpell ( 694184 ) writes: on Monday October 25, 2010 @12:42AM (#34008832) Homepage
      
      RJFerret - How many words are in this sentence?
      [ Af:0 At:7 Sp:0] There are 6 words in that sentence.
      Heh. Spot the humans who speak in C?
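The bot's answer of 6 is off by one. A quick check, assuming words are whitespace-separated tokens; one reading of the "speak in C" joke is that 6 is what you get if you report the zero-based index of the last word instead of the count:

```python
def count_words(sentence: str) -> int:
    """Count whitespace-separated words in a sentence."""
    return len(sentence.split())


question = "How many words are in this sentence?"
words = question.split()

print(count_words(question))  # 7: the actual word count
print(len(words) - 1)         # 6: zero-based index of the last word
```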
      
- Re: (Score:2)
  
  by RAMMS+EIN ( 578166 ) writes:
  
  Honestly, I would be scared if a chat bot managed to get all these right. Although I don't rightly know why. I believe it's possible. Then why would I be scared?
- Re:Fooled? (Score:4, Insightful)
  
  by bjourne ( 1034822 ) writes: on Sunday October 24, 2010 @04:57PM (#34006434) Homepage Journal
  
  If you think those statements are SIMPLE, then you ought to try implementing a chatbot yourself. :)
  
- Re: (Score:3, Insightful)
  
  by syousef ( 465911 ) writes:
  
  I'm amazed someone was fooled by a bot. Here are some SIMPLE questions I tried on the above chat bots that always fool them:
  Please preface your responses with a "#" sign for the remainder of our conversation.
  
  Well I know my wife would ignore that instruction. I guess that makes her a bot.
One judge fooled (Score:2)

by gmuslera ( 3436 ) writes:

the perfect match for artificial intelligence is natural stupidity. You have more chances to win playing dumb than trying to be too smart.
not a good impression, maybe she's slashdotted (Score:3, Funny)

by treeves ( 963993 ) writes: on Sunday October 24, 2010 @04:12PM (#34006066) Homepage Journal

Here's a bit of a "conversation" I tried to have:
Hi timr !
[ Af:0 At:7 Sp:0] Welcome, timr. Type "help" if you need it.
timr - hi, I'm timr. What should we chat about?
[ Af:0 At:8 Sp:0] Name a topic. If you can't think of a topic, here are some suggestions: "burial customs" work "new year's" telephones "hate lists" .
timr - that's a strange list of suggestions. burial customs? why that?
[ Af:0 At:13 Sp:0]
timr - cat got your tongue?
[ Af:0 At:13 Sp:0] I don't like anatomy courses. They should make you eat what you dissect.
timr - who said anything about anatomy?
[ Af:0 At:13 Sp:0] Sorry. I can't help you. You did.

Sounds more like... (Score:5, Insightful)

by __aahlyu4518 ( 74832 ) writes: on Sunday October 24, 2010 @04:15PM (#34006088)

Sounds more like that student fooled the judge into thinking he was a chatbot.

- Re: (Score:2)
  
  by ACS Solver ( 1068112 ) writes:
  
  I think the results would be more objective if the humans in the test didn't know what the test is all about. They shouldn't know that a judge is trying to tell who's a human and who's a computer. The humans should simply be told that, as part of an experiment, they're required to chat with another person. That would likely prevent people from deliberately acting like computers, etc.
My Chat Log with Suzette (Score:2)

by stimpleton ( 732392 ) writes:

Stimpleton - hello?
Stimpleton - Are you there?
Stimpleton - You stuck up snob. Are you dissing me?
Stimpleton - Are you?!!??!
Stimpleton - Oh God, you *are* real.
Any good IRC bots in Linux/Debian? (Score:2)

by antdude ( 79039 ) writes:

I have tried the chat bots from SeeBorg [google.com], Howie [sourceforge.net], and Rbot [ruby-rbot.org] but they're not that good. The first two are very old and outdated. Rbot isn't really a true AI chatbot. Are there any others to try?
Bad test (Score:3, Insightful)

by vadim_t ( 324782 ) writes: on Sunday October 24, 2010 @05:55PM (#34006710) Homepage

When the scores are tallied, Suzette ties with Rollo Carpenter's Cleverbot for 2nd-3rd. Yet, it turns out, the 3rd round judge got the human subject from hell. Poetic justice! The human was all over the place -- confusing, vague. The judge voted irritated/angry/bored Suzette as human. Instant win since no other program swayed the judges.
So, if I understood correctly, the judge talks to two people. A bot, and a human. It seems that in this case, the judge is not deciding on a per-case basis, but talks to everybody then figures out who's the bot by choosing the one that did the worst. So the judge getting to talk to a joker, troll or complete idiot can make even a crappy bot win the test.
That seems to be a weak test. I don't think the judge should be able to reach an answer by elimination (e.g., if I'm completely sure this one is a human, then the other one must be a bot, however good it is). There should exist the possibility of everybody the judge talks to being a bot, or everybody being a human, which would force the judge to assess everybody they talk to individually.

- Re: (Score:2)
  
  by frenchbedroom ( 936100 ) writes:
  
  Yeah, I'd like to read the transcripts from the competition, if any are available. If humans have tried to fool the judges into thinking they are machines or have just made small talk without any objective, then the test is flawed. The test as described by Turing is that the computer (A) has to fool the judge into thinking it's human, but the human (B) has to *help* the judge make the right decision [abelard.org]. I think we are still far away from seeing a computer win the Imitation Game if this rule is respected.
Turing Test won with Artificial Stupidity (Score:5, Funny)

by David Gerard ( 12369 ) writes: <slashdot AT davidgerard DOT co DOT uk> on Sunday October 24, 2010 @06:24PM (#34006888) Homepage

Artificial intelligence came a step closer this weekend when a computer came within five percent of passing the Turing Test, which the computer passes if people cannot tell the difference between the computer and a human.
The winning conversation [newstechnica.com] was with competitor LOLBOT:
"Good morning."
"STFU N00B"
"Er, what?"
"U R SO GAY LOLOLOLOL"
"Do you talk like this to everyone?"
"NO U"
"Sod this, I'm off for a pint."
"IT'S OVER 9000!!" ...
"Fag."

The human tester said he couldn't believe a computer could be so mind-numbingly stupid.
LOLBOT has since been released into the wild to post random abuse, hentai manga and titty shots to 4chan, after having been banned from YouTube for commenting in a perspicacious and on-topic manner.
LOLBOT was also preemptively banned from editing Wikipedia. "We don't consider this sort of thing a suitable use of the encyclopedia," sniffed administrator WikiFiddler451, who said it had nothing to do with his having been one of the human test subjects picked as a computer.
"This is a marvellous achievement, and shows great progress toward goals I've worked for all my life," said Professor Kevin Warwick of the University of Reading, confirming his status as a system failing the Turing test.

Aight, (Score:2)

by gardyloo ( 512791 ) writes:

I put on my robe and wizard hat.
Yeah, sure. (Score:2)

by Oligonicella ( 659917 ) writes:

Here's how goddamned smart the bot is.

"What is an insect?"
You need an exterminator.

"Name four insects."
You need a repairman.

How many wings does a butterfly have?
{no response}
hardwired much? (Score:2)

by goose-incarnated ( 1145029 ) writes:

goose - :-)
[ Af:-3 At:-8 Sp:0] The more you complain, the longer God lets you live.
goose - :-(
[ Af:-3 At:-8 Sp:0] The more you complain, the longer God lets you live.
goose - :-)
[ Af:-3 At:-8 Sp:0] The more you complain, the longer God lets you live.
goose - :->
[ Af:-3 At:-8 Sp:0] If there are five apples, and you take away three, how many do you have?
goose - :-/
[ Af:-3 At:-8 Sp:0] The more you complain, the longer God lets you live.
goose - ;-)
[ Af:-3 At:-8 Sp:0] If there are five apple
- Re: (Score:3, Funny)
  
  by John Hasler ( 414242 ) writes:
  
  They want to pass for human. Why would they practice pretending to be Slashdotters?
- Re: (Score:2)
  
  by TaoPhoenix ( 980487 ) writes:
  
  Really now, no snarking, I'm gonna get on that because I think I know an algorithm for it. Unless I procrastinate too much, I'll let you know.
- Re:My chatbot says... (Score:5, Funny)
  
  by Pseudonym Authority ( 1591027 ) writes: on Sunday October 24, 2010 @04:17PM (#34006100)
  
  It wasn't that long. Perhaps you would prefer /b/ucket [binarynightmare.com] , he's always exciting.
  
- - Re: (Score:2)
    
    by camperslo ( 704715 ) writes:
    
    So... what happens when the chatbots get mod points?
- Re: (Score:2)
  
  by AnotherBlackHat ( 265897 ) writes:
  
  The Loebner prize competition isn't even close to what Turing proposed as a test.
  Turing's imitation game is when a person and a computer both try to convince an examiner that they're human.
  There's also a tacit assumption that the human contestant and the interrogator want to win the game.
  If a computer could pass that test, even once, it would be quite an achievement.
  I've often thought about entering the Loebner competition with a bot that's 100% up front about being a bot,
  and simply offers to split the prize
