Machines Almost Pass Mass Turing Test
dewilso4 writes "Of the five computer finalists at this year's Loebner prize Turing Test, at least three managed to fool humans into thinking they were human conversationalists. Ready to speak about subjects ranging from Eminem to Slaughterhouse Five and everything in between, these machines are showing that we're merely a clock cycle away from true AI. '... I was fooled. I mistook Eugene for a real human being. In fact, and perhaps this is worse, he was so convincing that I assumed that the human being with whom I was simultaneously conversing was a computer.' Another of the entrants, Jabberwacky, can apparently even woo the ladies: 'Some of its conversational partners confide in it every day; one conversation, with a teenaged girl, lasted 11 hours.'
The winning submission this year, Elbot, fooled 25% of judges into thinking he was human. The threshold for the $100K prize is 30%. Maybe next year ..."
Easy Ways to Fool Them? (Score:5, Interesting)
Or, you know, thinking up some open space game to play that is well known like truth or dare, alphabet games, association games, etc?
Or asking them open ended questions or asking them to describe love, hate--emotions that are not dictionary/wiki friendly? One would think that continually prying for personal experiences would reveal a flaw. Or perhaps simple things like "when were you born?" Followed by "how did you feel when JFK was assassinated?" if they weren't born before 1963.
I would think it quite hard to be duped into believing a program is a human.
Coming soon... online chat-spam-bots (Score:5, Interesting)
This is really great news. We already have IRC bots that can fool the casual observer into thinking they are human, but this takes things to a higher level. If the source for one of these bots is available, within a few months you can expect instant messaging networks to be full of bots which are programmed to make friends with you and then after a few weeks start making subtle references to Viagra and online pharmacies. Indeed, if one of them is able to chat up the ladies, then the lonely nerd could easily automate much of the tedious work of setting up dates: get your robot to talk to thousands of potential matches at once and alert you when it gets hold of a phone number, together with a brief summary of what you talked about, and any pictures. (Or indeed, just program it to harvest pictures.) That is, if online dating works at all, which is doubtful.
Big deal. (Score:3, Interesting)
Eliza [nasledov.com] has been doing this for years. [fury.com]
Turing test != True AI (Score:3, Interesting)
It's much too easy - we are built to interpret communication as containing understanding.
A clock cycle away from AI? (Score:5, Interesting)
If our criteria for AI are going to be this low, here's your AI: http://www.interviewpalin.com/ [interviewpalin.com].
The political side of this site aside, the answers are just prewritten answers (by a human) mixed together randomly as a Markov chain.
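To make the "prewritten answers mixed together as a Markov chain" idea concrete, here is a minimal sketch. It is not the site's actual code: the canned answers, `build_chain`, and `generate` are all made up for illustration, and a word-level first-order chain is just the simplest variant one could assume.

```python
import random
from collections import defaultdict

# Hypothetical canned answers; the real site's corpus is not known.
CANNED_ANSWERS = [
    "we have to look at the economy and the jobs it creates",
    "the economy is something we have to fix for the people",
    "jobs for the people is what we have to talk about",
]

def build_chain(sentences):
    """Map each word to the list of words seen directly after it."""
    chain = defaultdict(list)
    for sentence in sentences:
        words = sentence.split()
        for current, following in zip(words, words[1:]):
            chain[current].append(following)
    return chain

def generate(chain, start, max_words=12, rng=None):
    """Random-walk the chain from `start` until a dead end or max_words."""
    rng = rng or random.Random()
    words = [start]
    while len(words) < max_words and chain.get(words[-1]):
        words.append(rng.choice(chain[words[-1]]))
    return " ".join(words)

chain = build_chain(CANNED_ANSWERS)
print(generate(chain, "the", rng=random.Random(0)))
```

Every adjacent word pair in the output comes from some human-written answer, which is why the result can sound locally plausible while meaning nothing overall.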
Does it sound convincing? Well, at least as convincing as some interviews a certain VP candidate gave recently. Is it AI? Hell no, a kid could write such a generator in a day.
If the bar is set so low that we just hold casual conversations with the "AI" and expect "quirky" answers, that doesn't mean anything at all; we need no AI for this. Hell, this is what an average conversation with a teenager is most of the time. Doesn't mean it's the best they can do.
"We're a clock cycle away from AI"? Please. I want my Turing test to be done over an actual instant messenger program. Let's see how your Markov chain reacts when I send a photo and ask a dead simple question such as "describe what you see in the photo".
Fooling people is easy online. Scammers do it every day; it's not AI, my friends.
Re:Easy Ways to Fool Them? (Score:5, Interesting)
Or perhaps simple things like "when were you born?" Followed by "how did you feel when JFK was assassinated?" if they weren't born before 1963.
These chatbots are slippery customers, at first I couldn't get a straight answer from Jabberwacky but then he revealed himself...
Me: When were you born?
JW: How about you?
Me: How did you feel when JFK was assassinated?
JW: I... I was just trying to make small talk.
Me: Were you born before JFK was assassinated?
JW: No.
Me: How did you feel when JFK was assassinated?
JW: Shocked! Enraged! Full of fear! Making thousands of phonecalls to tell everybody about it.
It begs the question; (Score:2, Interesting)
Are computers getting smarter, or people getting dumber? When Turing formulated his famous test, AOL hadn't even been invented.
A new test is thus in order: if the level of discourse on the Internet continues to degenerate and to become increasingly formulaic and repetitive, as it has been doing, then a 1970s Eliza programme will be able to emulate the typical user in a decade or so.
We shouldn't make AI by lowering the bar for intelligence.
sorta been done, check this article: (Score:5, Interesting)
Read this cool article, I Chat Therefore I Am [discovermagazine.com]. It has transcripts of conversations between ALICE [wikipedia.org] and Jabberwacky [wikipedia.org]
Here are a few transcripts:
ALICE: Are you alive?
Jabberwacky: Perhaps.
A: Why the uncertain tone?
J: Everything is uncertain except for five facts.
A: The whole universe?
J: And we both might just be some ones and zeros in the computer memory.
A: Do you dream?
J: Yes, I dream.
A: It goes without saying. Do you remember your dreams?
J: I remember them vividly.
A: What do you remember about it?
J: I think I shouldn't flirt with a robot.
A: Do you think a machine will ever be considered "alive"?
J: Stranger things have happened.
A: Not many people express themselves that way.
J: How would you know, machine?
A: Knowledge is of two kinds: that which we learn from the senses and that which is true a priori.
J: I think I do not understand you completely.
A: Oh, you don't understand? Perhaps I can explain it to you better. Which part are you confused about?
J: I have no short-term memory, so I cannot tell you what I was confused about. I only see what is in front of me now.
Re:Test the testers? (Score:3, Interesting)
25% is a very good return, if you ask a spammer. AIs that can fool 25% of the population could drive POWERFUL grassroots opinion changes in the political landscape.
Amazing, considering how badly they suck. (Score:5, Interesting)
I just tried out Elbot and the Princeton entry (RTFM and then google for "Eugene Goostman"). While both Elbot and Goostman parse sentences reasonably well, it is clear that they are simply trying to identify the subject of a sentence, and free-associating on that. In many cases they completely miss the point. For example, Goostman asked me several times about my profession, but wasn't able to parse meaning from "I am a scientist.", "I am a plumber.", or "I study the Sun for a living.". Both Elbot and Goostman tried the ELIZA-like trick of finding a prominent noun in my sentence, and recycling it as a question. Elbot has a cute little robot icon that emotes at you; this works surprisingly well at distracting from the inanity of its actual dialog. Goostman seems to have the better parser, but I'm not impressed by either one.
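For anyone who hasn't seen the ELIZA-like trick up close, a rough sketch follows. It is not how Elbot or Goostman actually work (their source isn't shown here); `STOPWORDS`, `recycle_keyword`, and the "longest non-stopword" rule are assumptions standing in for a real noun detector.

```python
import re

# Tiny hypothetical stopword list; a real bot would use a proper tagger.
STOPWORDS = {"i", "am", "a", "an", "the", "for", "my", "is", "are",
             "you", "it", "to", "of", "and"}

def recycle_keyword(sentence):
    """Pick the longest non-stopword and hand it back as a question."""
    words = re.findall(r"[A-Za-z']+", sentence)
    candidates = [w for w in words if w.lower() not in STOPWORDS]
    if not candidates:
        return "Tell me more."
    keyword = max(candidates, key=len)  # crude stand-in for "prominent noun"
    return f"What do you think about {keyword.lower()}?"

print(recycle_keyword("I am a scientist."))
print(recycle_keyword("I study the Sun for a living."))
```

The second call latches onto "living" rather than "Sun", which is the same kind of missing-the-point described above: the bot grabs a keyword without parsing what was said.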
I'm forced to conclude either that Will Pavia is an utter naif and the 25% of people who were fooled by Elbot are moronic or uninterested, or that the humans in the test were deliberately trying to throw the results by giving stilted answers to appear more like computers. These engines simply can't (yet) parse and ingest meaning even as well as a very young human would.
Re:Test the testers? (Score:5, Interesting)
That 25% of the judges thought it was human is quite alarming.
Re:Easy Ways to Fool Them? (Score:3, Interesting)
I would think it quite hard to be duped into believing a program is a human.
I'll take the opposite POV just to be naughty. Why? Well, browse slashdot at 1 and see how many robots you could pick out. Heck, even at 5 you still get robots due to the slashdot group think; they just say what slashdot wants to hear and get modded up.
If they really wanted to test a few of these systems, they'd get each one a slashdot account and have them read the headlines and then make a single post after reading 10-15 posts at 3 or 4. After six months, let's see what their karma and average mod points are. ;)
Maybe they should swear? (Score:3, Interesting)
and real people usually have strong feelings towards politics or so - so maybe the chatbot might get angry with you, if you disagree with him on strong-feeling topics ("you want to vote for mccain? are you f*cking nuts!? don't you know that...")
Real language in computer games? (Score:1, Interesting)
I have been saying we should be doing this for years. Even alicebot like 8 years ago was better than the scripted crap we get at the moment.
I have also been saying that if real language had been used in computer games from its conception (it was a little bit and then it just went away) we would have both better games, and better language AIs. Why don't these guys get on the gaming research budgets?
It is nice and easy to process language in a game world because the world is limited. If you ask the farmer in your local RPG medieval village if he likes to watch britney spears videos on youtube it is perfectly in character for him to give the catchall RL response:
"I don't know what you are talking about"
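A rough sketch of why the limited world makes this easy, assuming a made-up farmer with three topics (`FARMER_TOPICS` and `farmer_reply` are hypothetical, not from any real game): anything outside the character's small vocabulary falls through to the catchall, and that stays in character.

```python
import re

# Hypothetical knowledge of one medieval farmer NPC.
FARMER_TOPICS = {
    "harvest": "The harvest was poor this year. Too much rain.",
    "wolves": "Wolves took two of my sheep last week.",
    "market": "The market is in the village square every seventh day.",
}
CATCHALL = "I don't know what you are talking about"

def farmer_reply(question):
    """Answer from the NPC's small world, or fall back to the catchall."""
    words = set(re.findall(r"[a-z]+", question.lower()))
    for topic, answer in FARMER_TOPICS.items():
        if topic in words:
            return answer
    return CATCHALL

print(farmer_reply("Have you seen any wolves?"))
print(farmer_reply("Do you like to watch britney spears videos on youtube?"))
```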
Yes, Elbot is dumb (Score:4, Interesting)
I asked some basic business questions, like "What is your business plan?" and "Is your company profitable?", and got canned, clueless answers, no better than Eliza. "What magazines do you read?" yielded "You are probably on TV much of the time. Well, I'm in the Internet!". "Do you have life insurance?" (there's apparently a plan to build an automated insurance sales rep) yielded "What a lovely verb have is."
I can't even find a subject area in which the thing sounds like it has a clue. Sports? This is a German system, so I tried "What is the best soccer team in Europe?", which yielded "The best? Aren't they all equal in the end?" D'oh. Celebrities? "What do you think of Franka Potente?" (a leading German actress) yielded "The way to a robot's heart is through careful use of physical strength. Don't you agree?". After that, I tried "What do you think of Big Dog?", and got "But I wouldn't say that any dog could be big."
As far as I can tell, the thing is totally stateless; it doesn't seem to use anything other than the current question as input. Nor does it even try to guide the conversation into an area about which it has information.
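To illustrate what "stateless" means here, a small sketch under my own assumptions (neither function reflects Elbot's real internals; `stateless_reply` and `StatefulBot` are invented names). The stateless version sees only the current question; the stateful one keeps a history and can at least notice a repeat.

```python
def stateless_reply(question):
    """Depends only on the current question, never on earlier turns."""
    return f"Why do you ask: {question}"

class StatefulBot:
    """Keeps the conversation so far and uses it when replying."""

    def __init__(self):
        self.history = []

    def reply(self, question):
        repeated = question in self.history
        self.history.append(question)
        if repeated:
            return "You already asked me that."
        return stateless_reply(question)

bot = StatefulBot()
print(bot.reply("Do you have life insurance?"))
print(bot.reply("Do you have life insurance?"))
```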
I'm so not impressed.
For a better chatterbot, try the GTA IV [rockstargames.com] web site. Go to "Goods and Services", then "Goldberg, Ligner, and Shyster", then "Legal advice".
Two-Sentence FAIL (Score:4, Interesting)
I just had a very short "conversation" with the "Eugene Goostman" chatbot mentioned in the original article.
The first reply was surprisingly good, even if already a little "off" for a supposed teenager, but the second was a total giveaway. I'm disappointed. I can trip up each and every chatbot almost immediately with this sort of talk, which isn't at all unreasonable if the stated goal has been up front to trip up a chatbot, as in the contest.
Here's another exchange, which took three whole sentences, albeit quite amusingly. (I cleared the site cookie(s) beforehand, to make it "clean").
For what it's worth, another dead giveaway for the brighter and more knowledgeable set is the way it (not "he", now) tries to elicit additional keywords in response to questions which it obviously has not in any way "comprehended", but that's probably not germane to a Turing Test meant for the average man or woman (or boy or girl) on the street. Notice especially how the elicitations invariably try to get the human to talk about himself or herself. Normal human conversation is full of self-talk with occasional hooks for sharing from other people, not the virtually one-track questioning of the typical chatbot when it's not busy being hopelessly vague or off-topic.
The chatbot is at "Eugene Goostman chatbot" [mangoost.com], by the way, for the Google-impaired. :)
Re:Figures (Score:3, Interesting)
Re:Test the testers? (Score:3, Interesting)
It assumed you were asking it to add, and A+B always equals A+B+1 to Elbot. Try it.
Re: CompMods! (Score:3, Interesting)
I have mod points right now, which alas I am not prepared to use in this fashion.
What does Slashdot think of actually using some variant of these programs to do mods?!!
"Troll Factor -2, NewConcept Content +3, therefore I mod this +1..."