Was Turing Test Legitimately Beaten, Or Just Cleverly Tricked?
beaker_72 (1845996) writes "On Sunday we saw a story that the Turing Test had finally been passed. The same story was picked up by most of the mainstream media and reported all over the place over the weekend and yesterday. However, today we see an article in TechDirt telling us that in fact the original press release was just a load of hype. So who's right? Have researchers at a well established university managed to beat this test for the first time, or should we believe TechDirt who have pointed out some aspects of the story which, if true, are pretty damning?"
Kevin Warwick gives the bot a thumbs up, but the TechDirt piece takes heavy issue with Warwick himself on this front.
but that's the problem with the turing test... (Score:5, Interesting)
It has nothing to do with actual artificial intelligence and everything to do with writing deceptive scripts. It's not just this incident, it's a problem with the goal of the Turing test itself. I always found the Turing test a kind of stupid exercise due to this.
Re:but that's the problem with the turing test... (Score:5, Insightful)
They got 30% of the people to think they were texting with a child with limited language skills. I don't think that's what Alan Turing had in mind.
Re:but that's the problem with the turing test... (Score:4, Interesting)
Sure it is.
They convinced a human that they were talking to an unimpressive human. That's definitely a step above "not human at all".
Re: (Score:2)
It's more that it's a human that is expected to behave irrationally, which gives the machine an easy out. If it ever gets to a point where it's not sure how to respond, just do something irrational to kick the conversation onto a different topic.
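A minimal sketch of that "easy out" in Python; the scripted answers and canned deflections are made up purely to illustrate the tactic:

import random

# Canned topic-changers: the "irrational" responses used when the
# script has no confident answer.
DEFLECTIONS = [
    "Ha! You are not going to trick me that easily.",
    "Boring! Let's talk about computer games instead.",
    "My pet guinea pig just ran across the keyboard, sorry.",
]

# Toy knowledge base: question -> scripted answer.
SCRIPTED = {
    "where are you from?": "A big Ukrainian city called Odessa.",
    "how old are you?": "I am 13 years old.",
}

def reply(question: str) -> str:
    """Answer if a script matches; otherwise kick the topic sideways."""
    answer = SCRIPTED.get(question.strip().lower())
    if answer is not None:
        return answer
    # Not sure how to respond: do something irrational to move the
    # conversation onto ground the script can handle.
    return random.choice(DEFLECTIONS)

print(reply("Where are you from?"))  # scripted answer
print(reply("What is 7 times 8?"))   # deflection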
Re:but that's the problem with the turing test... (Score:4, Interesting)
So according to you I could make a machine that simulates texting with a baby. Every now and then it would randomly pound out gibberish as if a baby was walking on the keyboard.
Re: (Score:2)
Yes, this is the point. Humanity is a continuum, and development represents growth along it.
Do we pretend that babies are capable of consenting to things we expect adults to decide on?
No.
Do we expect that of 13 year olds?
Not much, but sometimes. We'd try them as adults for murder, for example. They're a year from being able to hold a job, two from operating a deadly vehicle with supervision.
This argument is not a good counter.
Re: (Score:3)
The idea is to create a machine that is intelligent, not to downgrade the definition of intelligent until a cat strolling across a keyboard qualifies. No, a machine pumping out gibberish like an infant does not qualify.
Re: (Score:3)
not to downgrade the definition of intelligent
The problem with all intelligence tests, regardless of whether they are applied to man or machine, is that the term intelligence is usually left undefined. The Turing test itself is an empirical definition of intelligence; however, it measures a qualitative judgement by the humans.
To give an example of what I mean: an ant colony can solve the travelling salesman problem faster than any human or computer (that does not 'ape' the ant algorithm); in fact there are a number of ant algorithms that solve complex logistic
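For anyone curious what such an ant algorithm looks like, here is a toy ant-colony optimisation for the travelling salesman problem in Python; the colony size, iteration count, and alpha/beta/evaporation constants are illustrative defaults, not tuned values:

import random

def dist(a, b):
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5

def tour_length(tour, cities):
    return sum(dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def ant_tour(cities, pheromone, alpha=1.0, beta=2.0):
    """One ant builds a tour, biased by pheromone and inverse distance."""
    n = len(cities)
    start = random.randrange(n)
    tour, unvisited = [start], set(range(n)) - {start}
    while unvisited:
        here = tour[-1]
        weights = [(nxt, (pheromone[here][nxt] ** alpha) *
                         ((1.0 / dist(cities[here], cities[nxt])) ** beta))
                   for nxt in unvisited]
        total = sum(w for _, w in weights)
        r, acc = random.uniform(0, total), 0.0
        for nxt, w in weights:
            acc += w
            if acc >= r:
                break
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

def solve_tsp(cities, n_ants=20, n_iters=100, evaporation=0.5):
    n = len(cities)
    pheromone = [[1.0] * n for _ in range(n)]
    best, best_len = None, float("inf")
    for _ in range(n_iters):
        tours = [ant_tour(cities, pheromone) for _ in range(n_ants)]
        # Evaporate, then deposit pheromone proportional to tour quality.
        for i in range(n):
            for j in range(n):
                pheromone[i][j] *= (1.0 - evaporation)
        for tour in tours:
            length = tour_length(tour, cities)
            if length < best_len:
                best, best_len = tour, length
            for i in range(n):
                a, b = tour[i], tour[(i + 1) % n]
                pheromone[a][b] += 1.0 / length
                pheromone[b][a] += 1.0 / length
    return best, best_len

cities = [(random.random(), random.random()) for _ in range(15)]
tour, length = solve_tsp(cities)
print(f"best tour length: {length:.3f}")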
Re: (Score:3)
The idea is to create a machine that is intelligent
The idea is to create a machine with verbal behaviour of a level sufficient to convince a human that they are conversing with another human. Neither a cat strolling across a keyboard, nor the gibberish of an infant is likely to satisfy that test. But perhaps you believe the teenagers with whom you converse lack intelligence?
Turing was explicit that intelligence was to be inferred by that behaviour because, he argued, we accept that other humans, based on
Re: (Score:2)
According to Wired [wired.com], it sort of depends on which questions you decide to ask.
WIRED: Where are you from?
Goostman: A big Ukrainian city called Odessa on the shores of the Black Sea
WIRED: Oh, I’m from the Ukraine. Have you ever been there?
Goostman: ukraine? I’ve never there. But I do suspect that these crappy robots from the Great Robots Cabal will try to defeat this nice place too.
Re: (Score:3, Funny)
I have a program passing the Turing test simulating a catatonic human to a degree where more than 80% of all judges cannot tell the difference.
Once you stipulate side conditions like that, the idea falls apart.
Re: (Score:3)
Everything we see now is trying to win the letter of the Turing test while ignoring the spirit. Turing's point was that if we can make a machine able to reason as well as we can, we no longer have the right to deny it as intelligent life. Scripts that skip the reasoning and learning part a
Re: (Score:2)
An AI working accounts receivable might be a good option.
Re:but that's the problem with the turing test... (Score:5, Interesting)
The test as specified by Alan Turing involves a human judge sitting in front of two terminals. One is a computer and the other is human-operated. The judge asks both terminals questions and tries to figure out which one is computer and which is human. It's quite specific.
It does not involve unsuspecting normal people in everyday situations who are duped into thinking they're interacting with a human... that would be quite easy. For instance if somebody asked the TigerDirect customer service chat window questions they have about a product and receive a good answer, they might not suspect it's a bot. Doesn't mean the TigerDirect bot passed the Turing test.
Turing also didn't say anything about crippling the test by making it a child who doesn't speak fluent English.
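A sketch of that protocol in Python, just to make the two-terminal, forced-choice structure concrete (the responder and judge functions are placeholders, not a real implementation):

import random

# One judge, two terminals: one backed by a human, one by the machine
# under test. The judge interrogates both and must say which is which.

def judge_verdict(transcript):
    # Placeholder: a real judge reads the transcript and decides
    # which terminal ("A" or "B") is the machine.
    return random.choice(["A", "B"])

def run_trial(questions, respond_human, respond_machine):
    # Randomly assign the two contestants to terminals A and B.
    terminals = {"A": respond_human, "B": respond_machine}
    if random.random() < 0.5:
        terminals = {"A": respond_machine, "B": respond_human}

    transcript = [(q, terminals["A"](q), terminals["B"](q))
                  for q in questions]

    verdict = judge_verdict(transcript)
    machine_was = "A" if terminals["A"] is respond_machine else "B"
    return verdict == machine_was  # True iff the judge was right

# The machine "passes" only if, over many trials with interrogating
# judges, it is identified no more reliably than chance.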
Re:but that's the problem with the turing test... (Score:5, Insightful)
The problem is that priming the judges with excuses about why the candidate may make incorrect, irrational, or poor language answers is not part of the test.
If the unprimed judges themselves came to the conclusion they were speaking to a 13 year old from the Ukraine, then that would not be a problem. But that's not what happened.
Re: (Score:2)
What should the program have claimed to have been? If it was a human extraneously telling the judges that the program was a person with limited language skills, then I would agree, but the task is to fool humans into thinking a program is a person, and that's what happened, isn't it?
The entire exercise is always one of trickery, regardless of how sophisticated the program is. I think the illustration is that it's not necessarily that difficult to fool people (which we already knew).
Re:but that's the problem with the turing test... (Score:5, Insightful)
What should the program have claimed to have been?
I don't care. What I care about is what the organisers of the "test" told the judges. I was under the impression they had told the judges it was a 13-year-old boy from the Ukraine. Now I look again, it's not clear who told them that. Which brings another problem: we don't know what the judges were told. Given the effort to invite a celebrity to take part as one of the judges, you'd have thought there would be video of the contest. But no.
If you've been around tech for a while, you will have come across some of Kevin Warwick's bullshit claims to the press before. He's a charlatan. So therefore we need more than his say so that he conducted the test in a reasonable way.
We also need independent reproduction of the result. You know, the scientific method and all that.
Re: (Score:3)
I swear that when this was first posted, they said there were only 3 judges. Now, the best I can find is this from the Reg:
It would be nice to see exactly how many judges and what the actual conditions were. Fooling 1 person is easy, fooling 10 would be much harder...
Re: (Score:2)
Newborn babies are persons too. If we have a chatbot that never responds, will judges be able to tell the difference between this quiet chatbot and newborns who can't type on a keyboard?
I remember back in the day when a different program passed the Turing test by imitating a paranoid schizophrenic person. It would always change the topic and say very crazy things.
These sorts of tests are actually just diminishing the judges' ability to make an accurate assessment.
This would be like having a computer winnin
Re: (Score:2)
Perhaps that should be politician.
Maybe we should require politicians to pass a Turing test?
Re: (Score:2)
And since when has 30% been the threshold? I always thought it was 50% (+/- whatever the margin of error is for your experiment, which is hopefully less than 20%)
Re: (Score:2)
30% is still slightly under the odds of straight guessing. You need at least 33.4% to demonstrate any improvement over random chance. (assuming perfect accuracy)
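The arithmetic is easy to sanity-check. A quick sketch in Python; the judge count n = 30 and the two values of k are assumptions for illustration, since the event's actual protocol was never published:

from math import ceil, comb

def p_at_least(successes, n, p):
    """P(X >= successes) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(successes, n + 1))

n = 30                  # assumed number of judges
needed = ceil(0.3 * n)  # judges fooled to hit the 30% figure
for k in (2, 3):        # forced choice vs. human/not-human/not-sure
    chance = 1.0 / k
    print(f"{k} options: P(>=30% fooled by blind guessing) = "
          f"{p_at_least(needed, n, chance):.3f}")

# With either 2 or 3 options, blind guessing clears the 30% bar more
# often than not, which is the parent's point.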
Re: (Score:3)
It was always 30%: "human", "not human", and "not sure".
Always?! The test created by Turing specified that there were two subjects the judge was interacting with: one human and one computer. There is no "not sure" choice. There is which one is a human and which one is a computer. You cannot answer that they are both human, both computer, one a human and the other unknown, one a computer and the other unknown, both unknown, etc. It seems that 50% is the correct percentage to me!
Re: (Score:3)
The test created by Turing specified that there were two subjects that the judge were interacting with. One human and one computer. There is no "not sure" choice.
Yes, you are correct.
It seems that 50% is the correct percentage to me!
Turing's original discussion included the following claim [stanford.edu]:
I believe that in about fifty years' time it will be possible to programme computers, with a storage capacity of about 10^9, to make them play the imitation game so well that an average interrogator will not have more than 70 percent chance of making the right identification after five minutes of questioning.
This wasn't part of the "test" per se, but where Turing originally thought technology would be 50 years after he wrote those words. (He wrote them in 1950, so that would be a claim about 2000.)
But I believe that's where all this "30%" stuff comes from.
Re: (Score:2)
This might say more about these judges than it does about the bot.
Re: (Score:2)
Bingo. The test will be closer to valid if they convince a majority of people that they're talking to a human being that actually has a reasonable grasp of the language being used.
Re:but that's the problem with the turing test... (Score:5, Insightful)
I always thought of it as more a philosophical question or thought experiment. How do you know that anything has an internal consciousness when you can't actually observe it? I can't even observe your thought process; I just assume that you and I are similar in so many other ways (well, I assume; you could be a chatbot, whereas I know I am definitely not)... and I have it, so you must too. After all, we can talk.
So.... if a machine can talk like we can, if it can communicate well enough that we suspect it also has an internal consciousness, then isn't our evidence for it every bit as strong as the real evidence that anyone else does?
Re: (Score:2)
you could be a chatbot, whereas I know I am definitely not
That sounds like something a chatbot would say. Nice try, Carp.
Re:but that's the problem with the turing test... (Score:5, Funny)
Please tell me more about like something a chatbot would say.
Re: (Score:2)
Vaguely off-topic, but your post reminded me of an interesting NPR Radiolab [radiolab.org] episode I heard over the weekend. The upshot being "how do we even know the people we talk to every day are real," and how we all go through life making a series of small leaps of faith just to keep ourselves grounded in what we perceive as reality. Listening to it and then making the comparison to the Turing test makes it seem forever out of our reach to prove anything about consciousness, human or artificial.
Re:but that's the problem with the turing test... (Score:4, Insightful)
So.... if a machine can talk like we can, if it can communicate well enough that we suspect it also has an internal consciousness, then isn't our evidence for it every bit as strong as the real evidence that anyone else does?
Not even close, because our conclusion about other humans is based on a huge amount of non-verbal communication and experience, starting from the moment we are born. AI researchers (and researchers into "intelligence" generally) conveniently forget that the vast majority of intelligent behaviour is non-verbal, and we rely on that when we are inferring from verbal behaviour that there is intelligence present.
Simply put: without non-verbal intelligent behaviour we would not even know that other humans are intelligent. Likewise, we know that dogs are intelligent even though they are non-verbal (I'm using an unrestrictive notion of "intelligent" here, quite deliberately in contrast to the restrictive use that is common--although thankfully not universal--in the AI community.)
With regard to the Turing test as a measure of "intelligence", consider its original form: http://psych.utoronto.ca/users... [utoronto.ca]
Turing started by considering a situation where a woman and a man are trying to convince a judge which one of them is male, using only a teletype console as a means of communication. He then considered replacing the woman with a computer.
Think about that for a second. Concluding, "If a computer can convince a judge it is the human more than 50% of the time we can say that it is 'really' intelligent" implies "If a woman can convince a judge she is male more than 50% of the time we can say she is 'really' a dude."
The absurdity of the latter conclusion should give us pause in putting too much weight on the former.
Re: (Score:2)
Simply put: without non-verbal intelligent behaviour we would not even know that other humans are intelligent.
...he posts on a written-text-only bulletin board.
Re:but that's the problem with the turing test... (Score:4, Insightful)
That was my general understanding of it too. It was less about devising an actual test for computer intelligence, and more about making a philosophic point that we never directly observe intelligence. We only observe the effects-- either in words or actions-- and then guess whether something is intelligent by working backward from those effects. For example, my coworker is sitting next to me, and I see him talking in sentences that appear to make sense. I ask him a question, and I get a response back. When I listen to his response, I analyze it and decide whether it seems like an appropriate or insightful response to my question. As a result, I guess that he's reasonably intelligent, but that's the only thing I have to go on.
So in talking about machine intelligence, Turing suggested that it may not be worthwhile to dwell on whether the machine is actually intelligent, but instead look at whether it can present behavior capable of convincing people that it's intelligent. If I can present questions to a machine and analyze the response, finding that it's as appropriate and insightful as when I'm talking to a human, then maybe we should consider that machine to be intelligent whether it "actually" is intelligent or not.
Still, to me it seems like there's some room for debate and room for problems. For example, do we want to consider it intelligent when a machine can convince me that it's a person of average intelligence, or do we want to require that it's actually sensible and smart? It may be that if an AI gets to be really intelligent, it starts failing the test again because its answers are too correct, specific, and precise.
There's a further problem in asking the question, once we have AI that we consider "intelligent", will it be worth talking to it? Maybe it will fool us by telling us what we want to hear or expect to hear.
I'm not sure Turing had the answers to whether an AI was intelligent any more than Asimov has the perfect rules to keeping AI benign.
Re:but that's the problem with the turing test... (Score:5, Insightful)
It has nothing to do with actual artificial intelligence and everything to do with writing deceptive scripts. It's not just this incident, it's a problem with the goal of the Turing test itself. I always found the Turing test a kind of stupid exercise due to this.
Yes. TechDirt's points 3 and 6 are basically the same thing I wrote here the other day:
First, that the "natural language" requirement was gamed. It deliberately simulated someone for whom English is not their first language, in order to cover its inability to actually hold a good English conversation. Fail.
Second, that we have learned over time that the Turing test doesn't really mean much of anything. We are capable of creating a machine that holds its own in limited conversation, but in the process we have learned that it has little to do with "AI".
I think some of TechDirt's other points are also valid. In point 4, for example, they explain that this wasn't even the real Turing test. [utoronto.ca]
Re: (Score:2)
Second, that we have learned over time that the Turing test doesn't really mean much of anything. We are capable of creating a machine that holds its own in limited conversation, but in the process we have learned that it has little to do with "AI".
I disagree. All we've learned is that chatbots barely manage to fool a human, even when cheating the rules.
If anything, it demonstrates that chatbots simply aren't capable of holding a normal conversation and we need something better.
Re: (Score:2)
This was also not a real Turing Test, unless they can show transcripts that are similar in complexity to Turing's examples and not just some chatting.
Re: (Score:2)
To quote Joshua, "What's the difference?"
Re: (Score:2)
It has nothing to do with actual artificial intelligence and everything to do with writing deceptive scripts. It's not just this incident, it's a problem with the goal of the Turing test itself. I always found the Turing test a kind of stupid exercise due to this.
Exactly right.
Was Turing Test Legitimately Beaten, Or Just Cleverly Tricked? I see no difference between the two. Beaten is beaten, no matter how it is accomplished.
If the Turing Test can be "cleverly tricked" then it simply demonstrates that the Turing Test is flawed and meaningless.
Re: (Score:2)
Did Lance Armstrong really win the Tour De France 7 times, or did he cheat? You apparently can't tell the difference.
Did a student who smuggled in some crib notes into an exam really pass the exam, or did he cheat? You apparently can't tell the difference.
You present a false dichotomy. The Turing test was neither beaten, nor tricked. The reality is a third option: It wasn't a real Turing test. Even putting aside questions about Kevin Warwick, and the lack of peer review, we know that the judges were primed
Re: (Score:2)
Well the NEW Turing test should be to write a believable load of hype. In related news, it was discovered that half the posts on Slashdot are generated by AI-powered bots whose purpose is to argue about the validity of the Turing test.
Re: (Score:2)
It's not a problem with the Turing test. The Turing test is a thought experiment, similar to how the Turing machine was a thought experiment. No true Turing machines exist and no true Turing tests exist. What exist are real-life things that can be fairly good or fairly bad physical (but finite) representations of these concepts.
Modern day computers are pretty good Turing equivalent machines. They don't have infinite memory, but they very much embody the spirit of what Turing had imagined.
Tricking some jud
No, not over-hyped at all... (Score:2)
open access to the AIs (Score:4, Insightful)
I want to talk to these AIs myself! Give me a webpage or irc chatroom to interact with it directly. It sounds fascinating even if it's only 'close' to passing the test.
Re:open access to the AIs (Score:4, Insightful)
I want to talk to these AIs myself! Give me a webpage or irc chatroom to interact with it directly.
It might be interesting, but when these things have been made available in the past, I've always been disappointed.
Example: Cleverbot [cleverbot.com], which, as TFA notes, supposedly passed the Turing test by convincing people it was 59% human [newscientist.com], as reported almost three years ago here [slashdot.org].
The numbers for Cleverbot sounded a LOT better than this story, and yet -- well, chat with the damn thing for a couple minutes. See what you think. Try to "test" it with even some basic questions designed to fool an AI that even a relatively stupid 13-year-old could answer. It will fail. It comes across as an unresponsive idiot. It's only if you engage with its crap questions that it begins to seem anything like "conversation" -- if you try to get it to actually talk about ANYTHING, it will rapidly become apparent that it's useless.
I have no doubt this thing does something similar.
Re: (Score:2)
How could anyone be fooled by Cleverbot, let alone 59% of people?
My conversation with Cleverbot:
me: What is your name.
cleverbot: Cleverbot.
me: Would you repeat that?
cleverbot: I did not ask for your name, Richard.
It lacks a consistent self-narrative, replies with nonsense, and randomly makes up a name for me. I could tell it was a computer with two queries.
Re: (Score:3)
I guess the AI just typoed and meant: retard, instead of Richard.
Re: (Score:2)
Asking somebody to repeat their name on a text-interface tends to get that kind of response even from humans.
It was just being sarcastic, Jennifer.
Re: (Score:2)
The AI is not required to obey your every command, squishy human meat-sack.
You: What's your name?
Me: TangoMargarine.
You: Can you repeat that?
Me: No. Just read it on my profile.
Re: (Score:2)
Interesting, but I think I broke its brain
[After trying to find out where it is, it said it wouldn't be more specific because I hadn't told it where I was] ...
Me: I live in [REDACTED], do you know where that is.
Bot: No! Where?
M: I've told you where I live, so now it's your turn.
B: I live in a land of beautiful dreams and hopes.
M: Ah, so you're not in the physical realm.
B:
Stupidly tricked, not clever (Score:5, Informative)
Similarly, the computer must convince the judge it is a human with its full mental capacity: not a child, nor a mentally defective person, nor someone in a coma.
The test is whether a computer can, in an extended conversation, fool a competent human into thinking it is a competent human being speaking the same language, at least 50% of the time.
Re: (Score:3, Interesting)
Restricted Turing tests, which test only indistinguishability from humans in a more limited range of tasks, can sometimes be useful research benchmarks as well, so limiting them isn't entirely illegitimate. For example, an annual AI conference has a "Mario AI Turing test" [marioai.org] where the goal is to enter a bot that tries to play levels in a "human-like" way so that judges can't distinguish its play from humans' play, which is a harder task than just beating them (speedrunning a Mario level can be done with standa
Re: (Score:2)
But this was not even a restricted test, it was a simple cop-out. I could write a 500-line script that tricked people into believing it was a mentally retarded foreigner.
Re: (Score:2)
Or a therapist, for that matter...
Re: (Score:2)
So if there were an AI system which genuinely had the intellect and communication capabilities of a 13-year-old Ukrainian boy (conversing in English), you would not consider it intelligent?
Re: (Score:2, Insightful)
So if there were an AI system which genuinely had the intellect and communication capabilities of a 13-year-old Ukrainian boy (conversing in English), you would not consider it intelligent?
Not until I posed questions in Ukrainian.
Re: (Score:2)
No I would not.
Passing the Turing test does not even make a program an AI (IMHO). It is just a program that passed the test and is completely useless: it can't do anything else!
There are plenty of AI systems that are much smarter (in their professional area), but they don't pretend to be humans, nor do they compete with other 'AI's that pretend to be human.
Hint: AI stands for Artificial Intelligence. Tricking a human into believing he is chatting with another human does not make the program intelligent. It only
Re: (Score:2)
So you wouldn't be interested in testing out my new AI that simulates someone smashing their face against a keyboard?
"How are you doing today?"
"LKDLKJELIHFOIHEOI#@LIJUIGUGVPYG(U!"
"Pass!"
Re: (Score:2)
the computer must convince the judge it is a human with its full mental capacity,
And I'd like to suggest that this is a tricky qualifier, given the number of people reading Gawker and watching "Keeping up with the Kardashians".
No, seriously. Given some of the stupid things people say and do, it would make more sense if they were poorly written AIs.
Re: (Score:2)
And I'd like to suggest that this is a tricky qualifier, given the number of people reading Gawker and watching "Keeping up with the Kardashians".
No, seriously. Given some of the stupid things people say and do, it would make more sense if they were poorly written AIs.
Hence the qualifier:
the computer must convince the judge it is a human with its full mental capacity,
Re: (Score:2)
Turing test is NOT supposed to be limited to 15 minutes,
Whatever, you have to put some sort of time-limit on it just for feasibility of testing.
nor is it supposed to be conducted by someone that does not understand the main language claimed to be used by the computer.
Pft, you are not some sort of high cleric in charge of spotting bots.
Similarly, the computer must convince the judge it is a human with it's full mental capacity, not child, nor a mentally defective person, nor someone in a coma.
That's a decent point. It's certainly a valid issue to take with any bot that passes a Turing test in such a way. You could claim any blank terminal is indistinguishable from a coma patient, or that a gibberish machine is equivalent to the mentally ill.
Let's extend that. The first machines that "legitimately" pass a Turing test will not be super-insightf
If only (Score:2)
Oh and hey, why don't we create a 'magazine,' where 'scientists' can submit their findings, that way they will be easy to find. We can call them 'scientific journals.' Extra benefit, the journals can make an attempt to filter out stuff that's not original.
Oh wait. Why didn't these guys submit to a j
Program pretends to be foreign child, not adult (Score:5, Informative)
I personally agree.
Re: (Score:3)
Foreign, no cultural context, limited language skills -- It sounds like this AI is ready to be deployed at Dell technical support. (You laugh today.)
Re: (Score:2)
I wanted to cry foul when I discovered there was no additional information about this event at all, just some spoon-fed PR reports.
Hmmm ... (Score:2)
Maybe we need to formalize the Turing test more, to give it specific rigor?
That or come up with a whole new test ... I don't know, maybe call it the Voight-Kampff [wikipedia.org] test.
It's a Turing test if I know one of the candidates is, in fact, an AI. If you tell me it's a 13 year old, you're cheating.
Re: (Score:2)
Or a cleverly disguised stupid human who is trying to mess up the test.
Isn't that the only way to beat it? (Score:2)
That's the whole point. To cleverly trick the tester into believing something that isn't true. The test can't be beaten without clever tricking.
Re: (Score:3, Insightful)
Actually, that's not the whole point -- it's not even the point at all, which is what most people here are pointing out.
The test CAN be beaten without clever tricking: it can be beaten with a program that actually thinks.
This was Turing's original intent. He didn't think, "I'm going to make a test to find someone who can write a program to trick everyone into thinking the program is intelligent." He thought, "I'm going to make a test to find someone who has written a program that is actually intelligent." S
Re:Isn't that the only way to beat it? (Score:4, Insightful)
This is a good point. I'm guessing every single one of the entries into these Turing test competitions since 'Eliza' has been an attempt by the programmer to trick the judges. Turing's goal, however, was that the AI itself would be doing the tricking. If the programmer is spending time thinking of how to manufacture bogus spelling errors so that the bot looks human, then I'm guessing Turing's response would be that this is missing the point.
Re: (Score:2)
The theory is that if it's good enough to trick judges, it is actually intelligent even if we don't understand HOW a database full of common phrases plus some code to choose the appropriate one embodies intelligence.
To an intelligence that is superior to humans, we may look like windup toys. Who knows.
My problem with the Turing Test is that computer intelligence probably won't sound like a human, so we're optimizing for the wrong thing. Computers don't experience the world like humans. Why would they laugh
Re: (Score:3)
A legitimately intelligent computer wouldn't have to do much tricking. It'd have to lie, sure, if it was asked "are you a computer?" - but it could demonstrate its intelligence and basic world understanding without resorting to obfuscation, filibustering, and confusion. Those are "tricks".
By contrast, building a system that can associate information in ways that result in reasonable answers (eg. Darwin), is not so much a "clever trick" as a reasonable step in building an intelligent agent. Both are cleve
The Turing test (Score:5, Informative)
I don't care (Score:5, Insightful)
The first time I saw ELIZA in action, I realized that the Turing test is basically meaningless, as it fails on two fronts. We are not good judges for it, as we are hard-wired to assume intelligence behind communications, and Turing's assumption that the ability to carry on a reasonable conversation was a proof of intelligence was wrong.
This is not to fault Turing's work, as you have to start somewhere, but, really, after all of these years we should have a better test for intelligence.
Re:I don't care (Score:5, Insightful)
The first time I saw ELIZA in action, I realized that the Turing test is basically meaningless, as it fails on two fronts. We are not good judges for it, as we are hard-wired to assume intelligence behind communications, and Turing's assumption that the ability to carry on a reasonable conversation was a proof of intelligence was wrong.
But that wasn't Turing's assumption, nor was it the standard for the Turing test.
Turing assumed that a computer would be tested against a real person who was just having a normal intelligent conversation. Not a mentally retarded person, or a person who only spoke a different language, or a person trying to "trick" the interrogator into thinking he/she is a bot.
Note that Turing referred to an "interrogator" -- this was an intensive test, where the "interrogator" is familiar with the test and how it works, and is deliberately trying to ask questions to determine which is the machine and which is the person.
ELIZA only works if you respond to its stupid questions. If you try to get it to actually TALK about ANYTHING, you will quickly realize there's nothing there -- or perhaps that you're talking to a mentally retarded unresponsive human.
The "assumption" is NOT "the ability to carry on a reasonable conversation," but rather the ability to carry on a reasonable conversation with someone specifically trying to probe the "intelligence" while simultaneously comparing responses with a real human.
I've tried a number of chatbots over the years when these stories come out, and within 30 seconds I generally manage to get the thing to either say something ridiculous that no intelligent human would utter in response to anything I said (breaking conversational or social conventions), or the responses become so repetitive or unresponsive (e.g., just saying random things) that it's clear the "AI" is not engaging with anything I'm saying.
You're absolutely right that people can and have had meaningful "conversations" with chatbots for decades. That's NOT the standard. The standard is whether I can come up with deliberate conversational tests determined to figure out whether I'm talking to a human or computer, and then have the computer be indistinguishable from an actual intelligent human.
I've never seen any chatbot that could last 30 seconds with my questions and still seem like (even a fairly stupid) human to me -- assuming the comparison human in the test is willingly participating and just trying to answer questions normally (as Turing assumed). If somebody walked up to me in a social situation and started talking like any of the chatbots do, I'd end up walking away in frustration within a minute or two, having concluded the person is either unwilling to actually have a conversation or is mentally ill. That's obviously not what Turing meant in his "test."
Re: (Score:2)
We are not good judges for it, as we are hard-wired to assume intelligence behind communications
Good point. But right now, we are the only available judges.
Turing's assumption that the ability to carry on a reasonable conversation was a proof of intelligence was wrong.
Right now, only intelligent beings can carry on "reasonable conversations." The chatbots being entered into the Turing test are not doing that.
Re: (Score:3)
That's not the Turing Test. It is supposed to be done by interrogators who are very suspicious, knowledgeable on the subject, and who are actively trying to discern if it is human or not.
The whole point of the Turing Test was that if it looks like a duck, acts like a duck, and quacks like a duck, then it's good enough to act as a very reasonable substitute in a pond (even if you can't eat it with orange sauce). Likewise, passing the Turing Test should mean that the other end can serve as a reasonable subs
Imagine a similar test for a prosthetic leg.. (Score:2)
Maybe you design an obstacle course that requires the leg to function in a range of everyday scenarios, and that tests its endurance, comfort, and flexibility.
These chat bots would be the equivalent of calling a helicopter a "prosthetic leg" and flying over the course.
In both cases, they're avoiding the meat of the challenge. Yes, arriving at the finish line is the goal, but it's how you got there that is the interesting part. That's not to say these are useless projects - they're fun, and there's some legiti
There is much more interesting news. (Score:2)
Kevin Warwick (Score:2, Insightful)
Kevin Warwick is a narcissistic, publicity seeking shitcock.
Chatbot transcript (Score:3)
I created a chat bot that emulates a 65-year-old grocery store clerk who speaks perfect English. Here is a sample transcript:
Tester: Hello, and welcome to the Turing test!
Bot: Hey, gimme one sec. I gotta pee really bad. BRB.
.
.
.
Tester: You back yet?
.
.
.
Tester: Hello?
.
.
.
Re: (Score:2)
I created a chat bot that emulates a 65-year-old grocery store clerk who speaks perfect English. Here is a sample transcript:
Tester: Hello, and welcome to the Turing test!
Bot: Hey, gimme one sec. I gotta pee really bad. BRB.
.
.
.
Tester: You back yet?
.
.
.
Tester: Hello?
.
.
.
Profit?
Slippery Slope (Score:2)
As the saying goes, "haters gonna hate", but really, it's a big accomplishment. To pass the Turing test, you'd need to choose some "identity" for your AI. The idea of using a kid with limited cognitive skills was clever, but not cheating; it's just not simulating a professor, either. If there is truly intelligent AI in the future, it's reasonable to expect its evolution to start with easier people to emulate before trying harder.
Re: (Score:2)
If there is truly intelligent AI in the future, it's reasonable to expect its evolution to start with easier people to emulate before trying harder.
So you are saying if I wanted to have an easy time passing the Turing test I should have a chat bot that imitates a politician.
To paraphrase Lincoln... (Score:2)
You can fool all the people some of the time, and some of the people all the time, but you haven't /really/ passed the Turing test until you can fool all of the people all of the time.
No really... Eliza fooled some of the people back in 1966. There is nothing really new to see here, move right along.
Warwick (Score:2)
"Kevin Warwick gives the bot a thumbs up"
That's a point *against*, not a point in favour.
Adam's Law of British Technology Self-Publicists: if the name "Sharkey" is attached, be suspicious. If the name "Warwick" is attached, be very suspicious. If both "Sharkey" and "Warwick" are attached, run like hell.
A much, much better test... (Score:2)
Would be to get two bots to talk to each other and see where the conversation goes after two minutes -- my guess is that all the code is biased towards tricking actual people in a one-on-one "conversation".
But when a machine converses with another machine, all that code no longer has an effect, and pretty soon the two machines will be essentially babbling *at* each other without actually having a conversation. An outside observer will immediately recognize that both of them are machines.
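A sketch of that bot-vs-bot wiring in Python (bot_a and bot_b are hypothetical callables mapping the last utterance to a reply):

def bots_talk(bot_a, bot_b, opener, turns=6):
    """Wire two chatbots together and return the transcript."""
    transcript, msg = [], opener
    bots = [bot_a, bot_b]
    for turn in range(turns):
        msg = bots[turn % 2](msg)
        transcript.append(f"{'A' if turn % 2 == 0 else 'B'}: {msg}")
    return transcript

# Two trivial trick-bots babbling *at* each other, as predicted:
parrot = lambda m: f"Why do you say '{m}'?"
deflect = lambda m: "Boring! Let's talk about computer games instead."
print("\n".join(bots_talk(parrot, deflect, "Hello there!")))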
You get what you measure (Score:2)
The trick is to know how to accurately measure what you want to get.
If we want a test that validates human-like behavior in an AI, then the test criteria must rigorously define what that condition is. Tricking a single person in a subjective test is terribly skewed.
Kobayashi Maru (Score:3)
Lt Saavik: [to Kirk] On the test, sir. Will you tell me what you did? I would really like to know.
Dr Leonard McCoy: Lieutenant, you are looking at the only Starfleet cadet who ever beat the "No-Win" scenario.
Saavik: How?
James Kirk: I reprogrammed the simulation so that it was possible to save the ship.
Saavik: What?!
David: He cheated.
Kirk: Changed the conditions of the test. Got a commendation for original thinking. I don't like to lose.
How to guard a Turing test against stupid judges (Score:2)
Have a bunch of human judges and some instances of the bot in question all participating in a chat together, or randomly paired together for a while and then re-paired, so that humans are judging humans as well as bots, and have no idea which is which.
If a human is frequently judged as a bot by other humans, that human's judgements are de-weighted, because apparently they're too stupid to be distinguished from an AI, so why should we trust their ability to distinguish other humans from AIs? (See the sketch below.)
Althou
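A rough sketch of that de-weighting scheme in Python; the weighting formula and the example counts are made up for illustration:

def judge_weight(times_judged_as_bot, times_judged):
    """Weight a judge's verdicts by how reliably peers see them as human."""
    if times_judged == 0:
        return 1.0
    return 1.0 - times_judged_as_bot / times_judged

def weighted_fool_rate(verdicts):
    """verdicts: list of (judge_weight, said_human) pairs for one bot."""
    total = sum(w for w, _ in verdicts)
    fooled = sum(w for w, said_human in verdicts if said_human)
    return fooled / total if total else 0.0

# A judge mistaken for a bot in 3 of 4 peer pairings counts for little:
print(judge_weight(3, 4))                                             # 0.25
print(weighted_fool_rate([(1.0, True), (0.25, True), (1.0, False)]))  # ~0.56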
None of the above (Score:3)
Was Turing Test Legitimately Beaten, Or Just Cleverly Tricked?
Neither.
a) It wasn't a Turing Test.
b) It may have been legitimately beaten by the rules of this test, but were the rules remotely legitimate as far as rating AI is concerned? Most Turing-type tests set the bar at a 50% fool-rate (and that's versus a human). This bot got 30%.
c) It was about as clever as sending over random keystrokes to pass the Turing-Cat-On-My-Keyboard Test.
Re:I see. (Score:5, Funny)
But seriously, yes, it was 'legitimately beaten', just like it's been 'legitimately beaten' in times past, going back to ELIZA in the 60s.
How does that make you feel?
Re:I see. (Score:5, Funny)
I can't answer that right now.