TuVox Voice Interface 120
pablos writes: "NYTimes has an article about TuVox, who set up Handspring and Activision with voice interfaces for tech support. Apparently they can do away with the annoying 'press # now' menus. I've used things like TellMe, which played an ad every time it didn't understand you, but I'm wondering if this sort of thing is starting to work anywhere. Anybody called Handspring for tech support lately?"
First ever slashdotting of a phone number? (Score:4, Funny)
Re:First ever slashdotting of a phone number? (Score:1)
Re:First ever slashdotting of a phone number? (Score:1)
Re:First ever slashdotting of a phone number? (Score:2)
ATT Has a similar service (Score:2, Informative)
AT&T has been doing this for a while (Score:3, Informative)
"For digital cable, press or say 1" etc.
A lot of times to avoid complicated and looping voicemail, I just don't press anything to fake like I have a rotary phone and get transferred to the first available agent.
Well, that trick is no more! Since even rotary phone users can say their choices, not doing or saying anything disconnects you. Pretty crafty.
- JoeShmoe
Re:AT&T has been doing this for a while (Score:1)
Trouble is, the time when you most need "hands-free" is when driving, but the background noise makes it difficult to use (unless you're in some fancy limo).
Now a HAL style lip reader
Re:AT&T has been doing this for a while (Score:3, Interesting)
And voice activated dialing - same person (this time at a club) tried to voice dial another friend - ended up calling his parents at 2:00am. They were not happy bunnies.
In the club this could be expected, but the pub was not too loud. The technology that Orange is using for Wildfire is just not up to scratch for normal use.
PS. There are some interesting 'features' in Wildfire (these phrases will not be exact, but play around with them): 'Do me a favour' gets the response 'What kind of favour?' you can then say 'I'm feeling depressed' which gets the response 'Why don't you tell someone who cares' or 'What does a cow say?' which gets the response 'MOOOOO!'
Re:AT&T has been doing this for a while (Score:2)
Re:AT&T has been doing this for a while (Score:2)
Simple solution.. just use gibberish. works for me :)
To sign up for new services, press or say 1 ... To find out about existing...
brr?
I'm sorry. I do not understand that command. Please try again. To sign up for new...
ack!
I'm sorry, I still can not understand your command. Please stay on the line and one of our operators will assist you.
Re:AT&T has been doing this for a while (Score:1)
Wouldn't that be a great job? Spending all day handling complaints / fixing problems for people who don't properly pronounce and articulate.
Sounds like my old tech support days. :-)
Re:AT&T has been doing this for a while (Score:1)
Systems right now do ok in noisy environments, but they're improving all of the time. Just think of where text-to-speech was a few years ago, and look where it is now... recognition will catch up.
Todd
Re:AT&T has been doing this for a while (Score:1)
Re:AT&T has been doing this for a while (Score:1)
Todd
Re:AT&T has been doing this for a while (Score:2)
Link for No Registration (Score:3, Informative)
Voice recognition puns (Score:3, Funny)
"Thank you for calling 999, which service do you require?"
FIRE!!
"[pause]Your request has been passed on. In order to optimise future use of this service, please repeat the following list of words in a steady voice: cat, dog, bar, sky, foo..."
Even in the future... (Score:2, Funny)
Sure it's working... (Score:2, Insightful)
>I'm wondering if this sort of thing is starting to work anywhere
Voice recognition works great in real world applications. Directory assistance in the city where I live uses voice recognition to find out what language you speak and the city for which you want a listing, and it can even do voice recognition on common businesses. (No doubt for a fee.) All without any operator intervention. It's pretty cool.
Had a good experience recently with one myself (Score:2, Informative)
Re:Had a good experience recently with one myself (Score:1)
Re:Sure it's working... (Score:2)
It can be done (Score:2, Interesting)
I would have to agree that the technology is getting closer to replacing human beings...maybe I should go check my retirement plan now
SJ's system (Score:5, Funny)
OtherHalf (in very clear voice): Stockholm
Computer : click, click,... Kiruna!
OtherHalf : Stockholm!
Computer : click, click,... Moscow!
OtherHalf : Stockholm!
Computer : click, click,... Alpha Centauri!
etc...
To be fair, it does eventually work, it just takes a while. It probably also takes less total time than the alternative (short conversation with a human, but a long wait to get to talk to them).
The best thing about them was a recent radio program. They had done some research to find out what words sound (to the system) like destinations. During the show they'd phone SJ up and say things like "I want to go to FsckingBastardVille", to which the computer would reply "Northern or central Stockholm?" and other such amusements.
Hours of fun
Re:SJ's system (Score:1)
Re:SJ's system (Score:1)
Re:SJ's system (Score:1)
Re:SJ's system (Score:1)
Cheaper solution (Score:2, Funny)
Re:Cheaper solution (Score:1)
Re:Cheaper solution (Score:2)
I think I laughed more at that one throwaway scene than at any other Rob Schneider movie
More technology - Less jobs (Score:2, Insightful)
I'm enough of a realist to understand that the evolution of swapping jobs with technology is unstoppable but still: With the current recession, that's not really a thing to be looking forward to.
Re:More technology - Less jobs (Score:1)
For example, it costs Sprint PCS and AT&T Wireless approximately 40/month to service an account, and half of that is support. In many cases, a lot of specific domain knowledge is lost because of the high turnover rate. Building knowledge bases to maximize the quality of support is a very difficult task. I know someone who built a natural language support system using knowledge base and expert system shells. It was far from trivial and isn't a foolproof solution. In most cases, the actual deployment doesn't lead to drastic cuts in support staff. Rather, it allows them to work more efficiently. Support calls usually come in herds and are totally overwhelming.
I'm sure there are people that are displaced by support systems, but from my knowledge, it's not as bad as one would guess.
InterVoice Brite product... (Score:2, Informative)
This isn't.. (Score:3, Interesting)
It's been a while since there was really much media hype about voice recognition technologies. Sure, the whole voice-activated menu thing ("1, 2 etc.") has been around for quite a few years, but I suppose there is a huge difference between repeating a few numbers and describing technical problems. I mean, is this literally a flowchart menu with various diagnostic paths, or does it actually try to understand a sentence? If it's the former, then that is nothing more advanced than what is currently available and probably in use elsewhere.
I wonder what would be more frustrating: repeating yourself twenty times to a computer to battle through a menu, or sitting for twenty minutes trying to explain your problem to an ex-Kmart 1st line support engineer. The choice is yours
Re:This isn't.. (Score:2, Interesting)
Somewhere in between, but more the former than the latter.
TuVox uses Nuance to perform the speech reco; I'm currently integrating Nuance into a new product for the division of IBM that originally developed Sprint's "Voice Command" voice dialing system.
In this sort of speaker-independent voice reco system, you provide a grammar of the utterances you expect the caller to say at each step of the way. For example (in JSGF format, which isn't what Nuance uses but is more BNF-like):
It's fancier than a menu, but it's far from free-form speech. For example, the above would understand "please destroy Redmond", but not "let's nuke New Jersey". Still, with clever grammars, you can do pretty well.
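To make the "grammar per step" idea concrete, here is a minimal Python sketch. It is not JSGF and not Nuance's API; the word lists and the `parse` helper are hypothetical, chosen only so that the two example phrases behave as described: anything outside the listed phrasings is simply not understood.

```python
import re

# Hypothetical grammar for one dialog step: [polite] <verb> <target>.
# These word lists are illustrative assumptions, not the poster's grammar.
POLITE = ["please", "kindly"]
VERBS = ["destroy", "delete", "reset"]
TARGETS = ["redmond", "my password", "the database"]


def alternatives(words):
    """Join literal phrases into a regex alternation."""
    return "|".join(re.escape(w) for w in words)


# Optional polite word, then exactly one verb and one target.
GRAMMAR = re.compile(
    rf"^(?:(?:{alternatives(POLITE)})\s+)?"
    rf"(?P<verb>{alternatives(VERBS)})\s+"
    rf"(?P<target>{alternatives(TARGETS)})$"
)


def parse(utterance):
    """Return (verb, target) if the utterance fits the grammar, else None."""
    text = re.sub(r"[^a-z' ]", "", utterance.lower())
    text = re.sub(r"\s+", " ", text).strip()
    match = GRAMMAR.match(text)
    return (match.group("verb"), match.group("target")) if match else None


print(parse("please destroy Redmond"))   # in the grammar
print(parse("let's nuke New Jersey"))    # not in the grammar
```

A real recognizer scores audio against the grammar instead of matching text, but the constraint is the same: the caller has to land on one of the phrasings somebody thought to write down.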
airlines and long distance (Score:2)
Re:airlines and long distance (Score:1)
Re:airlines and long distance (Score:1)
As I seem to recall, it was using IBM's ViaVoice and interfaced with the reservations system. Since it was a pilot product, it was only offered to specific WorldPerks members (about a thousand). Any time that the recognition failed, it kicked them out to our support lines for a manual reservation. Although, it did get better over time. Sitting at the call center alone at 2a gives you a pretty good feel for call volumes...
And now Sprint PCS is going the same route with their support, as well. There's Claire, your Personal Digital Assistant. All you need to do is press *2, *3, or *4 on a Sprint PCS phone to get into it. Pretty lame, really, as it just hinders the process of getting actual support. But their process is rather unique in that you can say, "I want Customer Service," and it will forward you to the real support queue. Rather ineffectual, if you ask me. For a demo, you can go to either a local Sprint PCS store or your local Radio Shack (they both have phones available for demo calls, and remember that long-distance is free
Heh... (Score:1, Interesting)
That's the good thing about a Handspring.. You have no need to call tech support
Dutch railways has the same (beta) (Score:2, Insightful)
It's nice to realize that they've made an attempt to recognize polite customers: words like "please" are ignored.
Re:Dutch railways has the same (beta) (Score:1)
I couldn't believe it when the article ended with
"We're actually getting people saying thank you -- to this robot," he said.
He didn't say whether his system answers this with "You're welcome!" The more that systems like these resemble normal human speech, the better. If I use pleases and thank yous, the system should be - I don't know - just a little bit nicer.
If the system could detect the tone of the caller, that would be even better. A caller who seems frustrated could be transferred to a human. An angry caller could have a different tree of prompts.
As hard as it must have been to talk to answering machines when they first appeared, it doesn't compare to talking with a computer. Ten years ago we had the technology to interact with our computers with voice, but most of us felt silly doing so in front of others. We still do, in certain cases. This is changing as more of us use these systems in public.
The social barriers to voice human-computer interactions seem to be larger than the technological ones.
On TuVox's director of worldwide customer relations calling his system "a robot": about 15 years ago I worked directory assistance for the local phone company. At this time they had the technology to let the computer read out the number once we had found and confirmed it. One woman, after confirming the address, added "And don't give me to the robot!" I think she actually pictured me passing the phone to a tin robot beside me. "[whirr] [click] The new number is 555-3111 [bzzzzzzzzt] [click] [whirr] [bzonk]"
yo
Leading Edge NLSR (Score:3, Informative)
For people interested in seeing how far NLSR (Natural Language Speech Recognition) can be pushed for specific applications, go and look at VeCommerce [vecommerce.com.au] and their demo clips. The betting system I helped build can take betting sentences of over 100 words with 96% accuracy. (Data from a live system with 1200 lines)
Customers HATE DTMF based systems, this sort of thing is the way of the future.
Customer Relations Management is all about... (Score:5, Informative)
CRM is *expensive*. Forrester Research did a study a while back on the average cost of handling customer calls by various means:
Telephone: $33.00/incident
Email: $9.99/incident
Chat: $7.80/incident
Message Boards: $4.57/incident
Knowledge Base: $1.17/incident
The technology of this article shifts a call from the top to the bottom of this list. They admit that the advance is not in AI or voice tech, but in making the experience "resemble a conversation". So at its best, this will still let grandma have *some* access to the information she could have had before from a live human. At its worst, it's a puppet show to distract us from the fact that we're not getting very good service.
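For a rough sense of what moving a call "from the top to the bottom of this list" is worth, here is a small Python sketch using the per-incident figures quoted above. The call volume and deflection rate in the usage lines are made-up assumptions for illustration, not numbers from Forrester or TuVox.

```python
# Per-incident costs as quoted in the post above (US dollars).
COST_PER_INCIDENT = {
    "Telephone": 33.00,
    "Email": 9.99,
    "Chat": 7.80,
    "Message Boards": 4.57,
    "Knowledge Base": 1.17,
}


def savings_per_incident(from_channel, to_channel):
    """Cost difference when one incident moves between channels."""
    return round(COST_PER_INCIDENT[from_channel] - COST_PER_INCIDENT[to_channel], 2)


def monthly_savings(incidents, deflected_fraction, from_channel, to_channel):
    """Savings if a fraction of monthly incidents is deflected to a cheaper channel."""
    deflected = incidents * deflected_fraction
    return round(deflected * savings_per_incident(from_channel, to_channel), 2)


# One call answered from the knowledge base instead of by a person.
print(savings_per_incident("Telephone", "Knowledge Base"))

# Hypothetical: 10,000 calls a month, 30% handled by the voice front end.
print(monthly_savings(10_000, 0.30, "Telephone", "Knowledge Base"))
```

On the quoted figures, each deflected call saves 33.00 - 1.17 = 31.83 dollars, which is why even a modest deflection rate looks attractive to whoever pays for the call center.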
First sad trek post (Score:1)
you: "I have a problem with my handspring treo"
Tuvox: "I fail to see your problem ensign. It would appear you have selected an inferior technology."
graspee
Re:First sad trek post (Score:1)
Tuvix: "That is not logical. The Handspring has no problem, therefore reasoning suggests that you have a problem, ensign. Please report to sick bay at once!"
Re:First sad trek post (Score:1)
Might need to save a king...(Simpsons ref) (Score:1, Offtopic)
[tries to walk on his leg, falls back] Uh, I'll save you by
calling the police. [dials 911]
Voice: Hello, and welcome to the Springfield Police Department Resc-u-
Fone[tm]. If you know the name of the felony being committed,
press one. To choose from a list of felonies, press two. If you
are being murdered or calling from a rotary phone, please stay on
the line.
Bart: [growls, punches some numbers]
Voice: You have selected regicide. If you know the name of the king or
queen being murdered, press one.
Thanks, SNPP
AI Vox Voice Interface (Score:1, Troll)
Ay, 'tis the grand convergence [virtualentity.com] of TuVox [tuvox.com] and Tellme Networks [tellme.com] and all these other speech technologies [sourceforge.net] leading us inexorably onwards towards the Technological Singularity. [caltech.edu]
Speech technology [scn.org] for Open Source Artificial Intelligence [sourceforge.net] is now at a critical point, because the free Open Source Robot AI Mind [scn.org] has become capable of immortally self-rejuvenating perpetual mentation [scn.org] and therefore any Linux maven with speech-tech know-how may vie for the distinction of hosting the longest-running Artificial Mind [sourceforge.net] and of equipping the AI with truly phonemic speech recognition and generation.
-1, please resection MLP (Score:1)
some airlines use this type of system (Score:3, Informative)
It then asked me to speak the destination city and the departure city, then asked for the claim number I got when I reported the bags, and it would let me know that they'd still not found my luggage.
This was 2 or 3 years ago and it worked pretty flawlessly, and I'm pretty sure the technology has come along since then too. There were times I had to repeat myself, but that's better than sitting on hold forever just to be told by the person on the other end, whose day, in their mind, is worse than yours, that you should stop worrying about it and get on with your life.
I wonder if it would understand "Bairisch"... (Score:1)
Regardless of whether TellMe or something like this would be able to understand German, I'm still preferring the keyboard which doesn't spam someone like me who speaks a rather strong Bavarian accent ("Scheissglump vareckts, sacklzement, hoit dei Mai und moch hi...") which is getting even worse if something annoys me
Odeon cinema (Score:2, Funny)
Sprint PCS (Score:1)
-Dan
unixpunx.org - punks, computers, technology
Good example in the UK ... (Score:2)
"Welcome to the Odeon Film Line! To pick the cinema you want just say the name!"
To which you do and, in my experience, it's got it right every single time. Including stuff like "Odeon Leicester Square", "Mezzanine", "Wimbledon" and "Manchester".
From what I understand they use software by Vocalis [speechtml.com].
Re:Good example in the UK ... (Score:1)
Re:Good example in the UK ... (Score:1)
Re:Good example in the UK ... (Score:1)
You may be interested to know that it will only ask you to confirm if either (a) you're calling from outside what is deemed the catchment area of the cinema, (b) if you're calling from a mobile or (c) if it wasn't confident on its interpretation of what you said.
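The three conditions above boil down to a simple decision rule. A minimal Python sketch follows; the `Call` fields and the 0.8 threshold are assumptions made up for illustration, not anything known about the actual Odeon/Vocalis system.

```python
from dataclasses import dataclass


@dataclass
class Call:
    in_catchment_area: bool  # caller's number is local to the recognised cinema
    from_mobile: bool        # caller is on a mobile phone
    confidence: float        # recogniser's score for its best guess, 0.0 to 1.0


# Assumed cut-off; a real system would tune this against live calls.
CONFIDENCE_THRESHOLD = 0.8


def needs_confirmation(call, threshold=CONFIDENCE_THRESHOLD):
    """Ask "did you say ...?" only when one of the three conditions holds."""
    return (
        not call.in_catchment_area
        or call.from_mobile
        or call.confidence < threshold
    )


# A confident local landline call goes straight through.
print(needs_confirmation(Call(True, False, 0.95)))
# A mobile caller is always asked to confirm.
print(needs_confirmation(Call(True, True, 0.95)))
```

The appeal of a rule like this is that the most common case (a local landline caller, clearly heard) never hears a confirmation prompt, which is probably a big part of why it feels like it gets things right every time.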
UPS has a great one (Score:2)
duane
(Note, I still don't like them. The package I was complaining about had been left in a puddle near my garage, and the guy wrote "delivered at front door" on the slip.)
Risks of voice recognition (Score:3, Interesting)
The moral of this story: make sure that there's a touch-tone menu to fall back on.
Re:Risks of voice recognition (Score:1)
Todd
Tevok and TellMe (Score:2)
By contrast, I once called SprintPCS and ended up on a similar system, but it was terrible. The VR was flakey, and it did not degrade gracefully when it didn't understand, leaving me disoriented. I confirmed from a friend at TellMe that SprintPCS used someone else.
I don't know anything about Tuvox, but I question whether they will have success against TellMe, which not only has good tech, but is very well backed. If they're betting on their "AI", they're probably dead as soon as people find out it sucks. If they're just trying to be a better TellMe, they have a challenge--but I hope they come out with a competing public service to get publicity!
Re:Tevok and TellMe (Score:1)
SprintPCS Voice Dial (Score:2, Interesting)
For those of you unfamiliar with the system it works like this.
User
SPCS
User
SPCS
User
Done
It's super convenient when you are in the car or running through an airport and don't have the time to look down at the phone. The reason I'm impressed with it is because you don't have to "train it" to your voice.
Re:SprintPCS Voice Dial (Score:2)
In my experience (as recent as yesterday), Claire is both hard of hearing and pretty darn stupid. The most frustrating thing is that the entire system is designed to prevent you from ever getting to a real person, so if Claire can't help you, you're SOL. I did notice that after about a half-dozen failed attempts to go through Claire, I got a different answer (I was dialing 611 to try to get them to do the PRL update they've been unable to do since August), this time asking for the last 4 digits of the SSN of the account holder. I'm not sure if this was programmed, or if Claire just happened to go off-line at that time.
I *really* hate companies that use IVR systems to *prevent* you from getting customer service. Sprint is probably the worst offender I've encountered at this, but then they have by far the worst customer service I've ever encountered in a service company. (Although Home Depot clearly takes the cake for worst customer service overall - they direct all complaints to "Ben Hill". Like Arlington Hewes [tpc.int], Mr. Hill does not exist - the phone is answered by a rotation of store assistant managers, who do not track or follow up on complaints. It's essentially a well-designed and deliberate bit-bucket for Home Depot's customer complaints. Once they have your money, they're not really interested in hearing if you have a problem - the height of poor customer service.)
Oops (Score:3, Funny)
Re:Oops (Score:1)
Seriously, I do wonder how much tech support they hand out... My visor installed flawlessly from step one. I've even bounced it off the pavement once. Still chirping at me!
The good and bad of current voice recognition (Score:2, Interesting)
They seem to be pretty much exclusively based on grammar files. Basically, you write out a grammar that lists all the possible things you think the person speaking would utter and then match them up to different branches in your system. Unfortunately, you can't easily take free-form speech and store it as anything other than a sound file. This makes it difficult to do something such as allow the user to speak a message to send as an e-mail. The VXML engines have a great deal of heuristics to handle differences in speech style and tone, but without the grammar, you pretty much need to go through voice profile training to get decent results.
If anyone knows of kewl advances in this particular area, I'd love to hear them!
www.1-800-555-TELL.com (Score:1)
What the.. (Score:1)
Open Source Voicecoders Needed (Score:3, Interesting)
SpeechWorks' OpenVXI [cmu.edu], originally promoted as an open source VXML interpreter, has turned out not to be a good one. Speechworks developers maintain the code, and refuse to incorporate the patches and requests of the open source community, in favor of keeping OpenVXI tied to Speechworks products. The codebase could be forked, but it's really not worth investing the effort in such a brittle product tied to proprietary solutions.
Bayonne [sourceforge.net], the GNU telephony server, is great and getting better all the time. It currently supports a strong scripting language for DTMF applications, and Bayonne's XML plugin structure and built-in support for multiple telephony cards makes it the logical choice for open source VXML.
All that's needed at this point is to finish integrating Bayonne with an open source Text-To-Speech engine (most-likely candidates are Flite [cmu.edu] or Festival [festvox.org]), Automatic Speech Recognition engine (in this case, Sphinx [cmu.edu]) and write the XML plugin. But there is a shortage of coders with the skill and time to do this.
I really think small business and the average Slashdotter could benefit from an open source VXML solution. Small businesses could create professional telephony apps that could make them much more competitive (from accepting credit cards securely over the phone to providing dedicated 24-hr support numbers for their products), while creative coders could use it for everything from Eliza-style chatbot answering machines to having your boxen call you up and describe a hack attempt as it's being made.
I'd love to see a VXML enabled Bayonne blow TellMe and others out of the water. If you're intrigued and you'd like to get involved, check out Bayonne's Sourceforge site [sourceforge.net] and sign up for the mailing list.
UPS has a nice system (Score:1)
Rattle off a legit UPS tracking number fairly rapidly (still understandably, though), and it repeats back to you what it thought it heard. Even calling from my car, with a handsfree car kit, it's never once gotten it wrong.
Fairly impressive, if I may say so myself...
I've used this! (Score:2)
Can't Understand Themselves (Score:2)
To write these tests, you often record what the user would say to the system ("1017" for flight 1017), then play it back at the correct time in the menu. We like to get our customer to record the message so there is no question about how it sounds or is said. But sometimes we record the system itself. Often the system has trouble even understanding its own voice.
It is also amazing how fast people who test these systems manually learn to speak so that the system understands them. Automated testing using recorded prompts makes a difference.
We collect some of the prompts we get back from systems
http://performance.empirix.com/VoiceIndex/hear-
How well does it work? (Score:1)
Voice Interfaces (Score:1)
Road noise from the car, the cell phone cutting in and out, etc. all screw up the experience. The human ear (and brain) is great at piecing together something meaningful when someone is cutting in and out, but unfortunately, voice interfaces aren't there yet.
Until mobile phones become as reliable as landlines, I think voice interfaces will be handicapped. It's like trying to type SMS messages on a number keypad....
the technology has been around for a while (Score:1)
the primary shortcoming in developing these type of applications is access to the equipment needed. we used voxeo for voice applications, and simplewire for sms services. (we had a couple of other suppliers for various other services, but those are the two that i remember.) the applications themselves are very simple to write. voxeo (and i think most of their competitors) all use voiceXML. any competent html author can learn how to write an interactive voice application without too much trouble- if they have an agreement with voxeo or somebody else (tellMe and beVocal come to mind) most of these have free developer programs, but to actually deploy an application on them for commercial use costs money.
in the end we got out of the application development because the engineers spent all of our time writing one-off demos that never got used, and our sales guy spent all of his time setting up "strategic partners" (translated: we do work for them so we can say we have clients, but they don't give us any money) and trying to sell to other companies that were as broke as we were. we changed our focus to create a platform to develop the applications we were spending all of our time writing. we developed a visual studio-like ide for converged applications, with wizards, template applications, tutorials, etc. more importantly, we made access to all of our secondary providers transparent. in one environment you could send sms messages through simplewire, run voice applications through voxeo, deliver wap and pqa applications, and have them all pass information from one to another more or less transparently.
for a while we were giving out free logins for people to sign up and write applications and give us feedback on the system. we almost posted it here on slashdot once upon a time, but at the time, we weren't sure our system would handle that much traffic (or that voxeo wouldn't start charging us more for that much traffic). that, and there were a few security concerns in our system that we weren't ready to expose (you could embed php, perl, or c code in your application inline. very handy to have for trusted developers, but certainly not something you want to hand out to everyone...)
unfortunately at this point our sales person still hadn't sold anything, despite the fact that everyone who saw our environment was drooling over it. we fired him when he (literally) told our ceo "i couldn't give this away". plus we were almost out of money, and (i'll still never understand why) he was the highest paid person in our company. on top of this, we were unable to get any further investment because (1) we didn't have any paying clients and (2) our ceo never actually demonstrated any of our working technology to the investors, and was too busy talking about how zondigo (our company) leveraged adMonitor technology - adMonitor was written by our ceo, frank addante, the former ceo of l90 (check out l90 on fuckedcompany for a good laugh). half of our seed round of funding went towards a licence to use admonitor technology. of course the only thing we ever used it for was tracking hits to our corporate website. and even that was at the insistence of frank, as that didn't tell us anything we couldn't have found from the server logs.
anyway, around june, the company ran out of cash and shut down. all of our technology was purchased by a company named voxicom that is using it to develop a voice recognition based email service (similar to the wildfire somebody else mentioned somewhere)
anyway, the point of this gargantuan post is that
1) the technology to create these applications is readily available, and usable by anyone competent at writing html
2) the technology to create these applications has been readily available for at least a year now
3) the technology to create these applications is only available through a few select vendors due to backend hardware and software requirements
4) my old ceo and head of sales are both idiots.
article without login (Score:2, Informative)
the article without registration [nytimes.com]
Hello from TuVox (Score:2, Informative)
AMEX - American Express - uses Voice Recognition (Score:1)
Old Example.... (Score:1)
FedEx package pickup (Score:1)
I've never had the system misunderstand me, unless I obviously mumbled into the phone. I guess my only complaint is that it's a tad slow when processing what you say, but otherwise it's pretty convenient.
Many companies are involved in this... (Score:2)
Telephony-based voice-recognition is going to be the Next Great Thing (tm). The main companies that are involved in this stuff are SpeechWorks [speechworks.com], Nuance [nuance.com] (both work on the main speech recognition/software stuff), HeyAnita [heyanita.com] (which works with Sprint [sprint.com]), and TellMe [tellme.com].
Re:Many companies are involved in this... (Score:1)
more pr (Score:1)
Fucking voice prompts. (Score:2)
Sorry, but there is no substitute for a human at the other end of the line. Charge more if you have to, but answer the phone!
Re:Fucking voice prompts. (Score:1)