The Future of Speech Technologies
prostoalex writes "PC Magazine is running an interview with two of the research leaders in IBM's speech recognition group, Dr. David Nahamoo, manager of Human Language Technologies, and Dr. Roberto Sicconi, manager of Multimodal Conversational Solutions. They mainly discuss the status quo of speech technologies, which prototypes exist in IBM Labs today, and where the industry is headed." From the article: "There has to be a good reason to use speech, maybe your hands are full [like in the case of driving a car]. ... Speech has to be important enough to justify the adoption. I'd like to go back to one of your original questions. You were saying, 'What's wrong with speech recognition today?' One of the things I see missing is feedback. In most cases, conversations are one-way. When you talk to a device, it's like talking to a 1- or 2-year-old child. He can't tell you what's wrong, and you just wait for the time when he can tell you what he wants or what he needs."
the footer offs peach take no allergy (Score:3, Informative)
(the future of speech technology must understand context)
Re:the footer offs peach take no allergy (Score:4, Informative)
Actually... (Score:2, Informative)
Re:can it replace court reporters? (Score:2, Informative)
The problem with the field is that with fewer reporters to meet an increasing demand, the lack of capable court reporters is forcing more electronic recording -- good results or not.
Now, for medical transcription, it's a great product. After about six months of use, the doctor (or anyone that dictates a lot) has gotten the computer trained to his voice and can go at a pretty good clip (150 words per minute or more). But this is one voice and a limited, task-specific vocabulary.
Re:MOD PARENT UP (Score:3, Informative)
And where exactly is new speech technology supposed to come from inside Apple anyway? They fired all the people who knew anything about speech in the 90's and shut down the labs.
Doctors are going to use speech recognition (Score:4, Informative)
http://www.tietoenator.com/default.asp?path=1;93;
Re:IBM Speech - Needs Superhuman sales to survive? (Score:1, Informative)
As I said, Nuance (Scansoft) bought them all up; not just SpeechWorks and Nuance, but Dragon, Lernout & Hauspie, etc. They still sell a bunch of (Windoze) retail SOHO packages for a hundred bucks or two.
Microsoft has some crappy .NET-based stuff, but I'd give it a pass, if I were you. It's neither SOHO nor enterprise. Not sure what it is...
It's not really soup yet, but there is also a free solution. See http://www.speech.cs.cmu.edu/ [cmu.edu]. At least one commercial vendor has taken the source, hacked it up, and is using it in a commercial product. At least it runs on Linux and (I think) the *BSDs.
- The AC OP
Re:Language Acquisition... (Score:1, Informative)
Re:So why is voice input in decline? (Score:4, Informative)
Re:Language Acquisition... (Score:2, Informative)
Open source speech recognition engines (Score:3, Informative)
http://www.speech.cs.cmu.edu/sphinx/ [cmu.edu]
image+speech recognition
http://sourceforge.net/projects/opencvlibrary/ [sourceforge.net]
Desktop voice commands
http://perlbox.sourceforge.net/ [sourceforge.net]
Others
http://www.tldp.org/HOWTO/Speech-Recognition-HOWT
http://www.cavs.msstate.edu/hse/ies/projects/spee
Do you know about other usable open source speech solutions?