Is Speech Recognition Finally 'Good Enough'?
jcatcw writes "Speech recognition software is fast, but it still may not be accurate enough. Clerical jobs usually ask for 40 wpm, but speech recognition software can keep up with someone speaking at 160 wpm. In Lamont Wood's demo it did very well at too/two/to and which/witch, but will it still render 'I really admire your analysis' as 'I really admire urinalysis'? At 95% accuracy, people aren't jumping on the bandwagon. Wood's typing speed is about 60 wpm with 93% accuracy, so he found that using speech recognition was about twice as fast as typing. Those who type at hunt-and-peck speeds will experience results that are even more dramatic. There's really only one product on the US market: Dragon NaturallySpeaking from Nuance Communications. The free versions from Microsoft aren't up to the task and IBM sold ViaVoice to Nuance, where it's treated as an entry-level product."
We use it. (Score:3, Interesting)
We're using an older version of Microsoft's product and it seems the microphone quality is important.
Re:Good enough for what? (Score:3, Interesting)
sierra lima alpha sierra hotel delta oscar tango (Score:3, Interesting)
Any speech recognition software worth the $ should be able to detect and translate NATO letter names [wikipedia.org]: "hotel tango tango papa colon slash slash sierra lima alpha sierra hotel delta oscar tango dot org".
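For what it's worth, the translation step itself is the easy part once the recognizer has the words right. A minimal sketch in Python, assuming the engine hands back a plain string of recognized words; the function name `decode_spelled` and the small symbol table are made up for illustration and don't come from any real product:

```python
# NATO spelling alphabet, including the official "alfa"/"juliett" spellings
# and the common "alpha"/"juliet" variants.
NATO = {
    "alpha": "a", "alfa": "a", "bravo": "b", "charlie": "c", "delta": "d",
    "echo": "e", "foxtrot": "f", "golf": "g", "hotel": "h", "india": "i",
    "juliett": "j", "juliet": "j", "kilo": "k", "lima": "l", "mike": "m",
    "november": "n", "oscar": "o", "papa": "p", "quebec": "q", "romeo": "r",
    "sierra": "s", "tango": "t", "uniform": "u", "victor": "v",
    "whiskey": "w", "xray": "x", "x-ray": "x", "yankee": "y", "zulu": "z",
}

# Hypothetical spoken names for punctuation; a real product would have more.
SYMBOLS = {"colon": ":", "slash": "/", "dot": "."}


def decode_spelled(utterance: str) -> str:
    """Turn a spelled-out utterance into text, one token at a time."""
    pieces = []
    for token in utterance.lower().split():
        if token in NATO:
            pieces.append(NATO[token])
        elif token in SYMBOLS:
            pieces.append(SYMBOLS[token])
        else:
            # Unknown words (like "org") are passed through unchanged.
            pieces.append(token)
    return "".join(pieces)


spoken = ("hotel tango tango papa colon slash slash sierra lima alpha "
          "sierra hotel delta oscar tango dot org")
print(decode_spelled(spoken))
```

The hard part, of course, is the recognizer hearing "lima" and not "Lima, Peru" or "lemur" in the first place.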
great prevention for repetitive stress injuries (Score:2, Interesting)
Re:Mod parent up! (Score:3, Interesting)
* for transcripts of a single speaker
* informal free-thought when not surrounded by other people
* when you have horrible typing skills
You had me at "* when you have horrible typing skills".
Parent post mentions their 4 year old making pancakes.
At some point, most likely, you expect the kid is going to grow up and get better at making pancakes. There will be a learning curve. Maybe 4 is too young; I haven't met the kid. But part of the point of teaching a kid to make pancakes is to get the learning curve out of the way, so they can get better at it on their own time, preferably before they are 30.
My crude analogy is that a naturally speaking soft dragon is a bit like a 4 year old pancake maker. It can be worthwhile to get used to an imperfect tool now, so that you'll have the learning curve out of the way as the tool gets better over time.
Or it can be better to wait another year. Your mileage may vary.
Here's another potential application: Get the dragon for your kid. It may be useful as she or he learns to read and write.
I for one welcome our new naturally speaking dragon overlords.
I want the throat mike module, so that it types what I'm subvocalizing.
I'm hearing a business model here:
1. form a corp to offer voice to text software
2. wave hands
3. sell out to nuance
4. ......
Medical transcription (Score:2, Interesting)
"I am using PowerScribe, which is a radiology speech dictation system. It is fairly accurate in the doming [domain] of medical transcription, and particularly in the doming of radiology, but it not very useful for free pexed [text] speech.
For example, there [here] is a sample of the typical chest report: Hazy groundglass opacities noted with both lungs, particularly the right middle lobe as well as the left lower lobe, with no evidence of effusion, pneumothorax, or consolidation. [this is pretty much verbatim what I said].
[But here's a free text example:] However, if a Type II right a regular letter to a friend, [if I try to type a regular...] for example setting the following, [for example, saying the following...] Yesterday was a very nice state [day]. The clots [clouds are] gone, and only a little brain [rain] remains. Today it is supposed to be even warmer outside, I think elbow [I'll go] injected [and check] with the right knob. [the weather right now]"
The biggest problem with this system, particularly for medical transcription purposes, is that it only gets about 95-97% right. That means it's wrong at least 3% of the time. Worse yet, whenever it's not sure, it just inserts random garbage! It picks whatever the closest match is, which is often wrong and sometimes fundamentally changes the meaning of what I intended.
Human transcriptionists, on the other hand, will insert a blank if they're not sure, to alert the dictating physician. This fscking system has no clue when it's wrong, which makes it very dangerous in my opinion!
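The human behaviour described here (leave a blank when unsure) is easy to sketch, assuming the engine exposes a per-word confidence score. That is an assumption on my part; nothing above says PowerScribe reports one, and the function name `mark_uncertain` and the 0.9 threshold are purely illustrative:

```python
from typing import Iterable, Tuple


def mark_uncertain(words: Iterable[Tuple[str, float]],
                   threshold: float = 0.9,
                   blank: str = "____") -> str:
    """Replace any word whose confidence is below the threshold with a blank.

    `words` is a sequence of (word, confidence) pairs with confidence in
    [0, 1]; this input shape is hypothetical, not a real engine's output.
    """
    return " ".join(word if conf >= threshold else blank
                    for word, conf in words)


# Made-up confidences for a fragment of a chest report.
recognized = [("no", 0.99), ("evidence", 0.97), ("of", 0.98), ("effusion", 0.62)]
print(mark_uncertain(recognized))
```

A blank at least tells the dictating physician where to look, which a confident wrong guess never does. Whether the scores are trustworthy enough to catch the dangerous substitutions is a separate question.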
Re:This comment written by MS speech recognition (Score:3, Interesting)
It was really entertaining, but I fell into what I call "The Missing Remote" syndrome: If you've ever lost your remote, you will spend 10 minutes looking for it so you can turn off the TV and go to bed, rather than get up and walk over to the TV and turn it off. I think I must have spent 5 minutes saying "Close Window" in various different ways and speeds rather than just click on the damn close box.
Of course, what I really miss in Apple's speech recognition are the avatars...
Patents (Score:1, Interesting)
Speech Recognition is more than dictation (Score:3, Interesting)
"Good enough" is very vague as applied to voice recognition. For command stuff, "good enough" has been here for about 7+ years. Even MS's free engine does a great job at that.
I used ViaVoice years ago and it worked pretty well. But here's the thing: Have you ever tried to dictate something? It's definitely a skill. I'm sure some people have a natural ability for it, but I certainly didn't. I tried dictating stuff and it's tough. You hit a pause mid-sentence trying to figure out how you want to phrase something and suddenly there's a period and you're beginning a new sentence. Try dictating several sentences of original material and keeping it going without pauses and "um"s and so forth and you'll see, it's not quite as easy as it seems. I suspect one of the reasons voice recognition hasn't been a hit is that people don't expect that. They try it for a few days, think, "Hell, it's easier just to type," and give up. That's why I don't use it for writing. I can type faster and more accurately than I can dictate. I'm sure if it's something I wanted to work on, I could develop the skill, but my point is, I think that's probably why a lot of people give up on it.
I honestly think that voice recognition in command mode could be really useful at speeding things up, if software were designed to take advantage of it. But it's not easy to add it as an afterthought and it adds significant work, even if it's done with forethought. It's a chicken and the egg thing. If a lot of software supported it, I think people would see a gain in productivity using whatever software they use daily. I don't mean just using voice recognition, but in combination with a mouse and keyboard. For example: "Execute Browser. google dot com. flying burrito brothers. google search". Saying that would be a pretty fast way of opening your web browser, typing "google.com" and then typing "flying burrito brothers" and then clicking the "Google Search" button. Replace "Google Search" with hitting the enter key and it's even faster.
But as I said, it's a chicken and the egg thing. Software doesn't support it because there's no demand and there's no demand because people haven't really experienced software that supports it.
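To make the "Execute Browser" example concrete, here is a rough sketch of how an application might turn that spoken sequence into actions. Everything in it is an assumption for illustration: the `execute` prefix rule, the `BUTTONS` table, and the (action, argument) tuples are not any real engine's API, and it only parses the phrases rather than driving a browser:

```python
from typing import List, Tuple

# Hypothetical table of spoken phrases that match on-screen button labels.
BUTTONS = {"google search": "Google Search"}


def parse_commands(utterance: str) -> List[Tuple[str, str]]:
    """Split a dictated command sequence on periods and classify each phrase."""
    actions = []
    for phrase in utterance.split("."):
        phrase = phrase.strip().lower()
        if not phrase:
            continue
        if phrase.startswith("execute "):
            # "Execute Browser" -> launch the named program.
            actions.append(("launch", phrase[len("execute "):]))
        elif phrase in BUTTONS:
            # A phrase matching a known button label means "click it".
            actions.append(("click", BUTTONS[phrase]))
        else:
            # Anything else is text to type; spoken " dot " becomes ".".
            actions.append(("type", phrase.replace(" dot ", ".")))
    return actions


spoken = "Execute Browser. google dot com. flying burrito brothers. google search"
for action in parse_commands(spoken):
    print(action)
```

The point of the button table is the same as the point of the post: the application has to tell the recognizer what is clickable, which is exactly the support most software doesn't have.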
Another issue (and I'm sure this has been mentioned by others), is background noise. I like to listen to music or watch TV while I work. Those don't mix well with voice recognition, at least not at the volumes I listen to them. Until voice recognition can get around that and recognize my voice amidst background noise and do it accurately AND software out there generally supports it, it's not going to go mainstream.
Until we get hard ai along with it no. (Score:4, Interesting)
Find all mp3's that were created by Trent Reznor and pipe them to
I can't program in it can I?
if(i_can_write_code_I_mean_speak_code_to_the_comp
i_might_use_it_a_bit();
else
system("find
endif
But that is just me.
Re:Hmmm.... (Score:4, Interesting)
Type Faster? (Score:2, Interesting)