GOOG-411's "Biddy-Biddy-Boop" Sound Backstory
Chris Albrecht writes "The bippedy-bippedy-bippedy sound you hear when using 1-800-GOOG-411 is actually a senior voice designer at Google. (Here's the sound.) The technical term for that noise is the 'fetch audio,' and it's more complicated to design than you'd think. For the first time, the voice of GOOG-411 talks about how he came up with it, how important that sound is, and how people now ask him to 'perform' it."
Huh? (Score:3, Informative)
If you don't know what this is about (Score:5, Informative)
http://en.wikipedia.org/wiki/GOOG-411 [wikipedia.org]
Basically, GOOG-411 is an experimental Google telephone service. Users can call and use speech recognition to search for local businesses. I think American phones have letters on the number buttons, so 1-800-GOOG-411 means 1-800-466-4411.
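For anyone who wants to check the letter-to-digit conversion, here is a minimal sketch. It assumes the standard keypad letter groups (2=ABC through 9=WXYZ); the function name `vanity_to_digits` is just something made up for illustration.

```python
# Letter groups printed on a standard telephone keypad (assumed layout).
KEYPAD = {
    "2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
    "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ",
}

# Invert the table so each letter points at its digit.
LETTER_TO_DIGIT = {
    letter: digit for digit, letters in KEYPAD.items() for letter in letters
}


def vanity_to_digits(number: str) -> str:
    """Replace every letter with its keypad digit; keep everything else."""
    return "".join(LETTER_TO_DIGIT.get(ch.upper(), ch) for ch in number)


print(vanity_to_digits("1-800-GOOG-411"))  # 1-800-4664-411
```

The hyphens stay where the vanity number had them, so the output is grouped as 1-800-4664-411, which is the same digit sequence as 1-800-466-4411.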
Re:"senior voice expert"? (Score:5, Informative)
> "senior voice expert"?
> that gives me flashbacks to the
Ummm, obviously you don't work in telecom.
Almost every automated system has the equivalent of a voice expert or a speech scientist whose job is to do things like this.
Every time you call an IVR or reach an automated speech system, someone's worked at it to make it not just functional, but also usable and friendly.
Re:free phone call? (Score:2, Informative)
Re:free phone call? (Score:4, Informative)
Re:"senior voice expert"? (Score:5, Informative)
Secondly, a speech scientist or a voice expert is quite different from a sound engineer. A voice expert's tasks include making sure that the IVR has the same or similar-sounding voice patterns throughout, that the accents and terms used are standard, simple and understandable in that region, that the TTS (text to speech, if used) is configured in a way that is acceptable to the target audience, and that volumes and amplitudes are all normalized (this last one is probably the only thing that a sound engineer could also do).
Also, a speech scientist works on the voice recognition piece of things, including deciding which language models to use, designing the grammars for recognition, utilizing various tools to tune the recognizer, using various machine-learning techniques to help evolve the language models (e.g. SLMs [wikipedia.org]) and so on.
On top of this, you have to do usability analysis to see how well your system is working out. If a lot of people are zeroing out, or if there is an alarmingly high percentage of recognition errors, then there is something wrong with your system. Ease of use is also considered (e.g. how many steps does it take to get a task done, and can you minimize this somehow?). Additionally, you have to ensure that unique elements used in your IVR (e.g. the biddy biddy boop) are understandable in context to the target audience.
Other tasks include determining where voice is appropriate and where DTMF would work, and finding ways of notifying the user of what's going on in the background without resorting to Beethoven's Moonlight Sonata for the 37th time (which could be a challenge in its own way).
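To make the usability numbers concrete, here is a rough sketch of how the zero-out rate, recognition error rate and steps-per-task might be computed from call logs. The `CallRecord` fields and the `usability_summary` function are hypothetical, made up for illustration; a real IVR platform will log things differently.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CallRecord:
    """Hypothetical per-call summary; field names are assumptions."""
    utterances: int         # spoken inputs the recognizer handled
    misrecognized: int      # inputs the recognizer got wrong
    zeroed_out: bool        # caller pressed 0 to reach a human
    steps_to_complete: int  # prompts needed to finish the task


def usability_summary(calls: List[CallRecord]) -> Dict[str, float]:
    """Aggregate the three metrics over a batch of calls."""
    total_calls = len(calls)
    total_utterances = sum(c.utterances for c in calls)
    if total_calls == 0:
        return {"zero_out_rate": 0.0, "recognition_error_rate": 0.0, "mean_steps": 0.0}
    return {
        # Share of callers who gave up on the automated system.
        "zero_out_rate": sum(c.zeroed_out for c in calls) / total_calls,
        # Share of spoken inputs that were misrecognized.
        "recognition_error_rate": (
            sum(c.misrecognized for c in calls) / total_utterances
            if total_utterances else 0.0
        ),
        # Average number of prompts per call.
        "mean_steps": sum(c.steps_to_complete for c in calls) / total_calls,
    }


calls = [CallRecord(4, 1, False, 3), CallRecord(6, 1, True, 5)]
print(usability_summary(calls))
```

What counts as an "alarmingly high" value is a judgment call for whoever tunes the system; the sketch only produces the raw rates.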
So, no, I doubt if you could equate a sound engineer with a speech scientist. Most of the speech scientists [jhu.edu] that I work with would probably feel insulted by that term.
Re:Next week on Googledot... (Score:3, Informative)
No joke.