Google Wants Your Voice Data
00_NOP writes "Peter Norvig, Google's director of research, has told New Scientist that one of the reasons the search engine launched Google Voice is that it needs more human voice data to perfect the sort of 'big data, simple algorithm' probabilistic approach to translating voices to text that drives Google Translate. Norvig says that no one is listening to your calls on Google Voice — it is simply their servers trying to get the translation right."
Um, how (Score:3)
I will say that the translation of my voice mails is terrible. Although, how can you tell if it is translated correctly if you don't listen to it? You can look for proper English, but even some of my translations are proper English yet still incorrect (names, etc. come out wrong). Most of the time, though, it's just a jumbled mess that I can't deduce the actual meaning of.
Re: (Score:1)
Your voicemails/transcriptions have a button you can check to mark whether or not it was accurate. Presumably, that is what they mean. Nobody besides *you* listens to them. On the other hand, if that is somehow not the case . . . . then . . . fuck no.
Re: (Score:3)
In that case, they can use YouTube videos for this, right? Their automatic translation is quite horrible - they could use the good/bad check there too.
Actually, they could re-translate the same video every time someone watches it, and keep testing it until a high enough percentage of people say yes.
Also, when they have more than 200 million videos on YouTube, why do they need to store data from Google Voice, which is much more personal and important?
Re: (Score:2)
If the message doesn't require a double check there's no reason for them to store it as they don't have any way of knowing whether or not it was accurate. However, for ones that they do have to go back and analyze, there's a good reason why they'd want to store them. In a word regressions. Without a body of samples which were tough, they don't have any way of gauging whether or not they're truly making progress as improvements could just as easily be in the quality of the samples that they are trying to tra
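To make the regression idea concrete, here is a minimal sketch of how a fixed set of tough samples could be scored before and after a change. Nothing here comes from Google: word error rate is just a common metric swapped in for illustration, and `word_error_rate`, `mean_wer`, and the `transcribe` callable are hypothetical names.

```python
from typing import Callable, List, Tuple


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        return 0.0 if not hyp else 1.0
    # Classic dynamic-programming edit distance, one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def mean_wer(samples: List[Tuple[bytes, str]],
             transcribe: Callable[[bytes], str]) -> float:
    """Average WER of a recognizer over (audio, known-good transcript) pairs."""
    if not samples:
        raise ValueError("need at least one sample")
    total = sum(word_error_rate(ref, transcribe(audio)) for audio, ref in samples)
    return total / len(samples)
```

Because the sample set stays fixed, a lower `mean_wer` after a change would reflect the recognizer rather than easier input, which is the point being made about keeping the hard messages around.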
Re: (Score:2)
No"BODY" is listening, but computers are analyzing every call and transcribing it to text?
Hmm, I would guess there would need to be at least some spot-checks that the transcription is working properly.
And isn't there some kind of federal wiretapping law preventing this or is it a "well, we told you we were listening in on every call"?
And methinks it just might be easier for the gov't to get these transcriptions instead of the actual audio recordings. And more convenient too, because it's much faster to read
Re: (Score:2)
No"BODY" is listening, but computers are analyzing every call and transcribing it to text?
Hmm, I would guess there would need to be at least some spot-checks that the transcription is working properly.
You do, when you click the check-mark or the red X after listening to it. However, that is a good point -- here Google says nobody will listen to your voicemail, but they still let you listen to it. So much for don't be evil! (This is sarcasm.)
;)
As for other people getting ahold of the transcriptions... not to be too facetious, but given how bad the quality currently is, nobody should worry about that yet.
Re: (Score:1)
Well, wouldn't it be worse if the transcription was wrong and the SWAT team comes to your house because they think you are making a bomb instead of just noting that the movie bombed...
Re: (Score:2)
Hmm, I would guess there would need to be at least some spot-checks that the transcription is working properly.
The only time somebody at Google listens (well, is allowed to listen) to your voicemail or recordings is when you click the red X and consent to their review.
Re: (Score:2)
Your voicemails/transcriptions have a button you can check to mark whether or not it was accurate.
And once you click that button, you have the option to donate your message so they can use it to improve their software. Example: http://img808.imageshack.us/img808/242/unled90.png [imageshack.us]
Re: (Score:2)
They don't listen in, at least not initially. You do. If it's not translated correctly, there's a box for you to check that gives them permission to listen.
Re: (Score:3)
They've gotten some pretty good data from me. My Google Voice number isn't currently published, so all of the voicemails I get are wrong numbers (or tests). They are completely incomprehensible to me, and to Google Voice - although one did a fair approximation of gibberish English (I think it was in some African dialect). Most seem to be in African languages, although a few are central European sounding. Good luck getting a good translation - but that's the magic that Google is trying to accomplish: tra
Re: (Score:2)
Oh, and I do get a fair number of advertisements and service calls. If you had an appointment with Comcast last Thursday, the tech called the wrong number - that's why he didn't show up. Google did a good job on the translation though...
I have similar experiences, only with email instead of voicemail.
This confuses me as I've owned my own domain for just under 12 years (and was the original registrant), and am the only recipient at the entire domain. It's my personal address and a few generic role accounts (postmaster@, abuse@, etc.) that forward to my personal account. There is no reason why someone named "Diane" should use my email address (pete@[my slashdot username].com) when scheduling an Apple Store appointment in South Carolina (not
Re: (Score:2)
My experience with a Nexus One...
If I used voice transcription on the phone directly, as in I spoke to my phone to do a Google search or write a text, it came out fairly well. There was the occasional error, but mostly on things like names.
But my transcripts from the voice mails I receive were often trash.
I guess it has to do with the sound-quality: it probably uses the original high-quality recording locally so it performs good Google searches. Meanwhile the compression and static over the phone line (
Re: (Score:2)
I've had the same experience, my voicemail transcripts are garbage.
When I speak into my phone to write a text or run a search I make a point to speak slowly and enunciate very clearly. I suspect most people don't make the same kind of effort in voicemails.
Another server (Score:3)
They have another server that checks the first server's translation. Part of their work is checking that server's effectiveness, too.
Re:Um, how (Score:5, Insightful)
We should just get over the fact that privacy is gone, eh? Not here, my friend. Not ever. People have a RIGHT to privacy despite what anyone will tell you.
The fact that the majority of people could give a hoot in hell about their personal and our collective privacy will come back to haunt us. I don't understand why people would willingly give up their privacy for a little functionality, cool tech, gadgets, whatever. I value my privacy and I don't share my info willingly with anyone or any organization without a lawful requirement, e.g. SSN for employers, banks. I even fought my medical insurance company on getting my SSN because they have no legal mandate to possess that information.
I will not have grocery store cards to save money for the same reasons. I will not trade my personal information for a little savings.
These companies take our information from us and profit greatly while we in turn get what? A "free" email account laden with ads that track our behavior? This is not a win-win situation and no one really cares, because they can chat with their friends across the globe in real time, make "friends" on Facebook they will never meet or really know.
Where does all this end? When the entire world is one transparent collective society where no one has any privacy whatsoever? Personal information is a goldmine as is shown by how desperately companies want to get their hands on it. I think there should be a citizens' clearinghouse where people can agree to sell their info for a profit -- opt-in by default. Anyone caught trying to get around this clearinghouse pays dearly legally. Companies bid for your personal information and you profit as well. Anything short of some model like this is completely lopsided in favor of corporate interests that don't have our best interests at heart.
Re: (Score:2)
You assume that privacy has value. What value really is there to it? There are a few good examples, but none of them apply to the current scenario, or even the sharing of email addresses. And before you reply "Because", think about the value judgment that you're making. What exactly is it that makes privacy as valuable a concept as, say, equality or liberty?
Finally, a free email system of the quality of Gmail is worth a lot to me. Google can have the privacy that they're asking for. One thing is sure tho
Re: (Score:2)
In the end, do I really care who listens to my conversations? No, they can listen to all the phone sex if they want.
They aren't listening to your conversations. They are listening to your voicemail if you send it to them. And if someone were to have phone sex with my voicemail, whether I sent it to them would depend upon who was calling.
Self-checking (Score:3)
Re:Self-checking (Score:5, Informative)
If you log into your Google Voice page, and look at a translated message, in the lower right corner there is the question - "Transcript useful?" along with yes/no checkboxes. If you check one, it asks if you want to "donate" that VM to improve the translations, you can answer yes/no/never:
Re: (Score:2)
It's too bad they don't let you fix the transcription. Even if they're worried about people trying to poison their data (like people talk about with ReCaptcha), they could at least let the user fix his own view of it.
Re: (Score:2)
I don't understand why they don't let me correct it myself. Not only would I be helping them with their algorithm, but then I would have a known good transcript to save in my voicemail for later searching. (Searching is pretty bad if you're looking for a word or phrase that is often misinterpreted.)
Re: (Score:2)
I don't understand why they don't let me correct it myself. Not only would I be helping them with their algorithm, but then I would have a known good transcript to save in my voicemail for later searching. (Searching is pretty bad if you're looking for a word or phrase that is often misinterpreted.)
I suspect the fear of trolls mistranslating is too great to allow anyone to do it.
Imagine some people "helping" the algorithm by insisting that if someone says, say, "music" it actually "translates" to "we're no strangers to love".
nice (Score:1)
>simply their servers trying to the translation right
>trying to the translation right
>the translation right
Nicely done.
Re:nice (Score:5, Funny)
Re: (Score:2)
Google voiceover: Gentlemen, we can rebuild CmdrTaco. We have the technology. We have the capability to build the world's first bionic slashdot editor. CmdrTaco will be that editor. Better than he was. Better... stronger... faster.
Re: (Score:1)
Think of the translations, man... they have rights.
servers trying to the translation (Score:2)
Re: (Score:2)
Re: (Score:2)
I what you did there.
They Make it Hard to Delete History (Score:5, Interesting)
Re: (Score:2)
Good news - after you go through all the painstaking, tedious work of deleting them ten at a time, they're really gone forever!
*snort* yeah, I couldn't keep a straight face while typing that. Hopefully you couldn't keep one while reading it, either.
Re: (Score:2)
Is anyone surprised? (Score:1)
This is the price of "free" services.
Re: (Score:1)
on the flipside, if you're a privacy advocate (which I absolutely get!), then don't sign up.
the thing that I don't get is people shouting "i told you so" at all the people that use google services - we get it, we already know they want to mine our data - and we WANT to give it to them!
*disclaimer: i do not use GV due to the fact that I MMS more than a teenage girl
Re: (Score:3)
We are derisive towards "Hai This is Facebook. Plz give us ur full name, address, cell phone number, age, and eye color so we can give you five Farmville sheep."
But you bring up the more interesting case, "Awesome service versus abused data". (Shout out to Holland and TomTom for yesterday's example.)
Or here, Google Translate vs ... a billion hours of juicy phone calls!
Speech is "Audio" - All we need is a hacker and a Wikileaks Dump!
Re: (Score:2)
This is the price of "free" services.
on the flipside, if you're a privacy advocate (which I absolutely get!), then don't sign up.
And sometimes you pay the price [eff.org] anyway, without your consent, and when the services aren't "free". Given the (lack of) choice of my data and money going to a company that isn't really innovating that much, or to an entity that's ostensibly trying to move the state of the art forward and using data at least partially to this end, I can't see how a privacy advocate would consider GV worse than their current voice service.
A "false-sense-of-privacy advocate", perhaps, or one who refuses phone or voicemail se
Re: (Score:2)
I gave up (Score:4)
- understand a lousy accent: there are some words I cannot and will never be able to pronounce 'right'
- recognize what language is being spoken (having those 3 and only those 3 preset in the options)
Now I haven't tried Google Voice, but none of the software I've tried or heard about could even remotely do those two basic things.
Re: (Score:2)
Re: (Score:1)
They are basic (in the sense that they are a must) for a tool like google voice's.
To tell apart different languages and guess when a word is a foreign language word.
I know three languages: my mother language (Spanish), a second language (English), and some Japanese.
I can still tell when somebody is speaking different languages that I barely know (German, French, Chinese, Portuguese, Italian).
To put into letters words that it does not have in its vocabulary, and not just try to find the closest match
To understand dif
Re: (Score:2)
Considering that outside of Africa only a very small fraction of the population speaks more than two languages let alone fluently, I don't think that it's a basic request.
Re: (Score:2)
Considering that outside of Africa only a very small fraction of the population speaks more than two languages let alone fluently, I don't think that it's a basic request.
It strikes me that Europe might disagree with you on that.
Re: (Score:2)
I don't know, even in Spain where they have a dozen languages, few people speak three of them (or two + English).
Re: (Score:2)
I'm Canadian and I speak 3 languages. English, French, and Rubbish. Mostly Rubbish.
Re: (Score:3)
I could be wrong, but I doubt most Europeans are fluent in more than two languages, and I bet a significant number aren't fluent in multiple languages. The reason I'm singling out Africa there is that in parts it's very common for people to speak not just one or two, but three, four or more languages and to have to learn a new language at marriage so that they can communicate.
Trust me, Europeans have nothing on that.
Re: (Score:3)
have to learn a new language at marriage so that they can communicate.
Married people communicate?
Re: (Score:2)
Funny, I was just thinking that everyone has to learn a new language to communicate after they get married.
Re: (Score:2)
The Malaysians I know do. They all speak a local dialect (their first language), plus they speak Mandarin (the regional language taught for normal communication), plus they speak English (the language they learn to conduct business). They can't really co-mingle the applications, either; they don't know many business terms in their native language so English isn't just an option, it's preferred for those uses.
Re: (Score:2)
Va te faire enculer, pendejo
Re: (Score:3)
Considering that outside of Africa only a very small fraction of the population speaks more than two languages let alone fluently, I don't think that it's a basic request.
40% of Europeans speak English well enough to have a conversation (not including native speakers). In some areas (Switzerland, Belgium, Luxembourg, places near country borders) it's not unusual to speak an extra language.
If you're a European child you speak [your version of European], learn English at school because English is useful, and if you like languages you might choose another; in the same way, perhaps, that an American child might choose to learn Spanish.
I know a little French and a little German -
Re: (Score:1)
My fiancee speaks 6 languages fluently, like a native, and switches between them with an ease that impresses the shit out of me. They are Korean (she is Korean), Tagalog, Mandarin Chinese, English, Japanese and French. The first time she came to America, Immigration didn't want to let her in because her English was so good they didn't believe that she had never been here before.
So, yeah, there are lots of people that speak multiple languages. Just not, unfortunately, in America.
Re: (Score:2)
Google Voice is pretty amazing, even at the early stages. It can auto-recognize many languages. It can also do a fair job with bad pronunciation. Google Translate is able to understand my Spanish - which is fairly incomprehensible. Of course, my vocabulary is limited to that provided by public high schools in rural North Carolina back in 1982. Good luck getting me to do more than ask for directions to the bathroom.
I'll steal a joke from David Sedaris - a single year of high school Spanish just isn't e
Re: (Score:1)
I haven't heard the Sedaris bit, but in France you can actually ask for "fire" when you want a "light". "Can I get a light?" -> "Tu as du feu?" (I do get your point though.)
But even ignoring that you can ask for "fire" in France, auto-translators have to realize that you can't word-for-word translate, but also understand that written language is often different than spoken language. Some have started to pick up on this, but they all still have a ways to go.
Re: (Score:3)
I used to work with a "trilingual" fella.
Born in Italy, raised in France, and then lived in the USA for 17+ years.
He effectively spoke no language.
Bad Italian, worse French and jumbled English.
is there an app for that?
Re: (Score:3)
I have had "conversations" with people coming back to Mexico after living years in the US... poor fellows, their Spanish is incomprehensible and their English sounds like a racist joke.
Pochos have really developed their own pidgin language :S
Re: (Score:2)
One of my flatmates speaks four languages fluently. I think her English is at least as good as mine was when I was 16, in some cases better (consistently using "whom", "and I" etc correctly).
I'm learning German; when I get a little better I want to have German-speaking Sundays (Deutsche sprechen Sonntags?... Wrong conjugation of sprechen, probably the wrong word order, oh well, I've not been learning long.).
Re: (Score:1)
Re: (Score:2)
You might have more luck with software that has a training phase before doing recognition. I've had great success with Germans, Chinese and Israelis speaking English through extra training.
As for the language, I'm not sure about consumer software, but phone systems (and probably Google Voice...details are slim on it) will often run recognizers for several languages and then go with the recognition with the highest confidence.
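For what it's worth, the "run several recognizers and keep the most confident one" approach can be sketched in a few lines. This is only an illustration under assumptions: the `Recognizer` type, the `pick_best_language` name, and the idea that each recognizer reports a confidence between 0 and 1 are all hypothetical, not how any particular phone system or Google Voice actually works.

```python
from typing import Callable, Dict, Tuple

# Hypothetical recognizer interface: audio bytes in, (transcript, confidence) out.
Recognizer = Callable[[bytes], Tuple[str, float]]


def pick_best_language(audio: bytes,
                       recognizers: Dict[str, Recognizer]) -> Tuple[str, str, float]:
    """Run every per-language recognizer and keep the most confident result.

    Returns (language_code, transcript, confidence).
    """
    if not recognizers:
        raise ValueError("need at least one recognizer")
    results = []
    for language, recognize in recognizers.items():
        transcript, confidence = recognize(audio)
        results.append((language, transcript, confidence))
    # The language is never identified directly; it falls out of
    # whichever recognizer was most sure of its own transcript.
    return max(results, key=lambda result: result[2])
```

One catch with this design is that confidence scores from different recognizers are not necessarily comparable, so a real system would probably need to calibrate them before taking the maximum.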
Maybe it tried to translate the summary (Score:1)
it is simply their servers trying to the translation right.
Re: (Score:3)
Re: (Score:2)
Re: (Score:2)
no, it would look more like this ...
innocent lythe airs her versed rhyme to theatre and slay shun, right?
Yeah, you need voice data to get voice to text (Score:2)
And you're surprised why? All voice apps do this. Always have, always will, until it's perfected, and we're a long, long way from perfecting it.
Steven
Hmmmmm, looking for accuracy are we? (Score:1)
"it is simply their servers trying to the translation right"
I think you a word in your sentence.
Doesn't work well... (Score:4, Insightful)
"Peter Norvig, Google's director of research, has told New Scientist that one of the reasons the search engine launched Google Voice is that it needs more human voice data to perfect the sort of 'big data, simple algorithm' probabilistic approach to translating voices to text that drives Google Translate. Norvig says that no one is listening to your calls on Google Voice — it is simply their servers trying to the translation right."
I think Google Voice translated the last part of that sentence.
Entertaining anyway (Score:1)
The translation is off pretty far most of the time for my voicemail. But they do end up to be entertaining. Here are two actual translations from google voice:
1) Okay, I don't know why it takes for ever, for your voicemail to pick up. But anyway, I was just calling to tell you that we forgot to while. I will and I told Mrs. Smith and this is best but she signed about it, so I'm gonna shout in the car and have it for her after I pick her up, bye. You Get Out virtual slot is not with us. So, wish me luck. I
I is have hamburger? (Score:2)
Re: (Score:1)
Epic wonderment! I'm get near to this.
Re: (Score:2)
Let's speak MS at it...
Dear aunt, let's set so double the killer delete select all.
Re: (Score:2)
Oblig. UserFriendly comic (Score:3)
There was a Userfriendly.org strip years ago which pretty much summarizes my experience with voice recognition software for the past 15 years. . .
I can't find the link to the comic anymore, but basically, one of the guys in the office had been trying to use voice recog software. Some of his coworkers come to his office. He's not there, but on the screen, they wonder about the mysterious message, "Cod Am Pizza Ship".
Just found it. . . (Score:3)
The Strip [userfriendly.org].
Re: (Score:2)
The Ubersoft comic ran a Google Voice comic of similar humor today. [ubersoft.net]
Note: The characters are running from Apple zealots, while working on a special project to prevent Steve Jobs's ego from destroying everything.
Dang! (Score:2)
Y'all kin have mah voce data. Sheeeit! I warn't doin' nutin' wid it anyhows.
Re: (Score:2)
Stand by. We've got an incoming call from King George VI [imdb.com].
Go ahead. Mod me -1 Insensitive Clod.
If we get the heuristics back as FOSS (Score:2)
Posting this from google voice (Score:2)
Google is knot an evil umpire. They our hear 2 us with wheel whirled problems. Please stop bash tag google. All your words belong to us.
Google Wonders, "How Can We Become MORE Evil?" (Score:1)
Google announces Google Voice, noting that it will be archiving and auto-transcribing subscribers' phone calls.
"But don't worry," Google Voice Product Evangelist Boris Badinov said at the press conference announcing the service's launch. "We promise full interoperability with Google Docs, GMail, Android, and the NSA. Also, the artist who does the daily search engine doodle has promised to come up with a really cool, shiny logo."
And around the world, geeks sign up in droves, many noting that they didn't even
Time for more testing (Score:1)
Algorithm focused services (Score:2)
This is the same as how they put "Closed Captions" on youtube videos.
Google has no interest in crowd-sourcing the translation or transcription of speech, they want it all automated.
Which is why YouTube Closed Captions SUCK!
Voluntary (Score:2)
YouTube (Score:1)
Just grab audio from thousands of dialogs or talks on YouTube and test it out.
Voice OS (Score:2)
Is voice signature what they really want ? (Score:1)
amazing translation i got yesterday (Score:1)
I love the idea of the feature -- I hate stopping to listen to voice mails. ...but I got the most ludicrous / hilarious translation yesterday. Pure poetry!
"Hi, My name is The bring the Anderson and I was interested in ordering. I'll call. Sarah, Mrs. Kate, on the Hudson birthday. He really liked all here in the for you. So, anyway, I would see it for next Sunday. I'm not sure where they said of. It's Hey Lady, Thank tonight anyway. If you can just give me a back. My phone number is 972."
Good to know about
Re: (Score:1)
Re:Google is breaking wiretapping laws everywhere (Score:4, Informative)
Do you not understand what voicemail is? How can you record a message for someone without consenting to it being recorded?
Re: (Score:1)
Re: (Score:2)
Sure for the person owning the phone, but that hardly is going to provide consent for the person calling you in states that require 2 party consent.
Re: (Score:2)
Are you sure that signing up for Google Voice doesn't include a clause giving them permission to store the audio? IANAL, but I would also presume that they could successfully claim that it is obvious that a voice mail service must record the audio in order to store and reproduce it later.
Re: (Score:2)
It'd be kind of hard to leave a voice mail if you didn't want the receiving party to record it. Implied consent much?
Re: (Score:1)
There are many ways to do speech recognition, but one thing they all rely on is training. My guess is that you have a reasonably "average" sounding voice, and therefore the training that was done on other people's voices (specifically, the other users of the app you mention) works well enough for you. For someone with a substantially different sounding voice, it would probably work far worse than Google Voice.
I should also point out that checkbox or no, the app you're using certainly is training. It may