Talk ... Without Speaking
mjm7 writes "Finally, we might be able to get rid of all those annoying people yelling over the static on their cell phones! CNN has an article about a new technology that senses muscle movements in your face and then translates them into sound. This way all you have to do is mouth words into the phone...not actually speak!" Somehow I suspect that we'd lose a lot of the subtleties of communication, but it sure would be nice every time hemos calls me from the discotheque.
Hyperion (Score:3, Insightful)
Re:Hyperion (Score:3, Interesting)
Diphthongs, by the way, are why interfaces that attempt to "read lips" without the benefit of a phonetic dictionary of some kind (and preferably a context one as well) always fail miserably, to the eternal chagrin of the CIA.
Jouster
Re:Hyperion (Score:2)
But this company says that they are already able to distinguish vowels to a high degree of accuracy. And I don't think they're only reading lips. But the real interesting possibility is that even if they *never* figure out how to perfectly identify normal English diphthongs, they could simply invent a new way of representing those sounds with your mouth.
When you executed a certain motion, the voice machine would know to insert a diphthong. With the slightest amount of feedback and practice, people would learn quick.
I dunno. Just a thought. People learn dialects fast, but usually only if they're practicing all day long... Maybe that could never work.
Not a big problem... (Score:3, Informative)
Japanese has very few diphthongs.
A word that might be spelled 'Ao' using Latin characters would be pronounced as 'Ah-ow' (sort of).
Some words do change the vowels, but usually just by extending it. The word Tokyo isn't pronounced 'toe-key-o' as much as it is 'to-u-key-o-u'. The audible differences can be very slight, though. Possibly by sensing the muscle movements, it would be easier to discern the differences.
Another interesting capability would be the ability to discern mood. Consider the following:
'Yes dear, I'd <rolls_eyes>love</rolls_eyes> to have your mother visit this weekend...'
I'm not sure that I'd want my phone telling my girlfriend when I'm being sarcastic. You could have a new group of 'tags' kind of like those you see on IRC:
roll_eyes
clench_jaw
check_watch
sneer
cringe
shake_head_in_disbelief_at_the_stupidity_of_what_
You get the idea...
Cheers,
Jim in Tokyo
Anderson (Score:5, Funny)
Ship the Enron documents to the Feds
But she heard:
Rip the Enron documents to shreds
It turns out that this was all just a case of bad cellular...
Re:Anderson (Score:3, Insightful)
Mr. Smith is now off his head.
Spot the difference.
Re:Anderson (Score:2)
Mr. Anderson, what good is a phone call, if you're unable to speak?
Re:"Funny" is Slashdot gold... (Score:2)
Actually, I didn't have the email handy so I had to do it from memory but it should be close. I'm not seeking "gold" as I have been at the karma cap for some time now. I just thought that this was appropriate for this topic. You have to realize that not everyone gets the same email as you do.
Next time, be quicker and post it yourself.
Laugh...
finally! (Score:5, Funny)
Muttering under your breath (Score:2, Funny)
The mute and deaf (Score:5, Insightful)
Re:The mute and deaf (Score:2, Interesting)
I just want a jewel in my ear that will let me communicate through subvocalizing to an all-knowing computer network/alien being (a la the Ender's Game universe)!
Re:The mute and deaf (Score:3, Interesting)
Olive Juice (Score:3, Funny)
I know what you mean (Score:2)
Re:Olive Juice (Score:2)
Or worse, "vacuum".
Try it in front of a mirror...
-Russ
Re:Olive Juice (Score:2)
injured vocal chords (Score:4, Insightful)
I'd also have to say this should be made mandatory for all people that would otherwise force me to listen to their loud cell phone conversations.
Re:injured vocal chords (Score:3, Funny)
For some reason, this sentence conjured up a picture in my mind of Stephen Hawking sounding a bit like a Furby on the phone.
What next... (Score:2, Funny)
Voice recognition (Score:2, Interesting)
Speech recognition software! (Score:3, Interesting)
Alternatively, you could even have a microphone attached so that when you actually did speak, it would automatically disable the recognition - no more accidentally transcribing your half of a phone conversation for example. Wait a minute, I have to patent that idea!
Sign language (Score:2)
What kind of a sound would it make if I held my middle finger up to it?
I mean really, if the static is so bad that you can't get a good enough signal to hear the person, how is the "face recognition" signal going to get transmitted?
I doubt this will ever work... (Score:4, Interesting)
Other than that, it sounds like an interesting technology.
Re:I doubt this will ever work... (Score:4, Interesting)
When I once asked a linguist friend about this on an unrelated topic, he leaned over the table and put his thumb and index finger on the outer corners of my lower lip, and then pinched them together to immobilize it. "Speak," he said. It was weird but I sounded near normal in less than three words.
We adapt.
Re:I doubt this will ever work... (Score:2)
Wow - your friend is a pretty cunning linguist.
What about the ventriloquists? (Score:2)
William Gibson? (Score:2)
This would be/will be great. Now we just need cellphones that come in vibrate-only mode and I can finally have a peaceful meal in a restaurant without some moron ten tables away disturbing the whole restaurant with an incoming call (and the subsequent conversation).
Question: if this can eventually recognize what sounds the person is meaning to make with 100% accuracy, does that mean that voice recognition has arrived? Instead of spitting out an audio signal, it could output text instead. THAT would be AWESOME.
I'm so excited. :)
Orson Scott Card (Score:2, Informative)
Re:Orson Scott Card (Score:2)
Wonder if there is any connection between "Red Dwarf"'s computer persona and Jane.
Re:William Gibson? (Score:3, Informative)
This was in the Ender's Game series. This is how Ender communicated with Jane.
-prator
Re:William Gibson? (Score:2, Informative)
I believe it is Mona Lisa Overdrive in which the Japanese girl has the virtual assistant that she communicates with subvocally.
Re:William Gibson? (Score:2)
It's in Burning Chrome, which also contains a story named 'The Belonging Kind', which was co-authored with John Shirley.
Re:William Gibson? (Score:2)
The AI was Colin, a Maas-Neotek biosoft unit that acts as a guide to England, given to Kumiko by her father.
U2 (Score:2, Funny)
Re:U2 (Score:2)
You know, I took the poison from the poison stream....
Feelings in Haiku form... (Score:5, Funny)
I twitch, about to sneeze hard.
Phone thinks I said "F*CK."
The Styx Phone (Score:2, Funny)
"Domo Arigato, Mister Roboto"
tongue in cheek? (Score:2, Insightful)
I wonder how they will work out the consonant issues. The way an S is produced is pretty similar to a Z. At least they are pretty similar in my mouth anyway.
I suspect everyone produces consonants in a slightly different manner. I mean, when you are learning to speak, you don't stick your hand in someone else's mouth to figure out what their tongue is doing... You just maneuver your own until you make a similar sound.
So there are probably several different tongue configurations that work to produce a sound. Not to mention the shape of one's mouth may require a specific and unique tongue configuration to produce a particular sound as compared to someone else.
Sounds (hehe) like they have their work cut out for them in this area.
--Scott
one-way solution (Score:2)
Still, this is just a one-way solution. You will be able to hear the person talking in the crowd, but how will the person on the other end be able to hear anything? Will the phone be able to display the message in the form of text or something similar? Or will it just make funny faces at you?
for the 20 and under crowd... (Score:3, Funny)
anywho, i read (and probably own) the whole series in probably 4th grade, i'm 18 1/2 now. on one of their missions, they had special devices like this; except it attached to your throat muscles, which is probably a whole lot easier and less conspicuous. the funny part was that they had to whisper, otherwise they'd "yell" right into the other people's earsets. good to know this stuff is coming to fruition
my teacher is an alien on amazon.com [amazon.com]
the interesting thing about the series, is that it explains in amazingly simple terminology, using a large noodle, how hyperspace works. i'd explain more, but i don't want to get modded offtopic TOO much. and i have to go to work.
It's coming... (Score:5, Funny)
Dave
-Ev
My first thought (Score:4, Interesting)
I had a great-aunt who lost a decent portion of her lungs to cancer and cigarettes, and up until her death a few years ago she had to use one of those Darth Vader vibration-amplifier things like the "Ned" character does on South Park. I was terrified of her when i was six.. (Give me a break, i was six years old and stupid.)
Anyway, i can imagine that technology like this would be just about perfect for people disabled in a similar manner through tobacco or who knows what. No? At least it would keep such people from having to deal with their idiot six-year-old nephews' reactions to the harsh sounds of the vibration amplifier box..
and really, even beyond that, tech like this would be just about the only option for people who are going through whatever that intensive vocal-node-therapy thing is where you're banned from speaking for six months. and i know a number of theatrical singers who would be intensely happy to have one of these so that they could rest their voices between performances without cutting themselves off from the world...
I hope that once this is complete, they'll sell a unit where the voice-synth thing outputs into speakers rather than a phone.. I'm sure they would have looked into this possibility by now, right?
(P.S.: While we're on the subject, sort of.. just in case anyone reading knows: This came up as an argument the other night when we were watching the Oscars and examining how much pain Enya appeared to be in from having to exert her voice. What's the difference between a vocal node and a vocal nodule?)
Re:My first thought:Re:My Second thought (Score:2)
MCC would you like to play a game?
Now that would scare me as an adult even!
Re:My first thought (Score:2)
This news must be especially frustrating... (Score:4, Funny)
Re:This news must be especially frustrating... (Score:2)
You do have to feel for folks who suffer from Tourette's and Parkinson's. They won't be eligible to throw some money away on THIS shiny toy.
Hello Darkness my old friend... (Score:3, Interesting)
Was anybody else immediately reminded of the old Simon and Garfunkel tune, sounds of silence [google.com], in particular the line about "people talking without speaking"? (The link is a poor transcription.)
Re:Hello Darkness my old friend... (Score:2)
People hearing without listening
People writing songs that voices never share
And no one dared
Disturb the sound of silence.
If you were in high school or college while Nixon was President, you pretty much had to memorize that song!
privacy issues (Score:2)
Great, but.. (Score:2, Funny)
RonB
should be mandatory in restaurants (Score:2)
This technology, assuming it works, might initially fail to gain popularity if it's not priced right. I doubt many people would pay, say, $100 extra for a phone with this feature. And that's because many people simply don't care if they're irritating the people around them.
But I'd love to see such places as restaurants and bus lines require customers who insist on using cellphones on their premises to use this product. I bet the bulk of customers would support such a rule, and everyone would benefit.
Re:should be mandatory in restaurants (Score:2)
Gimme a break. People talk. Sometimes they talk too loudly. Sometimes they're on the phone, but believe me, SOME PEOPLE JUST TALK TOO LOUDLY, all the damned time, whether or not they have phones. I'll bet Cicero complained about it.
Re:why do I get the feeling... (Score:2)
On the other hand, I've sat 30 feet away from people talking to each other across a small table and heard every word, either because they're elderly and deaf or young and just loud -- they're often the same people who yell down the length of the bus to their friends ("YO, my man Duane!" "Yo, yo, yo, Tyrone!").
Nothing this side of the Cone of Silence will even dent these people. They were loud before telephones were invented, and they're not getting any quieter.
Keep track for a day of all the conversations you hear in public places. How many of them are "cell-yell", and how many are just plain loud people? (For that matter, how many of the ones yelling on the phone hang up and yell at their companions?)
Don't discriminate against cell phone users. (Score:2)
I never did find out what sparked that. If I was talking too loud, for example, she could have just touched her mouth in the 'sssh' symbol and been polite about it. I don't think I was talking that loud. Nobody else even looked up at me. I think she just had a conception that people with cell phones are rude. Well my response to that is 'IT'S NONE OF YOUR FUCKING BUSINESS.'
There's 0 difference between me talking on the phone or me talking to somebody in person. If it's okay for me to talk with somebody in person, but not on a phone, then there are some serious social issues that will arise down the road. I bet she'd be tickled to death if any of her kids called her out of the blue just to say hi, but I call my dad (who lives 2000 miles away) and I'm a rude jerk. If it's distracting to her to watch a guy talk to somebody that isn't there, then she can watch Quantum Leap until she gets used to it. I certainly am not turning off my phone for the simple reason that it displeases her.
If anybody is going to discriminate against people with cell phones, make damn sure the reason is unique to the cellphone itself. No phone ringing in a theater: Acceptable. No cell phones in a hospital because they interfere with equipment: Acceptable. No talking on a cell phone in a restaurant: Unacceptable.
A better attempt to make cell phones less anoying. (Score:2)
This technology may not stop this from happening, but it would give us a reason to force them to stop. Then the answer "oh, this is an important call" will become complete BS.
Re:A better attempt to make cell phones less anoyi (Score:2)
Inappropriate capitalization of Apparently randomly selected Words.
As long as.. (Score:2)
Would this help my girlfriend? (Score:4, Funny)
(It's just a JOKE! I know I'm not the first to think of it.)
Re:Would this help my girlfriend? (Score:3, Funny)
"Ick, he tastes awful, not pleasant like Jimmy or Bobby or... or... or even Samantha."
Be satisfied with the funny smile, mate.
You know... (Score:3, Funny)
Re:You know... (Score:2)
"Your suggestion to eat hot-dogs tonight sounds rather cheeky."
LLNL's way to do this (Score:2)
mouth movement + sound? (Score:4, Interesting)
Couldn't someone use the movements in addition to the sound to filter out the actual speaker's voice from the background noise? This seems almost like a nonlinear Kalman filter application (though I am by no means an expert on such things), if you had a (presumably nonlinear) model for speech as a function of the movements of the mouth. The article didn't give too much detail. Oh well, it sounds interesting at the very least.
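For anyone curious what that fusion might look like, here is a minimal sketch. It is not anything from the article: it uses a plain linear scalar Kalman filter as a stand-in for the nonlinear one suggested above, and the "mouth model" is a hypothetical source of predicted sample-to-sample changes. The names `kalman_fuse` and `simulate`, and the noise values `q` and `r`, are all made up for illustration.

```python
import random

def kalman_fuse(mic_samples, mouth_deltas, q=0.01, r=1.0):
    """Scalar Kalman filter: predict from the (hypothetical) mouth model,
    then correct with the noisy microphone sample."""
    x, p = 0.0, 1.0                  # state estimate and its variance
    estimates = []
    for z, u in zip(mic_samples, mouth_deltas):
        x = x + u                    # predict: apply the change the mouth model expects
        p = p + q                    # prediction adds model uncertainty
        k = p / (p + r)              # Kalman gain: how much to trust the microphone
        x = x + k * (z - x)          # update with the measurement residual
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates

def simulate(n=500, seed=0, q=0.01, r=1.0):
    """Synthetic data only: a wandering 'speech' value, the deltas a mouth
    model would predict, and a microphone swamped by background noise."""
    rng = random.Random(seed)
    truth, deltas, mic = [], [], []
    x = 0.0
    for _ in range(n):
        u = rng.uniform(-0.5, 0.5)
        x = x + u + rng.gauss(0.0, q ** 0.5)
        truth.append(x)
        deltas.append(u)
        mic.append(x + rng.gauss(0.0, r ** 0.5))
    return truth, deltas, mic

def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

truth, deltas, mic = simulate()
fused = kalman_fuse(mic, deltas)
print("mic-only error:", mse(mic, truth))
print("fused error:   ", mse(fused, truth))
```

On this toy data the fused estimate tracks the true value much more closely than the raw microphone does, which is the whole appeal. A real system would need an actual model of speech as a function of muscle movement, and that is the hard part.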
Bad idea... (Score:2)
Oh, I forgot. Most geeks aren't married...in fact, most probably have no clue how to perform even the most basic interactions with one (except of course when their mothers call down the basement stairs to them that dinner is ready)!
"Gat" and "Cat" (Score:2)
A system like this would either need to incorporate some amount of voice recognition, or use a vibration sensing mechanism.
IAALS (Score:2)
Yes, the English phonemes 'g' and 'c' are articulated in the same position, both dorso-velar (dorsum of the tongue contacting the velum, the flap of skin behind the palate). They're both also 'stops' (the passage is momentarily completely blocked). But discerning sounds of identical position is actually somewhat less problematic in English than it might be in certain other languages.
You hit upon a really important point when you mentioned 'the air' which accompanies 'c' word-initially in English (called 'aspiration'). Khmer, spoken in Cambodia, distinguishes between aspirated and unaspirated stops (e.g., the first 'k' in 'kook' is an aspirated stop, the second is unaspirated, but English speakers don't distinguish between them). How could this system possibly tell the difference? The only difference between the first 'k' and second 'k' in 'kook', as you point out, is the quick expulsion of air which accompanies the first.
Even more confusingly, the first 'k' in 'keel' is not even articulated in the same position at all as the first 'k' in 'kook'. 'k' in 'keel' is palatal (further forward), where 'k' in 'kook' is velar (further back). But, for some reason, in English, we consider them the same phoneme (the subjective perception of what constitutes a unique sound in a given language. 'Keel' and 'kook' start with the same English phoneme, because we can't tell the difference).
This is just impressing the point that where a phone is articulated is only a tiny piece of the puzzle. Making a system which understands language on the basis of position alone is ludicrous. That's impossible.
As you point out a workable system would have to detect 'voicing' (the vibration of the vocal cords), as voicing, AFAIK, differentiates at least some phonemes in every language on earth.
What about nasalisation (where the nasal passage is opened in pronouncing a vowel)? The only difference between the French words 'main' (hand) and 'mais' (but) is that the first is pronounced with resonance in the nasal cavity. How is this system to divine that one has opened a tiny passage to one's nasal cavity for the duration of the vowel?
Speaking of point of articulation, how about glottals (articulated in the larynx) and pharyngeals (articulated in the pharynx. We have none in English, but they exist in Semitic languages)? Without a camera rammed down the subject's throat, sensing articulation in there is going to be hard.
If we have some way of determining the position of the tongue, vowels will be comparatively easy to distinguish, as they're distinguished by 'rounding' (i.e., of the lips), position of the tongue and nasalisation alone (a caveat: Japanese has a 'voiceless vowel', but it's a total phonetic red-herring, really). And detecting nasalisation still seems a difficulty.
At any rate, the idea of recognising language mechanically would seem to at least necessitate detection of 1) position and character of vibration in the nasal cavity, pharynx and mouth and 2) exact position of the tongue at all times. Anyway, I'll leave the last word on this 'invention' to others:
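To make the argument above concrete, here is a small sketch with a hand-made, heavily simplified feature table (place, manner, voicing, aspiration). The table and the names `PHONES`, `group_by` and `ambiguous` are illustrative assumptions, not a real phonetic inventory. It just counts how many sounds collide when a system can only see some of the features.

```python
from collections import defaultdict

# Simplified features per sound: (place, manner, voiced, aspirated).
# Hand-picked for illustration only.
PHONES = {
    "k (first in 'kook')":  ("velar", "stop", False, True),
    "k (second in 'kook')": ("velar", "stop", False, False),
    "g (as in 'gat')":      ("velar", "stop", True, False),
    "s":                    ("alveolar", "fricative", False, False),
    "z":                    ("alveolar", "fricative", True, False),
    "t":                    ("alveolar", "stop", False, False),
    "d":                    ("alveolar", "stop", True, False),
}

def group_by(phones, keep):
    """Group sounds by only the feature indexes the sensor can observe."""
    groups = defaultdict(list)
    for name, feats in phones.items():
        groups[tuple(feats[i] for i in keep)].append(name)
    return dict(groups)

def ambiguous(groups):
    """Return the groups where more than one sound looks identical."""
    return {key: names for key, names in groups.items() if len(names) > 1}

position_only = ambiguous(group_by(PHONES, keep=(0, 1)))
with_voicing = ambiguous(group_by(PHONES, keep=(0, 1, 2)))
everything = ambiguous(group_by(PHONES, keep=(0, 1, 2, 3)))

print(len(position_only), "ambiguous groups from position and manner alone")
print(len(with_voicing), "once voicing is detected")
print(len(everything), "once aspiration is detected too")
```

With position and manner alone, 'k'/'g', 's'/'z' and 't'/'d' all collapse together. Voicing separates most of them, and only aspiration splits the two 'k' sounds in 'kook'.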
The value of a person's voice (Score:3, Funny)
What are _you_ saying under your breath? (Score:2)
Discotheque? (Score:2)
I don't know what's more scary -- a new cellular electrode attachment or Hemos heating up (literally) the floor under a giant mirrored ball.
Maybe in about 30 years (Score:2)
So we've barely gotten voice recognition down, despite it being a "just a wee bit more" type of promise for so long now, and someone is claiming that they'll read your lips? Fat chance in hell, is all I can say. Unless we cut our language down to about 4 words, there isn't a chance.
Re:Maybe in about 30 years (Score:2)
I was really impressed when it asked for a number (can't remember what it was for, whether a callback # or something) and it played back my number right after I said it. It has come a really long way, but for it to be perfect for the end user it will take a while, but I have faith it will be here sooner than expected in the "perfect" sense.
wow, at long last. (Score:2)
but... (Score:2)
Inflection (Score:2, Interesting)
*I* put the dishes away.
I *put* the dishes away.
I put the *dishes* away.
I put the dishes *away*.
Looks like we will still have that Sprint guy hovering around for a while....
Not new (Score:3, Informative)
Quantum Leap, Tom and Jerry... (Score:3, Interesting)
Tom and Jerry is similar, to a degree. I ran across a cartoon of Tom and Jerry on the web a few days ago and watched it. I noticed something very interesting. The music in the cartoon responded to every little movement that the characters made. You could listen to the music, for example, and tell whether Jerry was tiptoeing or running. That was a very interesting dimension to Tom and Jerry. That is the type of element that would allow you to watch a slideshow of the show with the sound track and still keep track of what's going on.
This article was very interesting because I think it may be the start of making computer interfaces take advantage of audio responses that don't even require words. I've spent a great deal of time assigning different sounds in Windows to different events. For example, I have a very distinctive sound that ICQ makes when I receive a message. I even went as far as to provide different people with different sounds. I noticed something very interesting: when I went to use ICQ on another machine, I ached to hear the sounds again. It was so strange not hearing them!
I hope one day Windows (or whatever OS I use in the future...) puts more effort into providing a sound-enhanced interface. That would truly provide a better multi-tasking experience. It'd be cool if, for example, the sound caused by a window was played through the right or left speaker based on where the window is on the screen. Maybe muffle it if a window is under another one.
Anybody know of any products for Windows that do this today?
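The positional-sound idea is easy to sketch, at least on paper. The snippet below is a guess at how the gain calculation could work, not a description of any existing product: it maps the window's horizontal center to left/right speaker gains using equal-power panning, and halves the volume for a covered window. The function name `pan_gains` and the 0.5 muffle factor are made up for illustration.

```python
import math

def pan_gains(window_x, window_width, screen_width, occluded=False):
    """Return (left_gain, right_gain) for a window at the given position.

    Equal-power panning: the gains follow cos/sin of the pan angle, so
    perceived loudness stays roughly constant as the window moves.
    """
    center = window_x + window_width / 2.0
    pan = min(max(center / screen_width, 0.0), 1.0)   # 0 = far left, 1 = far right
    angle = pan * math.pi / 2.0
    left, right = math.cos(angle), math.sin(angle)
    if occluded:
        # Hypothetical "muffle" for a window hidden under another one.
        left *= 0.5
        right *= 0.5
    return left, right

# A window centered on a 1920-pixel-wide screen plays equally on both sides.
print(pan_gains(860, 200, 1920))
# A window hugging the left edge plays almost entirely on the left speaker.
print(pan_gains(0, 100, 1920))
```

Actually feeding those gains to the sound system depends on the OS audio API, which is left out here.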
Would be useful for people with speech impairment! (Score:2)
My daughter needs a job... (Score:2)
This way all you have to do is mouth words into the phone...not actually speak!
My daughter talks without saying anything. Maybe she could get a job testing these things.
Just watch what you mouth! (Score:3, Interesting)
Oh well... my boss probably needs to know about what I call him behind his back anyways. q:]
MadCow.
Contraindications? (Score:2)
oh good (Score:2)
Sign language (Score:2)
Enders Game (Score:2)
Travis
Reverse of lip-synching (Score:2)
I also just don't see how the claim can be accurate. I can say the "ih" phoneme with my jaw in any position, and I can say "a e i o" without moving my cheeks or jaw at all. What gives?
Human lip readers need *context*, and lots of it. This one I'll believe when I can use the demo myself.
As for losing subtleties of communication, I think the real problem is in synthesis. I work on the opposite side of this problem, generating lip movements from audio (i.e. lip synching). A lot of the subtleties you might think you'd lose are actually there in both signals, the audio and the muscles. For example, you can tell when somebody's smiling over the phone, the change in the shape of the mouth makes the phonemes sound different. Shouting invokes different muscles from normal speech, emphasis might be picked up from the eyebrows, and so on. But even if you detect such things on the face, no voice synthesis engine is capable of rendering the accompanying vocal effects.
Stupid....Except for (Score:2, Insightful)
Papa Bush... (Score:3, Funny)
"Read my lips. No new taxes (today)."
Mr. Anderson (Score:2)
Speech Recognition (Score:2)
Pretty hard to do this in Japanese (Score:2, Insightful)
I don't know Japanese myself, but I'm in the middle of reading The Japanese Language (no link; not carried by Amazon). One of the things that's discussed is how little mouth movement is required in Japanese, in contrast to other languages. So it's somewhat ironic that DoCoMo, a Japanese company, is leading the charge in this field.
Even in non-Japanese languages, guttural sounds like 'g', 'k', and German 'ch' cause very little muscular change--just watch yourself in a mirror some time. The article didn't go into much detail, but it may be infinitely more useful if the sensors paid attention to tongue movements instead of cheek ones.
Old DARPA project? (Score:2)
Of course there could be many applications for the delivery of this type of thing, but one of the applications that the CIA was interested in was subliminal presentation of messages in people's sleep, while the silent transmission of information would obviously be useful to special forces teams that need to communicate without revealing themselves.
Vocalization and Other Features (Score:2)
Although the article talks about getting 100% accuracy in discerning vowel sounds, the Japanese language is pretty simple in its vowels -- a, i, u, e, and o, and that's about it. What about vowel sounds like umlauted vowels that occur in European languages? Heck, what about African languages that incorporate clicks and creaky vowels?
This sounds like promising technology, but the article leaves a lot of questions that need to be answered. I guess five more years of research will help, though.
This won't stop the shouting (Score:2)
It may be a neat bit of technology they've come up with, but people won't stop shouting into their phones until they get feedback.
Phone companies' wet dream (Score:2)
Re:Is Taco trying to be clever? (Score:2)
Discotheque means the same thing in both French and English - a place where records are played and people dance.
I have visions of Hemos... (Score:2)
It's rather a scary vision... sorry for sharing it with you.
Re:That what you kids call it.. (Score:2)
Re:This is bad news because... (Score:5, Funny)
You may not be aware of this or have thought of it this way, but a microwave oven is basically just a big, unmodulated radio station broadcasting in the microwave band instead of the radio band.
Are you a real physics genius, or do you just play one in front of your liberal arts friends ;)
Re:This is bad news because... (Score:2, Informative)
Re:This is bad news because... (Score:2)
Um...
Cell phones always have been subject to a limitation of 600 mW, and more recently subject to specific absorption rate limitations.
If anything, the newer PCS phones that operate at higher frequencies are worse, since you are getting closer and closer to the resonant frequency of your eyeballs.
The only risk that has any real evidence for it is heating risk, so if you feel your brain starting to cook, then you might want to turn the phone off, otherwise, it's harmless.
(Hint, 600mW isn't enough to heat much of anything.)
Anyway, nothing has changed in cell phones to justify your conclusion that they are any less dangerous than they were (they weren't dangerous at all to start with).
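For a sense of scale on the 600 mW figure, here is a back-of-the-envelope calculation. The assumptions are deliberately crude and are mine, not the poster's: 1 kg of water stands in for tissue, every bit of the 600 mW is absorbed, and nothing (blood flow, air) carries heat away. The function name `seconds_to_heat` is just for illustration.

```python
def seconds_to_heat(mass_kg, delta_c, power_w, specific_heat=4186.0):
    """Time to warm a mass by delta_c degrees Celsius at a constant power.

    specific_heat defaults to that of water, in J/(kg*K).
    """
    energy_j = mass_kg * specific_heat * delta_c   # E = m * c * dT
    return energy_j / power_w                      # t = E / P

t = seconds_to_heat(mass_kg=1.0, delta_c=1.0, power_w=0.6)
print(f"{t:.0f} seconds, or about {t / 3600:.1f} hours")  # about 1.9 hours
```

So even under these worst-case assumptions, it takes close to two hours to warm a kilogram of water by a single degree.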