Technology

Talk ... Without Speaking 275

mjm7 writes "Finally, we might be able to get rid of all those annoying people yelling over the static on their cell phones! CNN has an article about a new technology that senses muscle movements in your face and then translates them into sound. This way all you have to do is mouth words into the phone...not actually speak!" Somehow I suspect that we'd lose a lot of the subtleties of communication, but it sure would be nice every time hemos calls me from the discotheque.
This discussion has been archived. No new comments can be posted.

  • Hyperion (Score:3, Insightful)

    by PinkStainlessTail ( 469560 ) on Thursday March 28, 2002 @03:21PM (#3243476) Homepage
    Reminds me of the subvocal systems used in this series (as well as tons of other sci-fi). We're slowly catching up to the imaginary future!

    • Re:Hyperion (Score:3, Interesting)

      by Jouster ( 144775 )
      I would be very surprised if they can successfully and consistently measure the movements that result in diphthongs [dictionary.com], as the muscle movements involved are extremely minimal.

      Diphthongs, by the way, are why interfaces that attempt to "read lips" without the benefit of a phonetic dictionary of some kind (and preferably a context one as well) always fail miserably, to the eternal chagrin of the CIA.

      Jouster
      • Diphthongs are probably the least of the problem. Imagine the difficulty involved in tonal languages.

        But this company says that they are already able to distinguish vowels to a high degree of accuracy. And I don't think they're only reading lips. But the real interesting possibility is that even if they *never* figure out how to perfectly identify normal English diphthongs, they could simply invent a new way of representing those sounds with your mouth.

        When you executed a certain motion, the voice machine would know to insert a diphthong. With the slightest amount of feedback and practice, people would learn quick.

        Iduno. Just a thought. People learn dialects fast, but usually only if they're practicing all day long... Maybe that could never work.
      • Not a big problem... (Score:3, Informative)

        by wirefarm ( 18470 )
        ...since the research is being done in Japan.
        Japanese has very few diphthongs.
        A word that might be spelled 'Ao' using Latin characters would be pronounced as 'Ah-ow' (sort of).
        Some words do change the vowels, but usually just by extending it. The word Tokyo isn't pronounced 'toe-key-o' as much as it is 'to-u-key-o-u'. The audible differences can be very slight, though. Possibly by sensing the muscle movements, it would be easier to discern the differences.
        Another interesting capability would be the ability to discern mood. Consider the following:
        'Yes dear, I'd <rolls_eyes>love</rolls_eyes>to have your mother visit this weekend...'

        I'm not sure that I'd want my phone telling my girlfriend when I'm being sarcastic. You could have a new group of 'tags' kind of like those you see on IRC:
        roll_eyes
        clench_jaw
        check_watch
        sneer
        cringe
        shake_head_in_disbelief_at_the_stupidity_of_what_is_being_said

        You get the idea...
        Cheers,
        Jim in Tokyo
  • Anderson (Score:5, Funny)

    by swordboy ( 472941 ) on Thursday March 28, 2002 @03:22PM (#3243481) Journal
    The Anderson partner called his secretary on his cell phone and said:

    Ship the Enron documents to the Feds

    But she heard:

    Rip the Enron documents to shreds

    It turns out that this was all just a case of bad cellular...

    • Re:Anderson (Score:3, Insightful)

      by Mwongozi ( 176765 )
      Mr. Smith is now office head.

      Mr. Smith is now off his head.

      Spot the difference. :)
    • Mr. Anderson, what good is a phone call, if you're unable to speak?

  • finally! (Score:5, Funny)

    by Joe the Lesser ( 533425 ) on Thursday March 28, 2002 @03:22PM (#3243486) Homepage Journal
    We'll finally be able to understand what the hell mimes are doing! Rejoice!
  • Not good news for those that like to mutter curses to the morons on the other end of the phone.
  • The mute and deaf (Score:5, Insightful)

    by spookysuicide ( 560912 ) on Thursday March 28, 2002 @03:23PM (#3243490) Homepage
    Imagine what a world of difference this would make to the mute or to people who had lost the use of their voice due to throat cancer. It seems weird they didn't mention the applications this would have for people who have lost or have never had the use of their voice.
    • Combine it with an avatar [e4engineering.com] and let the deaf read lips.

      I just want a jewel in my ear that will let me communicate through subvocalizing to an all-knowing computer network/alien being (a la the Ender's Game universe)!
    • Re:The mute and deaf (Score:3, Interesting)

      by iabervon ( 1971 )
      Actually, that's old news. This is very similar to Tadoma, which used to be used by people who were both deaf and blind to communicate; the speaker would speak normally (in English), and the "listener" would feel what the person was saying with fingers on the side of the person's face and in front of the person's mouth.
  • Olive Juice (Score:3, Funny)

    by looseBits ( 556537 ) on Thursday March 28, 2002 @03:23PM (#3243492)
    Words like this may cause some minor misunderstandings.
  • by CmdrTaco (editor) ( 564483 ) on Thursday March 28, 2002 @03:23PM (#3243493)
    I realize people may think of this as a luxury, but there are many people that don't have the ability to speak. From crippling diseases to the negative effects of a lifetime of smoking, some people simply cannot use their vocal cords. I know I'd find this handy next time I'm sick with a sore throat!

    I'd also have to say this should be made mandatory for all people that would otherwise force me to listen to their loud cell phone conversations.

    • "From crippling diseases to the negative effects of a lifetime of smoking, some people simply cannot use their vocal cords."

      For some reason, this sentence conjured up a picture in my mind of Stephen Hawking sounding a bit like a furby on the phone.
  • by BMonger ( 68213 )
    With keyboards we successfully took away people's need to physically write something... with this we won't need people to verbally speak... next it'll be visual impulses shot right into your head so you really don't need your eyes anymore... sheesh...
  • Voice recognition (Score:2, Interesting)

    by Mannerism ( 188292 )
    This might help voice recognition catch on as a means of PC input, too. I'd feel slightly less stupid sitting in my office mouthing words at my computer than I would actually talking to it.
  • by WinPimp2K ( 301497 ) on Thursday March 28, 2002 @03:25PM (#3243519)
    Yeah, this sounds like just the thing for people who want voice dictation, but work in a "noisy" environment.
    Alternatively, you could even have a microphone attached so that when you actually did speak, it would automatically disable the recognition - no more accidentally transcribing your half of a phone conversation for example. Wait a minute, I have to patent that idea! :-)

  • What kind of a sound would it make if I held my middle finger up to it?

    I mean really, if the static is so bad that you can't get a good enough signal to hear the person, how is the "face recognition" signal going to get transmitted?

  • by TheNecromancer ( 179644 ) on Thursday March 28, 2002 @03:26PM (#3243522)
    Think about it, don't most people move the muscles in their mouths slightly differently when they are mouthing words, as opposed to actually speaking them? I would venture that the technology wouldn't be able to discern the subtleties in the way we speak.

    Other than that, it sounds like an interesting technology.
    • by Linux Ate My Dog! ( 224079 ) on Thursday March 28, 2002 @03:48PM (#3243704) Homepage Journal
      If the box gave feedback, people would very quickly compensate to insert subtlety back and modulate the output just like they want to have it. The speech system is amazing that way, as you prove every time you manage to stay completely intelligible when speaking while chewing.

      When I once asked a linguist friend about this on an unrelated topic, he leaned over the table and put his thumb and index finger on the outer corners of my lower lip, and then pinched them together to immobilize it. "Speak," he said. It was weird but I sounded near normal in less than three words.

      We adapt.
      • When I once asked a linguist friend about this on an unrelated topic, he leaned over the table and put his thumb and index finger on the outer corners of my lower lip, and then pinched them together to immobilize it. "Speak," he said. It was weird but I sounded near normal in less than three words.

        Wow - your friend is a pretty cunning linguist.
  • It also seems to me that sounds are not necessarily made due to the movement of the jaw. I'd imagine that non-vowel sounds emanate from the vocal cords and tongue. And, what could this end up looking like? Think Nintendo Power Glove...
  • Wasn't this in a William Gibson novel? A young girl has a computer with an AI companion that only she can see, and she communicates with it by "subvocalizing" instead of actually speaking.

    This would be/will be great. Now we just need cellphones that come in vibrate-only mode and I can finally have a peaceful meal in a restaurant without some moron ten tables away disturbing the whole restaurant with an incoming call (and the subsequent conversation).

    Question: if this can eventually recognize what sounds the person is meaning to make with 100% accuracy, does that mean that voice recognition has arrived? Instead of spitting out an audio signal, it could output text instead. THAT would be AWESOME.

    I'm so excited. :)

    • Orson Scott Card (Score:2, Informative)

      by Cirrius ( 304487 )
      Speaker for the Dead and Xenocide, had Ender and later Miro subvocalizing to Jane, the sentient entity that "lived" in the network of ansibles. It might continue past that, I have only read up to Xenocide.
    • Re:William Gibson? (Score:3, Informative)

      by prator ( 71051 )
      Was that in "The Diamond Age"? I can't remember.

      This was in the Ender's Game series. This is how Ender communicated with Jane.

      -prator
    • Re:William Gibson? (Score:2, Informative)

      by weakethics ( 99716 )
      There is a Gibson story, co-written, I think, with John Shirley where astronauts have surgically implanted "bone phones" that pick up their speech.
      I believe it is Mona Lisa Overdrive in which the Japanese girl has the virtual assistant that she communicates with subvocally.
      • There is a Gibson story, co-written, I think, with John Shirley where astronauts have surgically implanted "bone phones" that pick up their speech.
        Hinterlands, which I think was solely authored by Gibson.

        It's in Burning Chrome, which also contains a story named 'The Belonging Kind', which was co-authored with John Shirley.
    • Wasn't this in a William Gibson novel? A young girl has a computer with an AI companion that only she can see, and she communicates with it by "subvocalizing" instead of actually speaking.
      Yes, it was in Mona Lisa Overdrive.

      The AI was Colin, a Maas-Neotek biosoft unit that acts as a guide to England, given to Kumiko by her father.
  • U2 (Score:2, Funny)

    "and scream without raising your voice."
    • Wait. Don't forget, we've got to learn to cry without weeping, too.

      You know, I took the poison from the poison stream....
  • by MonkeyBot ( 545313 ) on Thursday March 28, 2002 @03:27PM (#3243534)
    Talking on my phone
    I twitch, about to sneeze hard.
    Phone thinks I said "F*CK."
  • by tag ( 22464 )
    I can hear it now...

    "Domo Arigato, Mister Roboto"
  • tongue in cheek? (Score:2, Insightful)

    Neat idea...(I didn't anyway) it looks like all they can detect right now are vowels.

    I wonder how they will work out the consonant issues. The way an S is produced is pretty similar to a Z. At least they are pretty similar in my mouth anyway.

    I suspect everyone produces consonants in a slightly different manner. I mean, when you are learning to speak, you don't stick your hand in someone else's mouth to figure out what their tongue is doing... You just maneuver your own until you make a similar sound.

    So there are probably several different tongue configurations that work to produce a sound. Not to mention the shape of one's mouth may require a specific and unique tongue configuration to produce a particular sound as compared to someone else.

    Sounds (hehe) like they have their work cut out for them in this area.

    --Scott
  • it sure would be nice every time hemos calls me from the discotheque

    Still, this is just a one-way solution. You will be able to hear the person talking in the crowd, but how will the person on the other end be able to hear anything? Will the phone be able to display the message in the form of text or something similar? Or will it just make funny faces at you? :)
  • by Hadlock ( 143607 ) on Thursday March 28, 2002 @03:30PM (#3243556) Homepage Journal
    does anyone remember the "my teacher is an alien!!" series? plot synopsis: 4th grader finds out teacher is an alien (surprise, surprise), teacher/alien sees him seeing him, and keeping galactic security safe, takes him up into the New Jersey (mega-big spaceship), and they cruise about, saving the universe.

    anywho, i read (and probably own) the whole series in probably 4th grade, i'm 18 1/2 now. on one of their missions, they had special devices like this; except it attached to your throat muscles, which is probably a whole lot easier and less conspicuous. the funny part was that they had to whisper, otherwise they'd "yell" right into the other people's earsets. good to know this stuff is coming to fruition

    my teacher is an alien on amazon.com [amazon.com]

    the interesting thing about the series, is that it explains in amazingly simple terminology, using a large noodle, how hyperspace works. i'd explain more, but i don't want to get modded offtopic TOO much. and i have to go to work.
  • by IdahoEv ( 195056 ) on Thursday March 28, 2002 @03:30PM (#3243561) Homepage
    "Rotate the pod please, Hal..."

    Dave ... I could see your lips moving ...

    -Ev

  • My first thought (Score:4, Interesting)

    by mcc ( 14761 ) <amcclure@purdue.edu> on Thursday March 28, 2002 @03:32PM (#3243576) Homepage
    Is that this would be great for people who for one reason or another no longer have voiceboxes.

    I had a great-aunt who lost a decent portion of her lungs to cancer and cigarettes, and up until her death a few years ago she had to use one of those darth-vader vibration-amplifier things like the "Ned" character does on south park. I was terrified of her when i was six.. (Give me a break, i was six years old and stupid.)

    Anyway, i can imagine that technology like this would be just about perfect for people disabled in a similar manner through tobacco, cigarettes or who knows what. No? At least it would keep such people from having to deal with their idiot six-year-old-nephews reactions to the harsh sounds of the vibration amplifier box..

    and really, even beyond that, tech like this would be just about the only option for people who are going through whatever that intensive vocal-node-therapy thing is where you're banned from speaking for six months. and i know a number of theatrical singers who would be intensely happy to have one of these so that they could rest their voices between performances without cutting themselves off from the world...

    I hope that once this is complete, they'll sell a unit where the voice-synth thing outputs into speakers rather than a phone.. I'm sure they would have looked into this possibility by now, right?

    (P.S.: While we're on the subject, sort of.. just in case anyone reading knows: This came up as an argument the other night when we were watching the Oscars and examining how much pain Enya appeared to be in from having to exert her voice. What's the difference between a vocal node and a vocal nodule?)
  • by JordanH ( 75307 ) on Thursday March 28, 2002 @03:34PM (#3243589) Homepage Journal
    to those with Tourette Syndrome.
  • by istartedi ( 132515 ) on Thursday March 28, 2002 @03:35PM (#3243607) Journal

    Was anybody else immediately reminded of the old Simon and Garfunkel tune, sounds of silence [google.com], in particular the line about "people talking without speaking" (the link is a poor transcription)?

    • People talking without speaking
      People hearing without listening
      People writing songs that voices never shared
      No one dared
      Disturb the Sounds of Silence.

      If you were in high school or college while Nixon was President, you pretty much had to memorize that song!

  • Will pointing the phone in someone else's direction enable you to eavesdrop on their conversation?
  • How about all those times you get a phone call and you realize you don't want to talk to them and as they drone and drone and drone you mouth to anyone around you "SHUT THE F-CK UP!!!" Now they will hear that.

    RonB
  • This technology, assuming it works, might initially fail to gain popularity if it's not priced right. I doubt many people would pay, say, $100 extra for a phone with this feature. And that's because many people simply don't care if they're irritating the people around them.

    But I'd love to see such places as restaurants and bus lines require their customers, who insist on using cellphones on their premises, to use this product. I bet the bulk of customers would support such a rule, and everyone would benefit.

    • You're absolutely right! Restaurants should be places of complete, funereal silence, broken only by the occasional soft clang of a fork on a plate. Better yet, we'll ban forks too, and allow only soup. The waiters could mime the daily specials, and customers would point to items on the menu...

      Gimme a break. People talk. Sometimes they talk too loudly. Sometimes they're on the phone, but believe me, SOME PEOPLE JUST TALK TOO LOUDLY, all the damned time, whether or not they have phones. I'll bet Cicero complained about it.

    • Why? What's the big deal? I can understand if somebody talks too loud, but that's true of anybody, not just a phone user. I was at a McDonald's once grabbing a bite, and I called my dad and talked to him for a bit. The woman in front of me got irritated and muttered that I should get off the phone.

      I never did find out what sparked that. If I was talking too loud, for example, she could have just touched her mouth in the 'sssh' symbol and been polite about it. I don't think I was talking that loud. Nobody else even looked up at me. I think she just had a conception that people with cell phones are rude. Well my response to that is 'ITS NONE OF YOUR FUCKING BUSINESS.'

      There's 0 difference between me talking on the phone or me talking to somebody in person. If it's okay for me to talk with somebody in person, but not on a phone, then there are some serious social issues that will arise down the road. I bet she'd be tickled to death if any of her kids called her out of the blue just to say hi, but I call my dad (who lives 2000 miles away) and I'm a rude jerk. If it's distracting to her to watch a guy talk to somebody that isn't there, then she can watch Quantum Leap until she gets used to it. I certainly am not turning off my phone for the simple reason that it displeases her.

      If anybody is going to discriminate against people with cell phones, make damn sure the reason is unique to the cellphone itself. No phone ringing in a theater: Acceptable. No cell phones in a hospital because they interfere with equipment: Acceptable. No talking on a cell phone in a restaurant: Unacceptable.

  • Just as there is now a vibrate feature so the phone can ring without annoying everyone in a room, now there is another feature to try to stop those jerks who answer their telephone and talk into it in an inappropriate situation. When this technology is released I think we should have the legal right to smack anyone whose cell phone goes off in an inappropriate situation (because they should have it on vibrate) and then kick the person who starts talking into the cell phone where we can hear them. I don't know about anyone else, but what is more annoying than having a cell phone go off in an inappropriate place is when they start talking loudly without leaving the room.
    This technology may not stop this from happening, but it would give us a reason to force them to stop. The answer "oh, this is an important call" will become complete BS.
  • As long as the cell phone makes real noise, rather than inserting a probe into your ear canal and manually manipulates your eardrum so that you hear the conversation without sound...
  • by Mononoke ( 88668 ) on Thursday March 28, 2002 @03:40PM (#3243648) Homepage Journal
    Could she tell me her day's troubles while kneeled before me with her mouth full?

    (It's just a JOKE! I know I'm not the first to think of it.)

  • You know... (Score:3, Funny)

    by quantaman ( 517394 ) on Thursday March 28, 2002 @03:40PM (#3243649)
    Usually when I mouth a word into my phone it means I DON'T want the other person to hear it. I'm not sure what the learning curve would be on a device like this, but chances are that until a person hits it they are going to have a lot of explaining to do!!
  • LLNL has been researching micropower impulse radar [llnl.gov] to 'image' the vocal cords, mainly for speech recognition. The main site [llnl.gov] seems down, but you can get to it with google cache [google.com]. Also check out ucdavis [ucdavis.edu]
  • by jat2 ( 557619 ) on Thursday March 28, 2002 @03:42PM (#3243663)
    The article seemed to imply that the technology would only use mouth movements, thus allowing the phone to ignore all sound, a lot of which is noise. Of course, as CmdrTaco points out, this could lead to a loss in some of the subtleties of communication.

    Couldn't someone use the movements in addition to the sound to filter out the actual speaker's voice from the background noise? This seems almost like a nonlinear Kalman filter application (though I am by no means an expert on such things), if you had a (presumably nonlinear) model for speech as a function of the movements of the mouth. The article didn't give too much detail. Oh well, it sounds interesting in the very least.
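
A rough sketch of the fusion idea above, for illustration only. It uses a plain linear, scalar Kalman filter rather than the nonlinear one the comment speculates about, and every name and parameter here (`kalman_fuse`, `audio_env`, `mouth_activity`, the gains `a` and `b`, the noise variances `q` and `r`) is a made-up assumption, not anything described in the article. The mouth-movement signal acts as a control input that predicts the speech envelope, and the noisy microphone envelope is the measurement that corrects it.

```python
def kalman_fuse(audio_env, mouth_activity, a=0.5, b=0.5, q=0.01, r=1.0):
    """Estimate a speech envelope from noisy audio plus a mouth-movement signal.

    Hypothetical model (an assumption for this sketch):
        x[t] = a * x[t-1] + b * u[t] + process noise (variance q)
        z[t] = x[t] + measurement noise (variance r)
    where u[t] is mouth activity and z[t] is the noisy audio envelope.
    """
    x, p = 0.0, 1.0  # initial state estimate and its variance
    estimates = []
    for z, u in zip(audio_env, mouth_activity):
        # Predict: the mouth movement says how much speech to expect.
        x_pred = a * x + b * u
        p_pred = a * a * p + q
        # Update: blend the prediction with the noisy microphone reading.
        k = p_pred / (p_pred + r)  # Kalman gain, between 0 and 1
        x = x_pred + k * (z - x_pred)
        p = (1.0 - k) * p_pred
        estimates.append(x)
    return estimates
```

With a large `r` (a noisy room), sound that arrives while the mouth is still is mostly discounted, while the same sound arriving with mouth movement is kept.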

  • The last thing we need is technology that allows our wives to be able to figure out what we are muttering at them under our breath...

    Oh, I forgot. Most geeks aren't married...in fact, most probably have no clue how to perform even the most basic interactions with one (except of course when their mothers call down the basement stairs to them that dinner is ready)!
  • tell me how, without vocalization, these could be distinguished? They have the exact same mouth position, with the only difference being "C" comes from air only, and "G" is made with air and your voicebox. [ regardless of your desire to use one on the other =) ]

    A system like this, would either need to incorporate some amount of voice recognition, or use a vibration sensing mechanism.

    • (I Am A Linguistics Student)

      Yes, the English phonemes 'g' and 'c' are articulated in the same position, both dorso-velar (dorsum of the tongue contacting the velum, the flap of skin behind the palate). They're both also 'stops' (the passage is momentarily completely blocked). But discerning sounds of identical position is actually somewhat less problematic in English than it might be in certain other languages. You hit upon a really important point when you mentioned 'the air' which accompanies 'c' word-initially in English (called 'aspiration'). Khmer, spoken in Cambodia, distinguishes between aspirated and unaspirated stops (e.g., the first 'k' in 'kook' is an aspirated stop, the second is unaspirated, but English speakers don't distinguish between them). How could this system possibly tell the difference? The only difference between the first 'k' and second 'k' in 'kook', as you point out, is the quick expulsion of air which accompanies the first. Even more confusingly, the first 'k' in 'keel' is not even articulated in the same position at all as the first 'k' in 'kook'. 'k' in 'keel' is palatal (further forward), where 'k' in 'kook' is velar (further back). But, for some reason, in English, we consider them the same phoneme (the subjective perception of what constitutes a unique sound in a given language. 'Keel' and 'kook' start with the same English phoneme, because we can't tell the difference). This is just impressing the point that where a phone is articulated is only a tiny piece of the puzzle. Making a system which understands language on the basis of position alone is ludicrous. That's impossible.

      As you point out a workable system would have to detect 'voicing' (the vibration of the vocal cords), as voicing, AFAIK, differentiates at least some phonemes in every language on earth.

      What about nasalisation (where the nasal passage is opened in pronouncing a vowel)? The only difference between the French words 'main' (hand) and 'mais' (but) is that the first is pronounced with resonance in the nasal cavity. How is this system to divine that one has opened a tiny passage to one's nasal cavity for the duration of the vowel?

      Speaking of point of articulation, how about glottals (articulated in the larynx) and pharyngeals (articulated in the pharynx. We have none in English, but they exist in Semitic languages)? Without a camera rammed down the subject's throat, sensing articulation in there is going to be hard.

      If we have some way of determining the position of the tongue, vowels will be comparatively easy to distinguish, as they're distinguished by 'rounding' (i.e., of the lips), position of the tongue and nasalisation alone (a caveat: Japanese has a 'voiceless vowel', but it's a total phonetic red-herring, really). And detecting nasalisation still seems a difficulty.

      At any rate, the idea of recognising language mechanically would seem to at least necessitate detection of 1) position and character of vibration in the nasal cavity, pharynx and mouth and 2) exact position of the tongue at all times. At any rate, I'll leave the last word on this 'invention' to others:

      Dr. Scott: This sonic transducer... it is I suppose, some kind of audio-vibratory physio-molecular transport device?

      Brad: You mean...

      Dr. Scott: Yes, Brad, it's something we ourselves have been working on for quite some time. But it seems our friend here has found a means of perfecting it..."
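
The voicing point made in this sub-thread can be shown with a toy feature table. This is only a sketch: the inventory is a small hand-picked subset of English consonants, the feature labels are the usual textbook ones, and the function name `ambiguous_groups` is invented for the example. It groups phonemes by place and manner of articulation alone (roughly what a position-only sensor might see) and lists the groups that become indistinguishable once voicing is ignored.

```python
from collections import defaultdict

# phoneme -> (place of articulation, manner, voiced?)
PHONEMES = {
    "p": ("bilabial", "stop", False),
    "b": ("bilabial", "stop", True),
    "t": ("alveolar", "stop", False),
    "d": ("alveolar", "stop", True),
    "k": ("velar", "stop", False),
    "g": ("velar", "stop", True),
    "f": ("labiodental", "fricative", False),
    "v": ("labiodental", "fricative", True),
    "s": ("alveolar", "fricative", False),
    "z": ("alveolar", "fricative", True),
    "m": ("bilabial", "nasal", True),
}

def ambiguous_groups(phonemes):
    """Return groups of phonemes that share place and manner, ignoring voicing."""
    groups = defaultdict(list)
    for symbol, (place, manner, _voiced) in phonemes.items():
        groups[(place, manner)].append(symbol)
    # Keep only groups with more than one member: these are the collisions.
    return sorted(sorted(group) for group in groups.values() if len(group) > 1)

print(ambiguous_groups(PHONEMES))
```

Every voiced/voiceless pair in the table collapses into one group, including the C/G pair the parent comment asks about, which is why some form of voicing or vibration sensing seems necessary.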
  • by guttentag ( 313541 ) on Thursday March 28, 2002 @03:51PM (#3243727) Journal
    ...with the help of a voice synthesizer, mobile-phone users can communicate in silence...
    Synthetic Voice: hi ... rob ... it's linus ... no ... really ... take a look ... at ... the next version ... of the lye nucks kernel ... at ... h ... t... t... p... colon slash slash ... goat ...
  • Am I the only one who says nice things aloud into the phone while muttering "fscking azzhool" under my breath? How refreshingly honest our communications will become!
  • ...but it sure would be nice every time hemos calls me from the discotheque.

    I don't know what's more scary -- a new cellular electrode attachment or Hemos heating up (literally) the floor under a giant mirrored ball.

  • Seriously, though, the promised "killer application" for over a decade now has been voice recognition, and we're STILL at a point where the inaccuracy rate leads to it being generally useless in anything other than "ooh, isn't that neat" kinds of demos (for instance it was a laugh to see voice recognition as a hyped feature of Office XP : Now tell me how many people on the planet are actually using it? While I applaud them for adding it for the handicapped, of the general public it seems neat, but when you have to babysit every word it dictates you relegate it to the unused feature list).

    So we've barely gotten voice recognition down, despite being "just a wee bit more" type of promise for so long now, and someone is claiming that they'll read your lips? Fat chance in hell, is all I can say. Unless we concatenate our language to about 4 words, there isn't a chance.
    • There are actually some really good voice recognition systems out there. I was calling an IVR that had a menu like, "Say One for X, Two for Y" and it recognized it quite well.

      I was really impressed when it asked for a number (can't remember what it was for, whether a callback # or something) and it played back my number right after I said it. It has come a really long way, but for it to be perfect for the end user it will take a while, but I have faith it will be here sooner than expected in the "perfect" sense.
  • cell phones have caught up to deaf people.
  • by Polo ( 30659 )
    Wouldn't everybody start sounding like the vietnam vet on south park??
  • Inflection (Score:2, Interesting)

    So, I wonder how the system works with inflection and stressed syllables. Would be a disaster for those domestic husband/wife disputes (not to mention Japanese which is almost *entirely* inflective):

    *I* put the dishes away.
    I *put* the dishes away.
    I put the *dishes* away.
    I put the dishes *away*.

    Looks like we will still have that Sprint guy hovering around for a while....
  • Not new (Score:3, Informative)

    by doublem ( 118724 ) on Thursday March 28, 2002 @04:16PM (#3243901) Homepage Journal
    Not new, old technology. They even have a guy who uses a handheld one on South Park!
  • by NanoGator ( 522640 ) on Thursday March 28, 2002 @04:20PM (#3243915) Homepage Journal
    Anybody ever pay attention to the sounds that the handlink makes on Quantum Leap? For example, it kind of goes 'waaaaahhhh' when he smacks it. That's the most obvious one, but if you listen a little more carefully, the sounds that little device makes start to emote. You can get an idea what he's reading on the screen before he actually states it.

    Tom and Jerry is similar, to a degree. I ran across a cartoon of Tom and Jerry on the web a few days ago and watched it. I noticed something very interesting. The music in the cartoon responded to every little movement that the characters made. You could listen to the music, for example, and tell if Jerry was tiptoeing or running. That was a very interesting dimension to Tom and Jerry. That is the type of element that would allow you to watch a slideshow of the show with the sound track and still keep track of what's going on.

    This article was very interesting because I think it may be the start of making computer interfaces take advantage of audio responses that don't even require words. I've spent a great deal of time assigning different sounds in Windows to different events. For example, I have a very distinctive sound that ICQ makes when I receive a message. I even went as far as to provide different people with different sounds. I noticed something very interesting: when I went to use ICQ on another machine, I ached to hear the sounds again. It was so strange not hearing them!

    I hope one day Windows (or whatever OS I use in the future...) puts more effort into providing a sound-enhanced interface. That would truly provide a better multi-tasking experience. It'd be cool if, for example, the sound a window makes was played through the right or left speaker based on where the window is on the screen. Maybe muffle it if the window is under another one.

    Anybody know of any products for Windows that do this today?
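The speaker idea above can be sketched in a few lines. This is only an illustration, not any real Windows API: the function names `pan_gains` and `muffle`, the constant-power pan law, and the flat 0.4 attenuation standing in for "muffling" are all assumptions made for the example.

```python
import math
from typing import Tuple


def pan_gains(window_center_x: float, screen_width: float) -> Tuple[float, float]:
    """Return (left_gain, right_gain) for a window centered at window_center_x.

    Hypothetical helper: maps horizontal screen position to stereo gains.
    """
    # Clamp to [0, 1] so a window dragged off-screen pans fully to one side.
    position = min(max(window_center_x / screen_width, 0.0), 1.0)
    # Constant-power pan: the angle sweeps from 0 (hard left) to pi/2 (hard right),
    # so left^2 + right^2 stays at 1 and perceived loudness stays roughly even.
    angle = position * math.pi / 2
    return math.cos(angle), math.sin(angle)


def muffle(gain: float, is_covered: bool, attenuation: float = 0.4) -> float:
    """Crude 'muffling': scale the gain down when the window is covered.

    Assumption: a real implementation would more likely low-pass filter the sound.
    """
    return gain * attenuation if is_covered else gain


# A window centered on a 1920-pixel-wide screen plays equally in both speakers.
left, right = pan_gains(960, 1920)
```

A window at the far left would get gains of roughly (1, 0), and one at the far right roughly (0, 1).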

  • I have a minor speech impairment (not very clear) so it would probably be useful to me. :)

  • This way all you have to do is mouth words into the phone...not actually speak!



    My daughter talks without saying anything. Maybe she could get a job testing these things.

  • by MadCow42 ( 243108 ) on Thursday March 28, 2002 @04:28PM (#3244002) Homepage
    Now I can't mouth obscenities about the person I'm talking with without them hearing!!! You also can't hold a "quiet" conversation with the person beside you while "politely" listening to the person on the phone...

    Oh well... my boss probably needs to know about what I call him behind his back anyways. q:]

    MadCow.
  • I figure Bell's Palsy, any facial tic involving the mouth/cheeks, and perhaps beards?

  • now i can say 'olive juice' to my wife and still earn major points. OTOH, everyone in Italy is gonna think that the entire US population is gay!
  • Another way to help speechless persons to communicate is the recognition and translation of sign language. If you're interested in that you might want to look here [rwth-aachen.de].
  • Are they going to call it "subvocalizing" like in Ender's Game?

    Travis
  • The "100% vowel detection" claim sets off alarm bells for me. Sure, pure vowels tend to show up on the face, but there are lots of characteristics of speech which occur down in the throat, or back of the tongue...how do they plan to distinguish between sh, ch, and j? S and Z? F and V? For now, they don't.

    I also just don't see how the claim can be accurate. I can say the "ih" phoneme with my jaw in any position, and I can say "a e i o" without moving my cheeks or jaw at all. What gives?

    Human lip readers need *context*, and lots of it. This one I'll believe when I can use the demo myself.

    As for losing subtleties of communication, I think the real problem is in synthesis. I work on the opposite side of this problem, generating lip movements from audio (i.e. lip synching). A lot of the subtleties you might think you'd lose are actually there in both signals, the audio and the muscles. For example, you can tell when somebody's smiling over the phone, because the change in the shape of the mouth makes the phonemes sound different. Shouting uses different muscles from normal speech, emphasis might be picked up from the eyebrows, and so on. But even if you detect such things on the face, no voice synthesis engine is capable of rendering the accompanying vocal effects.
  • by Piper82 ( 556975 )
    To think this will end up as mass-market technology is plain wrong. Can anyone really think that people will stop "speaking" into their phones? And for what, to evade cell phone static? Come on. Usage of this technology will only really take off in niche markets where there's an actual need, like people whose speech is impaired in one way or another. Those are the people who will really benefit from this. The implications there are incredible. Now where's my Crystal Pepsi?
  • by chinton ( 151403 ) <chinton001-slashdot@nospam.gmail.com> on Thursday March 28, 2002 @05:24PM (#3244395) Journal
    "Read my lips.". With this nifty device, we wouldn't have had to read his lips, and we would have heard the (subvocal) last word of his famous quote:

    "Read my lips. No new taxes (today)."

  • How can you make a phone call if you can't even spea-k.
  • Maybe speech recognition w/o speaking? Now all you have to worry about is a repetitive strain injury of your facial muscles. ;)
  • I don't know Japanese myself, but I'm in the middle of reading The Japanese Language (no link; not carried by Amazon). One of the things that's discussed is how little mouth movement is required in Japanese, in contrast to other languages. So it's somewhat ironic that DoCoMo, a Japanese company, is leading the charge in this field.

    Even in non-Japanese languages, guttural sounds like 'g', 'k', and German 'ch' cause very little muscular change--just watch yourself in a mirror some time. The article didn't go into much detail, but it may be infinitely more useful if the sensors paid attention to tongue movements instead of cheek ones.

  • I seem to recall a project like this designed for special forces that was funded by CIA and DARPA (Defense Advanced Research Projects Agency). There was also a "silent sound" device that could transmit acoustic information directly into the skull.

    Of course there could be many applications for this type of thing. One that the CIA was interested in was subliminal presentation of messages in people's sleep, while the silent transmission of information would obviously be useful to special forces teams that need to communicate without revealing themselves.

  • How will this differentiate between voiced and unvoiced consonants? "Pat" and "bat" sound different but the two initial consonants are extremely similar outside of vocalization. Yes, the articulation of the "b" is longer than the "p", but the difference is really minuscule and probably differs from person to person. I wonder if this will take the tack of making the phone "learn" how to discern such, or will it make the person learn how to "speak" in a way that the phone "understands" (kind of like handwriting recognition versus using Graffiti)...

    Although the article talks about getting 100% accuracy in discerning vowel sounds, the Japanese language is pretty simple in its vowels -- a, i, u, e, and o, and that's about it. What about vowel sounds like umlauted vowels that occur in European languages? Heck, what about African languages that incorporate clicks and creaky vowels?

    This sounds like promising technology, but the article leaves a lot of questions that need to be answered. I guess five more years of research will help, though.
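The "pat" versus "bat" problem can be made concrete with a toy lookup. This is a rough sketch of the lip-reading case only: the grouping table below is a simplified, hypothetical set of look-alike consonant classes, not the method the article describes, and a sensor reading muscle activity might pick up more than the lips show.

```python
# Hypothetical, simplified classes of consonants that look alike on the lips.
LOOK_ALIKE_CLASSES = {
    "bilabial": ["p", "b", "m"],
    "labiodental": ["f", "v"],
    "alveolar_fricative": ["s", "z"],
    "postalveolar": ["sh", "ch", "j"],
}

# Invert the table: phoneme -> name of the class it belongs to.
PHONEME_TO_CLASS = {
    phoneme: name
    for name, phonemes in LOOK_ALIKE_CLASSES.items()
    for phoneme in phonemes
}


def visible_form(phonemes):
    """Replace each phoneme with its look-alike class; others pass through unchanged."""
    return tuple(PHONEME_TO_CLASS.get(p, p) for p in phonemes)


def look_alike(word_a, word_b):
    """True if two phoneme sequences are indistinguishable under this toy table."""
    return visible_form(word_a) == visible_form(word_b)


pat = ["p", "a", "t"]
bat = ["b", "a", "t"]
cat = ["k", "a", "t"]

print(look_alike(pat, bat))  # the two words collapse to the same visible form
print(look_alike(pat, cat))  # these two do not
```

Under this table "pat", "bat", and "mat" all collapse to the same sequence, which is why a recognizer would need context or training to choose between them.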
  • The reason people shout into cell phones isn't that the phones don't pick up sound well enough. They do. It's that people don't *THINK* they pick up sound well enough because the phones don't give you any feedback in your own ear. Normal phones do give feedback and people are used to that. When you hear no feedback, you think "hey this phone must not be picking me up very well".

    It may be a neat bit of technology they've come up with, but people won't stop shouting into their phones until they get feedback.
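The feedback described here is usually called sidetone: a quiet copy of your own voice mixed into the earpiece. A minimal sketch of the idea, assuming audio is a plain list of float samples and using a made-up `sidetone_gain` default of 0.2 (not a value taken from any phone or standard):

```python
def earpiece_signal(incoming, microphone, sidetone_gain=0.2):
    """Mix an attenuated copy of the talker's own voice into the earpiece.

    incoming:      samples received from the far end of the call
    microphone:    samples picked up from the local talker
    sidetone_gain: hypothetical mixing level; 0.0 means no feedback at all
    """
    return [far + sidetone_gain * near for far, near in zip(incoming, microphone)]


# With the gain at zero the talker hears only the far end and gets no feedback.
silent_feedback = earpiece_signal([0.0, 0.5], [1.0, 1.0], sidetone_gain=0.0)
# With some gain the talker's own voice is audible in the earpiece.
with_feedback = earpiece_signal([0.0, 0.5], [1.0, 1.0], sidetone_gain=0.5)
```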
  • Phones that can understand your mouth movements will probably have to translate those movements into a sequence of sounds corresponding to words in the speaker's language. What I mean is that the phone will have to KNOW what language you are speaking in order to translate your mouth movements into sounds that are meaningful in that language. I made a joke a couple of days ago about phones understanding their owners: I was with my girlfriend and told her that it is amazing what my cell phone has heard from me so far, for example when I talk to her. Anyway, cell phone companies would love to sell these phones, since it will mean more upgrade options. Do you want your phone to speak English? Russian? German? Japanese? Ha! Another $99.99!
