Communications Science

Detecting Speech Without Microphones 221

Posted by CmdrTaco
from the this-kind-of-wierds-me-out dept.
kyle90 writes "New Scientist is reporting on a new way of detecting speech without using microphones, using electrodes placed on the neck that measure muscle activity and nerve impulses. Apparently the user doesn't even need to speak the words out loud in order for them to be detected. This looks like pretty neat technology; if used with cell phones it could give the user a little more privacy, and the rest of us a little more peace and quiet."

  • by iggymanz (596061) on Sunday April 10, 2005 @10:08AM (#12193264)
    then we'd have to look at idiots moving their mouth in exaggerated motions....
  • Huh? (Score:3, Interesting)

    by Javanista (834415) on Sunday April 10, 2005 @10:08AM (#12193266)
    How do you get the same nerve impulses in your neck if your vocal cords are not vibrating?
    • Re:Huh? (Score:5, Interesting)

      by CSMastermind (847625) <freight_train10@hotmail.com> on Sunday April 10, 2005 @10:13AM (#12193301)
      The process of speech involves several steps: you must inhale, make your vocal cords vibrate, exhale through the vibrating vocal cords, and then use your mouth and tongue to shape the air as it's going out to produce a certain sound. Any one of these steps can be done by itself but it won't produce speech. It's the same way you can "mouth" words to a friend who's sitting on the opposite side of a quiet room: you are saying the words...just very quietly so that nobody can hear them.
    • Re:Huh? (Score:5, Informative)

      by InternationalCow (681980) <mauricevansteensel@@@mac...com> on Sunday April 10, 2005 @10:13AM (#12193302) Journal
      It works by virtue of the fact that your motor cortex plans ahead. So, even while you have not yet consciously taken the decision to speak yet, your motor cortex has already set up the appropriate commands and sent them out to the nerves involved. This translates to an increased firing rate in these nerves, which is not enough to move the muscles but will be sufficient to register on sufficiently sensitive equipment. In fact (other discussion entirely, but fascinating nonetheless) most of our "voluntary" decisions appear to be made before we become aware of them. So much for free will :)
      • Re:Huh? (Score:3, Insightful)

        by Javanista (834415)
        Yeah, but nerves send data in both directions. You get feedback from the vocal cords when they're working (as well as all the other tissue around them). It just seems like 'mouthing' words would exclude a lot of data from those neural pathways vs. actually saying them...
      • Re:Huh? (Score:4, Interesting)

        by Transcendent (204992) on Sunday April 10, 2005 @10:51AM (#12193564)
        So, even while you have not yet consciously taken the decision to speak yet, your motor cortex has already set up the appropriate commands and sent them out to the nerves involved. ... In fact (other discussion entirely, but fascinating nonetheless) most of our "voluntary" decisions appear to be made before we become aware of them. So much for free will :)

        That argument against free will is flawed. I've heard it many times, and it's always because of an assumption made on the events leading up to an action. It assumes that you make the decision to speak after your brain starts to set up speech for you... which is ridiculous. We're not aware of them because we're not that in tune with our brain/body... which is how we function efficiently; we don't have to sweat the small stuff (Like keeping our heart beating? Perfectly controlling exactly which muscles to fire in walking?).

        There are many events before you actually speak that involve your decision to speak, such as thinking of (obviously) what to say, how to phrase it, tone of voice... even taking in breath before actually speaking. Even thinking "hmm... should I say this to so and so person" is a decision that would induce a response along the lines of speaking.

        Basically, you've already made the decision (consciously on some level at least) to speak before you do it, but it is possible to stop yourself right before you actually speak.
        • For one, the different input and processing pathways to the brain have a perceptible lag, which is variable from pathway to pathway. However we do not notice this because we time-shift the inputs after the fact.

          It helps a lot if you use transactional terms when discussing the perceptions of time in the brain. The whole process is not quite Serializable, but enough to suffice for day-to-day tasks.
        • You miss the point. Your internal voice is consistently several hundred milliseconds behind the action elsewhere in the brain. You might tell yourself not to say something, but that's only a reflection of a decision made elsewhere in the brain. You don't think by talking to yourself. Your internal voice is epiphenomenal. Hell, some people function just fine without one.
          • Hell, some people function just fine without one.

            How can you verify that?

            Can we take a poll?

            How many people here have voices in their heads? Raise your hands, please. :)
          • "You don't think by talking to yourself."

            Not quite right. When you think in words, you are sending the nerve impulses to your vocal cords, but not creating the sound. When you think in a less conscious manner - i.e. day-to-day decision making, putting the kettle on, as opposed to mulling things over - you don't. But reading to yourself and thinking to yourself are the same if you can hear the words in your head. If this is the case, then you are sending the nerve impulses.

        • One idea is that the phenomenon we call "free will" operates by having a veto power over actions that started a few hundred milliseconds before they were consciously noticed.

          In other words, it's "free won't" rather than "free will".

          That's not a scientific idea unless someone can imagine a way to test it, but it's a persuasive one because it matches our introspection about "stopping ourselves" from doing things. The "free won't" idea also lines up with the observation that many people become more active af
    • What good are electrodes when you can't vibrate your vocal cords?
  • Quick (Score:5, Funny)

    by Dachannien (617929) on Sunday April 10, 2005 @10:08AM (#12193269)
    However, both systems come at a cost. Because the words are produced by a computer, the receiver of the call would hear the speaker talking with an artificial voice. But for some that may be a price worth paying for a little peace and quiet.

    Get one of these for Ashlee Simpson, pronto!

  • by Nate4D (813246) on Sunday April 10, 2005 @10:09AM (#12193275) Homepage Journal
    This sounds almost exactly like the subvocalization technology that Ender uses to communicate with Jane in the later books.

    As those who've read it will remember, silent communication while around others can lead to a whole new set of problems all its own... Especially when it's apparent that you're communicating, but not what you're saying.
    • That's exactly what I was thinking! But wasn't Ender's communication with Jane done through a jewel in his ear? I never really understood how the jewel in his ear could provide two way communication...
    • Isn't limited to Ender's Game. As an interface, it's a sci fi staple that goes back at least to the John Campbell days at Amazing Stories.

      And the real world observance of the phenomenon is quite a bit older. Many people subvocalize while reading -- subconsciously forming each word in their throats, even if the sound never makes it from their mouths.
    • ... silent communication while around others can lead to a whole new set of problems all its own... especially when it's apparent that you're communicating, but not what you're saying.

      Are the problems particular to silent communication, or are they caused by not being understood by those around you? When I speak to my family here in the States, it's apparent we're talking, but nobody has a clue what we're saying. It's the advantage of being from a country [visitmalta.com] of 400,000 people. It hasn't caused any problems

    • Just yesterday, as I was reading a novel featuring subvocalization tech, I thought to myself:

      "Self, you must have read, like, a hundred books that have subvocal mics, etc. But in real life, have you ever heard of subvocal mics?"

      And I was forced to confess to myself that no, I had not. And I continued to myself:

      "So is it impossible in theory?" And I knew that it probably was not. "So, as impossible as it may seem, has everyone simply overlooked development of subvocal mics?" And I conceded that this could
    • As those who've read it will remember, silent communication while around others can lead to a whole new set of problems all its own... Especially when it's apparent that you're communicating, but not what you're saying.

      This is a problem with spoken language too. We had a couple of Filipina workers at my dialysis clinic that were asked to stop speaking Tagalog to each other - it makes people uncomfortable when there is communication going on around them and they can't understand/perceive it. Happens

  • heh (Score:5, Funny)

    by B3ryllium (571199) on Sunday April 10, 2005 @10:09AM (#12193277) Homepage
    .

    I just said something, guess what it was?
  • by Faust7 (314817) on Sunday April 10, 2005 @10:10AM (#12193280) Homepage
    However, both systems come at a cost. Because the words are produced by a computer, the receiver of the call would hear the speaker talking with an artificial voice.

    With all due respect to Stephen Hawking, I'd rather not have my friends/parents/S.O. all sound like him.
    • Well this is just a guess but I'm sure with some sound programming, voice recording of basic sounds (letters and certain combinations of letters), and a little time and love, in the near future we'd be able to make it sound a lot more like you and a lot less like a computer.
    • By "artificial voice" what they really mean is a sample of your voice reading some calibration text, sliced up into phonemes and played back in bits and at the proper speeds based on what your face is doing. It doesn't actually need to know what you're saying, it just knows what your face is doing when you're saying it. It's just convenient to break it up at phonemes because those are the smallest useful repeated elements in speech, which means you will have the smallest useful database of samples to play
    • Not all computer generated speech sounds completely robotic. AT&T has had its Natural Voices Speech Engine demo [att.com] around for some time. I'm surprised that more text-to-speech programs haven't used this. I suppose licensing is pretty 'spensive.

    • Even if this tech is perfected, it won't mean everybody's normal, day-to-day speaking voices will be replaced with it. It's something you'd only hear in certain situations.
    • Reproducing audible speech from text is one thing; text doesn't contain the intonations, pace, or cadence of actual speech, and computers can't even make a good guess about replacing that lost information without knowing the intent of what was said.

      But producing speech directly from subvocalization should work better, because the speaker is supplying all the inputs of normal speech (except the air). Add to that a few parameters for a vocal tract model (from prerecorded samples), and you might get somethi
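A rough sketch of the "sliced-up calibration sample" idea from this sub-thread, in Python. Everything named here (the `db` dictionary of phoneme recordings, `crossfade_concat`, `synthesize`, the `overlap` parameter) is hypothetical and only illustrates generic concatenative playback; nothing in the article says the researchers' system works this way.

```python
from typing import Dict, List


def crossfade_concat(units: List[List[float]], overlap: int) -> List[float]:
    """Join recorded units, linearly crossfading up to `overlap` samples at each seam."""
    out: List[float] = []
    for unit in units:
        # Never overlap more samples than either side actually has.
        n = min(overlap, len(out), len(unit))
        start = len(out) - n
        for i in range(n):
            w = (i + 1) / (n + 1)  # fade-in weight of the incoming unit
            out[start + i] = (1.0 - w) * out[start + i] + w * unit[i]
        out.extend(unit[n:])
    return out


def synthesize(phonemes: List[str], db: Dict[str, List[float]], overlap: int = 2) -> List[float]:
    """Look up each detected phoneme label in the (hypothetical) sample database and join them."""
    return crossfade_concat([db[p] for p in phonemes], overlap)


# Toy database: real entries would be slices of the speaker's calibration recording.
db = {"h": [1.0, 1.0, 1.0, 1.0], "i": [0.0, 0.0, 0.0, 0.0]}
audio = synthesize(["h", "i"], db, overlap=2)
```

The crossfade is there only to avoid clicks at the seams; varying playback speed, as the comment above describes, would need resampling on top of this.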

  • The speech pattern is sent to a computerised voice generator that recreates the speaker's words.

    Would you want to talk to microsoft sam? I can see this being used for speech to text conversions, but will it be possible to recreate tone, emotion? Why would you want to emulate this in a social situation anyway?
    • A lot of people already hate the way they sound on the phone, and could pick a better voice to represent them... dibs on Robin Williams! Alternatively, you could vocode your boss's voice into Frank Welker's or Gilbert Gottfried's.
  • by G4from128k (686170) on Sunday April 10, 2005 @10:12AM (#12193295)
    This is a great idea until you mutter some expletive under your breath while talking to your boss. I can also foresee some embarrassments for those that can't read without moving their lips.
  • Vocal cords (Score:5, Interesting)

    by DaLukester (687299) on Sunday April 10, 2005 @10:12AM (#12193296)
    My first question is this: the vocal cords are resonators; they move because air is moving over them. If the cords aren't making any noise, it's because they aren't moving. If they aren't moving, how does this system pick up their movement? If you have to subvocalise (i.e. mumble quietly to yourself), then how is this different from the throat mike that has been around for ages? Very skimpy article for the New Scientist (all new, no science).
    • Re:Vocal cords (Score:2, Informative)

      by Zenmonkeycat (749580)
      It's not working based on the movement of the vocal cords, it's working based on the electrical impulses sent from the brain to muscles in the throat and mouth. I'm sure that the tension of the vocal cords could be measured, but the cords themselves don't have to be moving.

      Vocal cords themselves are not resonators, they simply excite motion in the air. The throat, mouth, nasal passages and sinuses are the resonators, sort of like the body of a guitar resonates with the sound excited by a string being

    • Re:Vocal cords (Score:5, Interesting)

      by wik (10258) on Sunday April 10, 2005 @12:14PM (#12194152) Homepage Journal
      I'm also suspicious. The distinction between many sounds is the placement or movement of the tongue. For instance, I can whisper and be understandable without moving my vocal cords. They describe this device as something that "detects" speech by observing the vocal cords, not the tongue. How does it work?

      Also, it sounds like the speech is recognized and converted into words in this system (as in Sphinx or commercial voice recognition software?). The accuracy of even the best voice recognition software is still too poor to be used in general applications (and requires a fast P4 to do the recognition in real-time). It'll be a while before any cell phones carry this.
      • "For instance, I can whisper and be understandable without moving my vocal cords."

        No you can't.

        If you don't use your vocal cords at all, you aren't whispering. You're breathing out, which is soundless. Try it!

        In order to make a whisper, you contract the vocal cords slightly and thus create a taut edge which gives the susurration we call... whispering. Try doing a 'heavy breath' and a normal breath and you'll see what I mean.

        Justin.

    • So how about whispering? No vocalization there, and you can understand it pretty well!
    • My first question is this: The vocal cords are resonators, they move because air is moving over them. If the cords aren't making any noise, it's because they aren't moving. If they aren't moving how does this system pick up their movement

      You are not quite correct. Vocal cords create sound by oscillating, yes, however they also tense and relax to control the pitch of the sound they produce. Air is not required to tense or relax these muscles - indeed if you were given appropriate feedback you could eas
  • Also known as (Score:2, Interesting)

    This reminds me of some Anne McCaffrey novels where the main characters communicate via 'sub-vocalisation'. It was a skill that needed to be learned and ended up being a slight movement of the jaws and some light humming when people were talking. If I remember correctly, it also appears in some of Vernor Vinge's novels (namely A Deepness in the Sky).
  • by Anonymous Coward
    I can already see the next challenge: generating speech not based on muscle nerve signals, but directly on brain activity...

    Options for military / police uses seem unlimited. However I wouldn't really want that blonde to know what my nerves are doing about her...

  • I'll wait for something like this [bbc.co.uk] to develop beyond "computer cursor control". With little more tweaking it should be possible to use this thing to, at least, send text messages...
  • Looks like these folks [inquista.com] might be looking for a new line of work.

    But that's progress, innit?

  • How very 1980's. (Score:2, Informative)

    by Conor Turton (639827)
    Jesus...living in the 80's? Military radios were using throat mikes back in the 80's.
    • Re:How very 1980's. (Score:2, Informative)

      by Monx (742514)
      Jesus...living in the 80's? Military radios were using throat mikes back in the 80's.

      RTFTitle: Detecting Speech Without Microphones.

      Get it? There's no microphone.
  • privacy (Score:3, Interesting)

    by icepick72 (834363) on Sunday April 10, 2005 @10:24AM (#12193368)
    However, both systems come at a cost. Because the words are produced by a computer, the receiver of the call would hear the speaker talking with an artificial voice.

    And the cost of implicitly having every single word of your conversation immediately recorded into digital format. Very archivable.

  • "NASA Develops System To Computerize Silent, "Subvocal Speech" "

    http://www.nasa.gov/home/hqnews/2004/mar/HQ_04093_subvocal_speech.html

    Are they using different methods? If they are (no time to RTHA) that would be cool, as it might double the chances of a working system.

  • What would make the technology even cooler is a speech channel segmentation system that directs out-loud speech to one conversation/phone circuit and silent/sub-vocalized speech to another conversation. That way someone could really have two conversations at once without putting people on hold/swapping lines.

    To avoid collisions, the receiver could use a buffer and sound accelerator that alternates the streams from the other side of the conversation. The only challenge would be the latency heard on the
  • Subconscious speech? (Score:4, Interesting)

    by Paul Townend (185536) on Sunday April 10, 2005 @10:33AM (#12193429) Homepage
    Could this have interesting ramifications when used in an interrogation? Would subvocal speech include bursts of what someone was thinking but did not want to say? Or anything from the subconscious?
    • that someone finds a new way of detecting speech, and someone's first reaction (+4!) is "Hey, we could use this to torture people!"

      wtf?
    • by Threni (635302)
      It's nothing to do with the subconscious. It's just reading people's muscles instead of their lips. "Mind readers" (such as Derren Brown), "clairvoyants" and other such con artists use this technique, amongst others.
  • can we stop calling them vocal cords? they resemble nothing like cords. they are vocal folds, and we should think of them that way.
  • by Timesprout (579035) on Sunday April 10, 2005 @10:43AM (#12193494)
    I see a pretty girl, I get a bulge in my pants. Pretty girls sees me, sees bulge, smacks me in the face. Not a word said yet we are all perfectly clear where we stand.
  • This looks like pretty neat technology; if used with cell phones it could give the user a little more privacy, and the rest of us a little more peace and quiet.

    I think history shows that people will use the rudest and most annoying use of a technology whenever possible. In this case, I think they will still use "push to talk", not speak, but have the speakers on as loud as possible to "share" the other end of the conversation.
  • by Anonymous Coward
    We use it usually in places with noise like tanks. The receiver doesn't hear any background noise. Would be great for night clubs :)
  • by darkonc (47285) <stephen_samuel.bcgreen@com> on Sunday April 10, 2005 @10:56AM (#12193606) Homepage Journal
    The other day, I walked by someone who was sitting on a park bench by himself and talking to nothing/nobody in particular. It hit me that, 10 years ago, I would have taken this as a clear sign that the poor sod was completely off of his rocker. These days, however, if you see someone doing that, best bet is that (s)he's got a hands-free cell phone on him and is talking to someone real.

    Now, I'm gonna have to deal with people walking around Mumbling to themselves!

    The next time I walk into an insane asylum^W^W Mental Health Facility, the only way I'm gonna be able to tell the difference between the visitors/staff and the patients is going to be by looking for a badge.

    Actually, now that I mention it...

  • by EnsilZah (575600) <EnsilZah@noSpaM.Gmail.com> on Sunday April 10, 2005 @11:05AM (#12193667)
    I can't wait to have one so i could hook it up to some speakers and talk to people without moving my lips.

    Would probably creep people out... i mean... more than i usually do.. =\
  • if used with cell phones it could give the user a little more privacy, and the rest of us a little more peace and quiet."

    But you'd look like a lunatic walking around moving your mouth but not talking?
    • Re:Technology (Score:3, Insightful)

      by argent (18001)
      But you'd look like a lunatic walking around moving your mouth but not talking?

      People talking on handsfree cells already look like that.
  • I wonder if it would work for people who lost their larynx and who have to use those vibrator things to speak. Just have a speaker with a natural sounding voice and use it that way, to speak. It would look freaky, maybe a way to put a speaker in the mouth would help. Then again if the surgery removed the larynx maybe there's no muscle response to detect.
    • I was going to bring this possibility up too, as it's a much better idea than the "silent cell phone".

      Also it may work for people like Stephen Hawking, and other people who might know what to say but can't speak. It could in theory also be used as a simple universal translator. Each sentence would be run through a computer which could use Babelfish essentially to translate the speech in almost real time. It would be crude, but better than nothing in some situations.
  • Brin (Score:3, Insightful)

    by SWroclawski (95770) <serge@@@wroclawski...org> on Sunday April 10, 2005 @11:27AM (#12193828) Homepage
    This is very similar to David Brin's idea in the book "Earth" with people needing to wear a strap on their chin to measure the electrical impulses for the very same reason.

    In the book he postulates that doing so, the actual movement can be reduced, and in time, you can speak quicker with this method than you can when actually vocalizing.
    • That book is exactly what I thought of as well. I seem to come across half a dozen news stories every year that he already thought of ... anyone who hasn't read the book, should.

      Anyway, in Earth, most people didn't use this technology even though it was available. The reason was control -- it took way too much concentration to control all of your thoughts *before* they activated subvocalizations. At best it was just annoying, like controlling a mouse on too much caffeine. At worst it could get pretty embar
  • Bone-induction Mics (Score:5, Interesting)

    by LordMyren (15499) on Sunday April 10, 2005 @11:36AM (#12193915) Homepage
    aircraft pilots have been using bone-induction mics since WWII; there's no other way to block out the background noise. this is interesting because it reads from the nervous system directly

    are there any good bone-induction mics for cell phone / portable usage? i spent a while looking a couple years back and turned up two things, both of which were ear-mounted. i'd much rather a throat mounted system; i imagine it's much better able to pick up sound.
    • Ear mounted (Score:3, Interesting)

      The ear mounted microphones have the benefit of being two-way devices. You can talk and listen with them. With a bone microphone you still need some sort of headphones to listen in a high noise environment.
    • by don.g (6394)
      Bone induction microphones do *not* read from the nervous system. They pick up vibrations in your bones (typically jaw bone, I think, but I could be wrong). Your ears do the same thing, which is why you sound different to how you normally hear yourself when you record your voice and play it back - you're missing the sound conducted by your bones to your ear.
    • are there any good bone-induction mics for cell phone / portable usage? i spent a while looking a couple years back and turned up two things, both of which were ear-mounted. i'd much rather a throat mounted system; i imagine it's much better able to pick up sound.

      Help me out here: "And the ________ bone's connected to the throat bone."

  • I just can't imagine how a computer-generated voice produced from this technology would be better than the current text-to-speech engines (which aren't 90% as effective as human voice) that have the words in plain English before generating the speech. And it's fairly uncomfortable to listen to those programs, never mind converse with them. So conversing with a program with less accuracy might make some go insane or cause wars due to some misunderstanding.

    At the same time, it's interesting application for peo
  • Phones should at least ship in vibrate mode, with a big sticker attached to the switch showing the normals how to turn the ringer on (and off again!). A really good tech upgrade would let a bluetooth signal at least request switching to vibrate (notifying with a vibration), if not autoforcing it. Then people controlling spaces could request quiet, targeting just those people who carry these sophisticated personal communicators. It's outrageous that many quiet events, particularly in auditoriums, can't manag
  • This could make speech recognition a higher bandwidth computer input method. The keyboard is currently king, for the following reasons:

    1. Speaking the amount most knowledge workers touch-type would be physically strenuous.
    2. It's culturally weird to talk to a machine. Imagine sitting in a cube farm with 100 voices talking to their machines. Too chaotic.

    This would seem to solve both problems. It probably has applications for the disabled too.

    Although I am a card carrying member of the Human League, I for
  • Isn't this technology perfect for people who have no cords, or cannot speak because they had cancer and things are physically damaged?
  • Remember Snake doing this on MGS?
  • considering that most people have never even learnt how to bloody whisper on the bloody phone, I cannot expect them to keep bloody quiet!
  • Sounds like an electroglottograph aka laryngograph aka electrolaryngograph.

    -Don

    Glottal Enterprises EG2- PC electroglottograph [glottal.com]

    Summary

    Using both an electronically controlled resistance simulating the variations in neck resistance caused by vocal fold vibratory patterns and live measurements of vocal fold contact area, it is shown that the Glottal Enterprises EG series electroglottograph (EGG) has an inherent background noise that is less than that of the Laryngograph/Kay Elemetrics EGG units by rou

  • I hope it's more reliable than the Cone of Silence [cinerhama.com]!

    -Don

    "See Chief? It's working fine!"

    "We're supposed to be sitting, Max!"

    "We are sitting, Chief."

    "I'm telling you Max, this isn't a good idea!"

    "You see? Stuck!!"

    "No Max! Not THAT way!!"

    "AAAAAAAAAaaaaaagh!"

    "censored"

  • The principle behind this is astonishingly simple. I'm surprised it took this long for someone to think of the technological application.

    Deaf people (at least the few I know) have been taught to feel their throat to learn how to speak. (ie how making certain sounds "feels" rather than sounds)

    A case in point, one of my friends (deaf) was the first to notice a fire, as we were meant to (SOPs), she yelled "Fire, Fire, Fire" to alert everyone to the fire - she put her hand to her throat to ensure that she wa
  • OK, so we have some sensors (from the article) picking up the movement of the vocal cords. Great. What you have there, my friends is fundamental frequency. Not speech. You also need the formants.

    You could get (by picking up other movements in the head) a synthetic model of what the speaker is doing (raising the tongue in back, lowering it in front, opening the nasal passages) and use that to build a filter model to synthesize the speech, but such models sound like crap.

    I'd love something like this to work.
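
For anyone wondering what "fundamental frequency plus formants" means in practice, here is a minimal source-filter sketch in Python. It is only an illustration of the textbook idea, not the system from the article: the function names, the 8 kHz sample rate, and the formant/bandwidth numbers (roughly an "ah"-like vowel) are all assumptions picked for the example.

```python
import math
from typing import List, Sequence, Tuple


def impulse_train(f0: float, sample_rate: int, n: int) -> List[float]:
    """Stand-in for the glottal source: one unit impulse per pitch period."""
    period = max(1, round(sample_rate / f0))
    return [1.0 if i % period == 0 else 0.0 for i in range(n)]


def resonator(x: Sequence[float], freq: float, bandwidth: float, sample_rate: int) -> List[float]:
    """Two-pole resonator: y[n] = x[n] + 2*r*cos(theta)*y[n-1] - r*r*y[n-2]."""
    r = math.exp(-math.pi * bandwidth / sample_rate)  # pole radius, < 1 so the filter is stable
    theta = 2.0 * math.pi * freq / sample_rate        # pole angle sets the resonant frequency
    a1, a2 = 2.0 * r * math.cos(theta), -r * r
    y1 = y2 = 0.0
    out: List[float] = []
    for s in x:
        y = s + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def synthesize_vowel(f0: float, formants: Sequence[Tuple[float, float]],
                     sample_rate: int = 8000, n_samples: int = 400) -> List[float]:
    """Pass the pitch source through one resonator per (frequency, bandwidth) formant."""
    signal = impulse_train(f0, sample_rate, n_samples)
    for freq, bandwidth in formants:
        signal = resonator(signal, freq, bandwidth, sample_rate)
    return signal


# Illustrative values only: 100 Hz pitch, two formants near 700 Hz and 1200 Hz.
vowel = synthesize_vowel(100.0, [(700.0, 100.0), (1200.0, 120.0)])
```

The impulse train is what sensors on the vocal cords could plausibly give you (pitch); the resonators are the part that needs tongue, jaw and nasal information, which is the poster's point.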
