Speech Recognition in Silicon 328
Ben Sullivan writes "NSF-funded researchers are working to develop a silicon-based approach to speech recognition. "The goal is to create a radically new and efficient silicon chip architecture that only does speech recognition, but does this 100 to 1,000 times more efficiently than a conventional computer." Good use of $1 million?"
Funny... (Score:5, Interesting)
If what they're saying really is true, and knowing how much money is invested in speech recognition research on a yearly basis, yeah, I would definitely say that this is one million dollars of great investment...
1... million... DOLLARS!!! (Score:5, Interesting)
Let me think for a moment... Hell yeah! If we had low power speech processors, the possibilities would be endless. For one, we'd finally have a Star Trek(TM) interface for our homes!
"Computer, lights!"
"Computer, make coffee!"
"Computer, Earl Grey, hot!"
As silly as it may sound, such an interface would be far more efficient than mashing buttons.
In addition, blind people could be helped significantly by this. Many of them already use speech recognition and synthesis to assist in computer usage. Imagine if their computers could suddenly understand them a thousand times better. They could talk to their computers a bit more naturally, saving their vocal cords from undue stress.
Other applications (off the top of my head) are:
- Voice notes on embedded devices (store only text!)
- Helpful Kiosks that can give you directions
- A new use for natural language database queries (i.e. Ask the computer what last quarter's net sales were.)
- Voice controlled robots ("You missed a corner, vacuum cleaner")
- Data search by voice ("Find me a channel that plays Star Trek")
Any other cool ideas out there?
A measly $1 million? (Score:3, Interesting)
Heck, the hospital I used to work at spent over a million dollars a year on medical transcriptionists all by itself.
Re:1... million... DOLLARS!!! (Score:5, Interesting)
Universal language translators. Imagine headphones that let you understand any known language.
Good use of $1 million? (Score:3, Interesting)
History.. (Score:5, Interesting)
Initially, doing anything beyond understanding a few words took special hardware, but after a bit of 'training', highly accurate and fast speech-to-text was quite possible with a specially developed DSP.
Then the Pentium-class CPUs came about, and a P90 could just do the whole thing without the DSP.
So, now someone is developing a new dedicated piece of silicon for this... let's see how long it takes for general-purpose computers to catch up.
The issue is not that this isn't useful, but that it either has to keep developing, or offer a somewhat longer-lasting price/performance ratio or much better features for a long time to come.
Better approach (Score:3, Interesting)
We'll still need to do traditional development to interpret the data from the DSPs. We'll need to parse the output so that we can use natural commands to control devices.
"Coffee maker, brew 10 cups, strong."
"Bathroom lights, on."
Without some manner of AI to interpret them, these phrases will be useless.
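A minimal sketch of the kind of interpretation layer meant here, assuming the recognizer has already produced a text transcript. The device names and verbs are invented for illustration, not any real product's API:

```python
# Toy interpreter for "<device>, <command...>" transcripts coming out of a
# speech recognizer. Device names and actions are hypothetical examples.

def interpret(transcript):
    # "Coffee maker, brew 10 cups, strong." -> device name + remainder
    device, _, command = transcript.lower().rstrip(".!").partition(", ")
    verb = command.split(",")[0].split()[0] if command else ""
    actions = {
        ("coffee maker", "brew"): "start brewing",
        ("bathroom lights", "on"): "lights on",
    }
    return actions.get((device, verb), "unrecognized")
```

Even this toy shows the problem: anything beyond a fixed "device, verb" grammar needs real natural-language interpretation, not table lookup.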
LK
Yay! Boo! Uh... Oh bugger.... (Score:5, Interesting)
From the blog: ''Homeland security applications are the big reason we were chosen for this award,'' says Rutenbar. ''Imagine if an emergency responder could query a critical online database with voice alone, without returning to a vehicle, in a noisy and dangerous environment. The possibilities are endless.''
Like some slight tweaking in order to deploy massive voiceprint-recognition silicon arrays for amazingly efficient automatic realtime conversation transcription and identity determination, attached to Echelon [agitprop.org.au].
So cool... so potentially evil... head begins to hurt... tinfoil hat burning....
Pretty Ambitious, Harder than it sounds (Score:5, Interesting)
My Master's research was on implementing machine learning in hardware, specifically support vector machines.
Now, they have much more money than I did, and probably this will be a collaboration involving many graduate students, but converting complex algorithms from software to hardware is no easy task.
It is just easier to do things in software; that's why development has evolved that way. The modular layers of abstraction let a computer scientist working in machine learning or speech recognition not have to worry about how the underlying hardware works.
Working in hardware, you come face to face with a lot of these issues, particularly since you want an architecture on a chip, whereas a conventional desktop/server system offers resources such as plenty of RAM and hard drive space, whose interconnections have been built and refined over decades.
Throw in concerns about small form factor and low power consumption, and a lot of unexpected hurdles pop up quite fast.
My master's research goal was to produce a data mining/machine learning machine, or at the very least a data mining/machine learning co-processor. In retrospect, that was a very ambitious goal that would require many years of work, probably in collaboration with other graduate students.
What I ended up doing was just Support Vector Machines in digital hardware. Now granted, there is another aspect to my research that I'm not mentioning here, namely that I didn't use a normal floating-point mathematical architecture, but a different, innovative logarithm-based mathematical architecture. That in itself was a significant undertaking.
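For readers unfamiliar with the idea: in a logarithmic number system (LNS), values are stored as a sign plus a base-2 logarithm, so multiplication, which is expensive in silicon, reduces to addition. A toy sketch of the general technique (not the poster's actual design):

```python
import math

# Logarithmic number system (LNS) sketch: represent x as (sign, log2|x|).
# Multiplying two values then only needs an add of the log parts, which is
# one reason LNS arithmetic can be attractive in custom hardware.

def to_lns(x):
    return (math.copysign(1.0, x), math.log2(abs(x)))

def lns_mul(a, b):
    sa, la = a
    sb, lb = b
    return (sa * sb, la + lb)   # multiply = add in the log domain

def from_lns(v):
    s, l = v
    return s * 2.0 ** l
```

The catch is addition: summing two LNS values needs an approximation of log2(1 + 2^d) via lookup tables or interpolation, and that is where most of the hardware cost goes.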
In any case, this sounds like a great project; I just wonder how much they can do in what is, in an academic sense, a very small time frame of 2-3 years, even though a lot of preliminary work has probably already been done just to apply for the grant.
In any case, it is great to see something like this, something to keep in mind in case I ever go back for a Ph.D.
Re:1... million... DOLLARS!!! (Score:3, Interesting)
The same could be done with tea. Just keep a reservoir of hot water, a stack of tea bags, cubes of sugar, and refrigerated lemons. When you order tea, the machine would inject the bag into the hot water stream, then drop the sugar and lemon into the tea.
Voila, Earl Grey, hot!
You bet it's worth it (Score:3, Interesting)
Re:Funny... (Score:2, Interesting)
I'm wondering if you could just do this with your average ATI or Nvidia 3D chip and an FPGA wrapper?
Re:Funny... (Score:4, Interesting)
How far away are we from having a machine that could identify all of the instruments in a piece of music by "listening" to the music? I say "listening" because there need not physically be a playback-and-listen, the playback could be mathematically modeled by the computer.
Re:1... million... DOLLARS!!! (Score:3, Interesting)
brains are and probably should be modular (Score:3, Interesting)
After all, the human brain has different areas for processing different types of stimuli.
In fact, some parts of our brain are so radically different they are almost considered brains of their own.
Like the cerebellum; it's often referred to as "the small brain". This controls motor coordination, and in humans allows us to do amazing things like flips, kung fu, and cup-stacking.
And forgive me for forgetting the exact names, but the brain has layers as well, the outermost layer being the cortex (where most of the higher-level mammalian processing takes place; correct me if I'm wrong, but the frontal lobe is pretty much purely cortical tissue). As you delve deeper you get into the hippocampus and the medulla-whatever (sorry, IANAN: I Am Not A Neurologist), which is where emotion rules, and if I again remember correctly it is sometimes referred to as the "reptilian" brain.
Even the eyes themselves can almost be considered little 'brains' of their own, considering the amount of pre-processing they do (maybe 'co-processors' would be more accurate).
The UN would probably use this heavily (Score:2, Interesting)
Re:Funny... (Score:1, Interesting)
Re:Funny... (Score:3, Interesting)
Just dial a number on your mobile phone, hold it up to the speaker while the tune you want ID'd is playing and it'll SMS you back shortly with the track name and artist. You can then log onto the Shazam website, enter in your mobile number and you get a list of all the tracks you've searched for along with links to an Amazon search so you can purchase the track.
Pretty good for ID'ing tracks when you're in a club and can't get to the DJ to hassle him.
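The core trick behind this kind of track ID is fingerprinting: find the dominant frequencies in a clip and use them as a lookup key. A toy sketch of that idea using a naive DFT (Shazam's real system hashes constellations of spectrogram peaks over time; this is just the general principle):

```python
import cmath
import math

# Toy audio fingerprint: the index of the strongest frequency bin in a
# window of samples. Real systems hash many time-frequency peaks instead.

def dominant_bins(samples, n_peaks=1):
    n = len(samples)
    mags = [abs(sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2)]          # naive DFT, first half of bins
    return tuple(sorted(range(len(mags)), key=lambda k: -mags[k])[:n_peaks])

# Two synthetic "tracks": pure tones at 5 and 9 cycles per 64-sample window.
tone_a = [math.sin(2 * math.pi * 5 * t / 64) for t in range(64)]
tone_b = [math.sin(2 * math.pi * 9 * t / 64) for t in range(64)]
```

Matching a noisy club recording against a database is then a nearest-neighbor search over these keys, which is exactly the kind of fixed-function workload that could go into dedicated silicon.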
Re:History.. (Score:1, Interesting)
Small low-power units are useful for, say, a soldier's helmet, or in a PDA.
I'd also say that the same thing happened with 3D cards; they keep making them faster with more features, but you could play Half-Life with software 3D on a 2.x GHz PC looking pretty much the same as it did on a Voodoo card back in the day.
The question is rather, will there be many future speed advances in the hardware, or once it's built, will later software recognition do just as well, a little like DVD hardware cards. I have an Encore card, but software decoding beats it now, and my DVD decoding doesn't need to be any faster.
I think what they're looking at is building some cheap (as chips) chips for embedded systems, like mobile phones and PDAs.
Live Chat & Search (Score:3, Interesting)
With speech-to-text, you could log all conversation to IRC.
Then you could have search engines that search *all conversation within the last 5 minutes, world-wide.*
Well, at least all conversation that was okay with being public.
So you could say, "Show me all conversations that are going on right now about Python," [python.org] and immediately find the people talking about Python, wherever they were.
One step towards the HiveMind. [communitywiki.org]
Re:Carnivore on telephones (Score:1, Interesting)
Why doesn't this kind of thing bother more people?
national security? (Score:3, Interesting)
Re:Funny... (Score:1, Interesting)