Online Speech Indexing

Posted by CmdrTaco on Wednesday December 08, 1999 @12:24PM from the isn't-that-nifty dept.

Thomas Edwards from The Sync (where we host Geeks in Space) sent us an interesting site: "Speechbot" is a Compaq Research project that is indexing online radio shows. Apparently it found terms like 'Red Hat' and 'Yahoo' in past episodes of GiS. Interesting technology. Imagine when it lets me ask my TV to find me every show that mentions Sarah Michelle Gellar.

This discussion has been archived. No new comments can be posted.

Or Gillian Anderson (Score:1)

by prop-hed ( 110220 ) writes:

mmmmmmmmmmmmmmm, FBI Betty, oooooooooooh.
TV is already text searchable. (Score:1)

by NomadYarg ( 123252 ) writes:

TV input cards that let you watch TV on your computer will download all of the "Closed Captioning" text and then let you search it. There is no reason why you couldn't set up something that watches all channels for closed captioning text and indexes all of it.
Especially with these new hard-drive VCRs that let you pause a TV show while you go "take a #2". This device could also watch particular channels and buffer them for an hour while it searches their closed captioning for specific search criteria. When it sees the text you are looking for in the closed captioning, it will send the whole program to tape. It would be pretty nifty, I think, if I could have my computer watch the Science Channel for anything to do with quantum computing and, when it finds something, output the whole thing to tape. It's like having your computer watch TV for you and only get the good stuff. :-)
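A rough Python sketch of that buffer-and-trigger idea, for anyone who wants to play with it. It assumes the captions have already been decoded into a stream of text lines; the function name `watch_captions` and the `buffer_size` parameter are made up for illustration, and "recording to tape" is stood in for by returning the kept lines.

```python
from collections import deque

def watch_captions(caption_lines, keywords, buffer_size=3):
    """Return the captions to record: the buffered lead-in plus everything after the first hit.

    caption_lines: iterable of decoded caption strings (hypothetical input).
    keywords: phrases to trigger on, matched case-insensitively.
    buffer_size: how many recent lines to keep while nothing has matched yet.
    """
    wanted = [k.lower() for k in keywords]
    recent = deque(maxlen=buffer_size)   # rolling buffer, oldest lines fall off
    recorded = []
    triggered = False
    for line in caption_lines:
        if triggered:
            recorded.append(line)        # already recording: keep everything
            continue
        recent.append(line)
        if any(k in line.lower() for k in wanted):
            triggered = True
            recorded.extend(recent)      # flush the buffered lead-in "to tape"
    return recorded
```

With a real card the buffer would hold video rather than text, but the control flow would look much the same: keep a rolling window, and only commit it once a caption matches.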
Re:Text to Speech problems? (Score:1)

by frantzdb ( 22281 ) writes:

Perhaps it should be reading the web pages that link to a clip and finding some context from that. Especially when the computer doesn't know who's talking, context becomes everything. Obviously gathering context from surrounding audio isn't surefire.

--Ben
Re:Nice... (Score:1)

by Goner ( 5704 ) writes:

Word! Since you guys use it (RealNetworks stuff), I figure you would be complaining if you didn't like it. I didn't know that they developed RealServer on Linux. And I never even thought about Windows Media Player, yikes!
Sometimes you actually learn something on /., huh.
BTW The Sync rocks for hosting Geeks in Space, the most freaky and funny talk show I've ever listened to. With this Andover business, the show needs to lay down some cash to get a call-in section, so all of the geeks across the nation (err, umm, globe, sorry) can call in and be nerdy with the kings.
-Rich
Re:Impressive...? (Score:1)

by Midnight Ryder ( 116189 ) writes:

Even at 3x realtime, I could probably live with working with it based on what I saw on the pages. I'd have to record everything to WAV files first, then have one of my machines on the network handle processing it later.

This sounds like it's going to turn out to be quite the kick ass product!

I'll ask the same question the other guy did - Open Source, or proprietary? And in the future, can the SQL server be replaced with something else like MySQL, etc.? Heck, for that matter, got something that gives us even more information?

Hell, need a beta tester to do any stress testing on this puppy? I'm yer man!
Re:Echelon, anyone? (Score:1)

by The CrapHead! ( 5146 ) writes:

They probably need more powerful searches than that. Their patent for searching and sorting text [ibm.com] describes a system for searching and sorting "speech-based text, optical-character-read text, stop-word-filtered text, stutter-phrase-filtered text, and lexical-collocation-filtered text," according to claim #2. [ibm.com]
Re:Impressive...? (Score:3)

by xyzzy ( 10685 ) writes: on Wednesday December 08, 1999 @10:58AM (#1475403) Homepage

Ohoh, I've stirred up a firestorm here :-)

Re: SQL -- it can be any SQL server, really. However, I will add that we are somewhat in bed with Microsoft on the visualization end, simply because IE5 does XML quite well (note to Mozilla people: get with the program).

Re: Open source. Unfortunately, not up to me. Much of the technology is "open source" in the sense that papers have been published about it (not what you were looking for, I know), but we've already licensed some of the core technology to another company, and being a phone company (GTE) we consider the speech rec somewhat of a competitive advantage (wipe those Echelon thoughts out of your mind! We use it for call center and directory assistance automation! Sheesh :-)

As I posted probably about 6 months ago in a thread about speech recognition, there are some significant issues with open-sourcing beyond the recognizer code. The learning processes behind the recognition are based on a considerable amount of data for which licensing is an issue, such as CNN broadcasts. In fact, we use over 100 hours of broadcast news audio to train the system, and several million words of text for the language model. This comes to us through the Linguistic Data Consortium at the University of Pennsylvania (http://www.ldc.upenn.edu). This is an academic group set up to maintain these common train-and-test databases for researchers, and there's a fairly sizeable fee to join. They handle the intellectual property issues with the training data.

And, unfortunately, without the training data, it's kind of hard to use the system. At least, if you want to use it on something it's not already trained on (in our case: North American broadcast news).

Re:Impressive...? (Score:1)

by Midnight Ryder ( 116189 ) writes:

I wouldn't call it a firestorm - no flameage involved coming from me. Just lots o' questions. I've been waiting for something like this to come along that can handle transcribing a conversation. I've got more than one application I'd love to use it for - most of them pretty fluff, mind you, and none of them professional.

Of course, in the end, this thing is going to be outside of my price range (if available at all to 'consumer level' people) based on what it's for.

I can keep dreaming I suppose :-)

As for Open Source - it's a valid question to ask about any product any more, but that doesn't mean that it HAS to be Open Source to be usable.
Humorous Transcriptions (Score:1)

by grepMeister ( 37303 ) writes:

Did anyone try searching for "Binks"? I got 2 results, "charge are binks" and "charge or binks".
More of this (at Dutch university) (Score:1)

by Titanhead ( 24709 ) writes:

For another project that does something like this (I think) see:
http://parlevink.cs.utwente.nl/Projects/olive.html [utwente.nl]
Re:Linux = iMacs? (Score:1)

by Otto ( 17870 ) writes:

Just did a search for "line next" (exact phrase), and got quite a bit of "and on the line next we have...", a la talk show introductions.

Oh, I was only searching "Geeks in space," not the whole thing..

Oh, and speaking about the engine... I thought that it might store the transcripts in some sort of phonetic format, and then match your search phrase to it (i.e., "your not hyped" and "journal typed" both match the stored, transcribed phonetics). So I did some searches to match the same sections extracted, and they all seem to match. Conclusion: the transcripts appear to be stored as text, not phonetic symbols.

That's quite a good idea actually. Store it as phonetic symbols, translate the search query into symbols on the fly, then search based on symbol..
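Here is a toy Python sketch of what storing and searching by phonetic symbols might look like. Nothing here reflects how SpeechBot actually works; the tiny `LEXICON` is hand-written for the example (ARPAbet-style symbols), and a real system would need a full pronunciation dictionary or a letter-to-sound model.

```python
# Tiny hand-written pronunciation lexicon, for illustration only.
LEXICON = {
    "i": ["AY"],
    "ice": ["AY", "S"],
    "cream": ["K", "R", "IY", "M"],
    "scream": ["S", "K", "R", "IY", "M"],
    "for": ["F", "AO", "R"],
}

def to_phonemes(text):
    """Translate text to a flat phoneme list, dropping word boundaries."""
    phonemes = []
    for word in text.lower().split():
        phonemes.extend(LEXICON[word])   # KeyError for words outside the toy lexicon
    return phonemes

def phonetic_match(query, stored_text):
    """True if the query's phoneme sequence occurs inside the stored phonemes."""
    q, s = to_phonemes(query), to_phonemes(stored_text)
    return any(s[i:i + len(q)] == q for i in range(len(s) - len(q) + 1))
```

Because word boundaries are thrown away, a query for "ice cream" would hit a passage the recognizer wrote down as "I scream", which is exactly the kind of mis-split the transcripts are full of.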

Better patent it quick, or put it in the public domain for all to use. :-)

---
The Remarkable Media Search (Score:3)

by GoRK ( 10018 ) writes: on Wednesday December 08, 1999 @08:17AM (#1475408) Journal

What is particularly interesting to note is that the quality of these Internet radio shows is generally fairly poor. The voice recognition and dictation software that I have toyed with before has always suggested using better microphones and higher sampling rates to achieve decent results. Some even claimed that low-quality audio results in a severe accuracy penalty.

It is very remarkable that this thing can index these low-quality streams with the accuracy that it does! I hope that searchable media (other than text) continues to get better like this. Companies like Virage and Compaq definitely deserve our support. I hope that standard interfaces appear soon.

~GoRK

Indexing TV via closed captioning (Score:1)

by kriegsman ( 55737 ) writes:
Indexing broadcast TV should be easier...
1. Capture video feed
2. Decode closed captioning text
3. Make text index
Anyone know of this technique being used today?
The Closed-Captioning FAQ [robson.org] seems to think that using speech recognition to generate text from broadcast audio "isn't there yet [robson.org]" technology-wise.
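The three steps listed above boil down to building an inverted index once the caption text is in hand. A minimal Python sketch, assuming steps 1 and 2 have already produced a dict of program id to caption text (the names `build_index` and `search` are invented for the example):

```python
import re
from collections import defaultdict

def build_index(programs):
    """Map each word to the set of program ids whose caption text contains it."""
    index = defaultdict(set)
    for program_id, caption_text in programs.items():
        for word in re.findall(r"[a-z0-9']+", caption_text.lower()):
            index[word].add(program_id)
    return index

def search(index, query):
    """Return ids of programs whose captions contain every word of the query."""
    words = re.findall(r"[a-z0-9']+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())   # intersect: all words must appear
    return result
```

This only answers "which programs mention these words"; storing word positions or timestamps alongside each id would be the next step if you wanted to jump to the right spot in the recording.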
No linux? ESR is a duck? (Score:3)

by Otto ( 17870 ) writes: on Wednesday December 08, 1999 @08:19AM (#1475410) Homepage Journal

It found no instances of the word Linux, which I found humorous.

However, with a little brain usage, I searched for "line" and got this:

... there an a to think you're doing is making good news slash my next monday's announcement makes it you can use less leonard still want to which it's tilman of the of the open sores movement I have not part of the open sewers that but why in part of the priests out their foundations giving his last line next to the flashlight next with you a while and we can end of the various duckling and in the he's serving snacks that promptly opening the top of that there is god who will bomb and the crowd is bernie this is definitely the most exciting play a thing would have to have one of I mean you for a column about how ...

The words "end of the various duckling" and around there are in fact "Eric Raymond" in the clip, which I thought utterly hysterical. You can tell because they say "it's Eric Raymond, and he's serving snacks," which partway comes out correct.

Linux seems to have come out as "line next" a lot, and "line of" in some clips I've found..

Obviously, the technology is not quite there yet. :-)

---

q (Score:1)

by Cylix ( 55374 ) writes:

Obviously you're going to see more of this; the web is growing vastly more popular with every passing day, especially with everyone and their uncle attempting to package the web in a nice little gift-wrapped box for sale.
Although I should probably be more worried about the ramifications of being able to search through countless fields of speech... I am actually more concerned with a different aspect of this new means of indexing.
My concern is this: with each new means of indexing speech and text becoming readily available, could this result in web sites being aggressively taxed by individuals/machines barely even using these services? Granted, these are just my simple uninformed worries with little merit. Although it would be interesting to see the results if a slew of these technologies developed and became devastatingly popular (the types that would work from your home computer and index your favorite speech/web sites). :)
Re:Speech processing (Score:1)

by Midnight Ryder ( 116189 ) writes:

Yes, you can have it done that way. Unluckily, if there's anything odd going on with the cable line (which is most of the time) you get some really strange output from the CC information. My card and software does it, and from time to time it skips parts of words, inserts 'odd' characters, etc., etc., etc. So, you could set it up with the trigger text, but the likelihood of getting it to work right all the time is a bit low. (Just based on what I've seen - I'm sure it's different in other areas of the country.)
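One possible way around the garbled captions is to match the trigger text approximately instead of exactly. The sketch below uses `difflib.SequenceMatcher` from the Python standard library; the function name `fuzzy_contains` and the 0.8 threshold are assumptions picked for the example, not anything a real capture card ships with.

```python
from difflib import SequenceMatcher

def fuzzy_contains(caption, trigger, threshold=0.8):
    """True if some trigger-sized window of the caption is similar enough to the trigger."""
    caption, trigger = caption.lower(), trigger.lower()
    n = len(trigger)
    if len(caption) < n:
        # Caption shorter than the trigger: compare the two strings directly.
        return SequenceMatcher(None, caption, trigger).ratio() >= threshold
    for start in range(len(caption) - n + 1):
        window = caption[start:start + n]
        if SequenceMatcher(None, window, trigger).ratio() >= threshold:
            return True
    return False
```

A lower threshold tolerates more dropped or odd characters but will also fire on more unrelated text, so it would need tuning against whatever your own cable line produces.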
The possibilities (Score:2)

by TheFitz ( 113719 ) writes:

Just think about it: add a regexp to this and you can take Bill Gates saying things like "Microsoft Windows dominates the market due to our huge innovation." and apply a couple of quick regexps (excuse me if the code is long, not really trying):

my $gatespeak = "Microsoft Windows dominates the market due to our huge innovation.";
$gatespeak =~ s/Microsoft/Micro\$oft/g;
$gatespeak =~ s/Windows/Winbloze/g;
$gatespeak =~ s/dominates the market/controls your lives/g;
$gatespeak =~ s/innovation/stealing and strongarm tactics/g;
print "$gatespeak\n";
To get this: "Micro$oft Winbloze controls your lives due to our huge stealing and strongarm tactics". Wow, you can actually get Billy boy to speak the TRUTH!
Re:Nice... (Score:1)

by rmull ( 26174 ) writes:

After a bit of prodding, I got whatever version of Windows RealPlayer I have installed (4.x or G-something) to run under wine - I expect more recent ones ought to as well. It wasn't too hard, just "wine rvplayer.exe", but you've gotta save the link on the web page to a .rm file, then open that from within the player.

Not perfect, but certainly better than nothing!
About that SMG comment (Score:1)

by scottj ( 7200 ) writes:

Imagine when it lets me ask my TV to find me every show that mentions Sarah Michelle Gellar.

Well, I guess that would be a good way of determining what NOT to watch. ;)
--
who cares about the NSA... (Score:1)

by CrudPuppy ( 33870 ) writes:

who CARES about the NSA...I just got a giggle
out of searching the Art Bell Show for the word
UFO *grin*

(for those of you who are not clued in to Art
Bell, he is famous for conspiracy theories)

:)
The NSA is gonna sue their ass (Score:1)

by belphegore ( 66832 ) writes:

I'm pretty sure the NSA has recently been granted a bunch of patents covering this kind of thing. Just wait till Compaq gets cease-and-desist letters from Janet Reno... You thought going up against RIAA on intellectual property was hard... Try the US Government.
Re:Processing power and time? (Score:3)

by anl ( 9070 ) writes: on Wednesday December 08, 1999 @08:47AM (#1475418) Homepage

The press release [compaq.com] has a little more information. We use workstations running NT to spider the sites; processing is done on a farm of Linux servers, and the UI runs on AlphaServer DS20 machines.

Ha. For a great laugh check out the transcripts. (Score:1)

by AndyL ( 89715 ) writes:

This is the introductions section of the show : "I get the deductions on robb commander cockrell mullah eyeing the toast and or mayan jeffery"

I realize that there's a disclaimer saying that the transcripts wouldn't be exact. But I was expecting one or two words off.

Actually, after looking through a couple more of these transcripts it seems to have a problem with the nature of Geeks in Space. That is, when the voice changes it takes the program a little while to catch up. When there are long stretches of only one person talking it seems to do better. ("the most interesting thing that pops up today is that microsoft is set up a box and they have basically challenged the internet to crack it...")

Interestingly enough not a single episode of GiS seems to include the word "Taco".
filler (Score:1)

by confidential ( 23321 ) writes:

so does that mean that the last half hour of a show will be filled with "sex, lies, conspiracy, money, $$$, free, topless" etc. just to up its index like web pages do nowadays?
"fuck" search... This is more fun than babelfish! (Score:1)

by Saxton ( 34078 ) writes:

Try doing a search for "fuck." This is hilarious.

"... such chaos these days with possibly of that david thinks not to use nuclear weapons but it could use a biological or chemical try to get the fuck you mentioned before newberg thousand level..."

or better yet...

" ... and they knew that and we haven't done just like the chevy malibu fuck are still under oath whistled carefully crafted dig two hundred thousand miles of course..."

...it gets more bizarre:

"... your occurred several koppel with the mask of a focus of of fuck 'em berry's school principals the jaw and the m. our washington studio with a few arctic of the papal trip to do that to us from...

...then... listen to the clips and pretend they're actually saying it.

I need a life.

_________
Wasn't that a Nirvana song? (Score:1)

by spiralx ( 97066 ) writes:
Re:Speech processing (Score:1)

by ktm ( 122333 ) writes:

My ATI All-in-Wonder does this. GATOS does not though....
Re:Some interesting bytes. (Score:1)

by patrick0 ( 109339 ) writes:

Interestingly, their contact email addresses are @dec.com (i.e. Digital Equipment Corp), so it's probably research that has been carried over from Digital. I wonder if they're using Digital Unix. Strange that they don't mention anything about the algorithms, software, hardware, people.
Re:Some interesting bytes. (Score:1)

by anl ( 9070 ) writes:

Our team is listed at http://speechbot.research.compaq.com/cgi-bin/query?help=about#team [compaq.com], and the press release [compaq.com] explains that we use NT workstations for the content acquisition, a Linux farm to process the data, and Tru64 (formerly Digital Unix) machines to serve the site.

HTH,
Andrew Leonard
Webmaster
SpeechBot
SMG baby! (Score:1)

by Helmethead ( 107851 ) writes:

Haha. You people really know ur shit. I love the amount of buffy & SMG references here :P
Buffy Rox! :D
Speech recognition worries (Score:2)

by Anonymous Coward writes:

"Note: Indexed text does not match audio exactly." And you wonder what kind of technology the NSA has listening to us all right now?
Echelon, anyone? (Score:2)

by kevin805 ( 84623 ) writes:

The cryptography community usually believes they are a couple years behind the NSA, given that the NSA reads all their papers, but doesn't publish its own work.

I had been skeptical of Echelon being able to do word recognition on phone conversations, but I expect that the NSA is ahead of private industry in this area too, so Echelon looks plausible.

--Kevin
Echelon!!! (Score:2)

by um... Lucas ( 13147 ) writes:

Based on everyone's assumptions around here, this would peg the NSA as having that capability since 1990 or so (just to pick a round number)... And it only came to light this year.

oh, and first post too... maybe
Re:searches the whole transcript (Score:2)

by Myddrin ( 54596 ) writes:

However, it can do "exact phrase" searches which is almost as good.

For example, searching on Black 47 returns 2,000+ hits when using the default search, but 0 when using an exact phrase....

Linux OTOH returns only two matches... sigh. Actually I wonder how much that has to do with the confusion over the pronunciation(sp?), considering I've never met two people who say it with the same exact phonetics....

Oh well,
RobK
searches the whole transcript (Score:1)

by jshare ( 6557 ) writes:

Apparently, the engine can't do "near" type searches. If you search for more than one thing, it looks through the whole transcript for the words.

So, you might get a result back that isn't quite what you are looking for.
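To make the difference concrete, here is a small Python sketch of the three behaviours being discussed: the default "all words anywhere in the transcript" search, the exact-phrase search, and the missing "near" search. It works on a plain whitespace-split transcript and is only a guess at the semantics, not SpeechBot's actual query engine; all function names are made up.

```python
def tokenize(text):
    """Lowercase and split on whitespace."""
    return text.lower().split()

def match_all(transcript, query):
    """Default-style search: every query word appears somewhere in the transcript."""
    words = set(tokenize(transcript))
    return all(q in words for q in tokenize(query))

def match_phrase(transcript, query):
    """Exact-phrase search: the query words appear consecutively, in order."""
    t, q = tokenize(transcript), tokenize(query)
    return any(t[i:i + len(q)] == q for i in range(len(t) - len(q) + 1))

def match_near(transcript, first, second, window=5):
    """'Near' search: the two words occur within `window` words of each other."""
    t = tokenize(transcript)
    pos_a = [i for i, w in enumerate(t) if w == first.lower()]
    pos_b = [i for i, w in enumerate(t) if w == second.lower()]
    return any(abs(a - b) <= window for a in pos_a for b in pos_b)
```

A transcript that says "black" in the first minute and "47" in the last would pass `match_all` but fail both of the others, which is the kind of not-quite-right hit described above.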

Jordan
Dragon Systems (Score:3)

by Col. Klink (retired) ( 11632 ) writes: on Wednesday December 08, 1999 @07:32AM (#1475433)

Dragon Systems [dragonsys.com] (makers of Naturally Speaking continuous VR) announced a similar product at Comdex. They call it audiomining [dragonsys.com].

If the technology falls into the wrong hands (Score:1)

by Cigs ( 115253 ) writes:

do you think it could be adapted so I could filter out stuff, like....
1)Every mention of Celine Dion out of my radio and TV.
2)Every mention of Bill Gates out of Slashdot posts.
Just wonderin'
But as someone who has spent hours upon hours researching old radio shows for a college assignment, it sounds like a real good idea.
Yummy (Score:2)

by freakho ( 28342 ) writes:

Not only for the SMG thing, but also imagine the possibilities when applied to C-SPAN. Now you don't have to listen to hours of mind-numbing, coma-inducing boredom or hope and pray that the media will deign to bring a certain issue out of the Washington black hole in order to find out about your favorite target of litigation. Like, say, the one you're reading.
speechbot transcript (Score:2)

by technos ( 73414 ) writes:

a hundred twenty gates at a note let me give you every moment in your life right we've got we've got that word the key guess like thirty five to defend the trees now we don't like pancakes free

Methinks a lexical generator produces better speech than the triumvirate of ./
The speech is out there... (Score:1)

by Raetsel ( 34442 ) writes:

Gives you some idea of what the NSA and Echelon are capable of... Interesting and incredibly useful if used properly, terribly frightening if used for those "black projects" our wonderful TLAs are so fond of running.
Hey, at least it'll make finding quotes and sound bites easier! The politicians will probably outlaw it, (for civilian use) of course...
Imagine what it could do if run against all the archived tape that CNN, or NBC, etc... have. Ah, the possibilities!
I wonder what Katz will have to say about this?
Another audio indexing system (Score:3)

by xyzzy ( 10685 ) writes: on Wednesday December 08, 1999 @07:46AM (#1475441) Homepage

Might as well use this as a chance to plug my project:

http://www.gte.com/AboutGTE/gto/bbnt/speech/research/extraction/roughn_ready/index.html

...which not only tells you what words were said, but who said them, and what topics were being talked about...

Hmm... Mars murder ritual rides? (Score:3)

by Croaker ( 10633 ) writes: on Wednesday December 08, 1999 @07:46AM (#1475443)

Did a search for "Mars Probe" in the Science Friday show, and got this snippet:

.. of deep space walk which show the first I am to arrive and interplanetary space another mars or murder ritual rides the september twenty third of mars lander which lands on...

Err... yeah. That would explain a great many things about space probes. Actually, I'm sure the textified show would be a lot more interesting than the real show. And then, we could shove it through Babelfish for added enjoyment...

I recently installed the ViaVoice beta for Linux, and found its recognition not quite ready for prime time... at least for my needs. I'd be surprised if radio shows, which often have people on fairly crummy phone connections, would be an ideal candidate for automated indexing.

check out this logic (Score:3)

by / ( 33804 ) writes: on Wednesday December 08, 1999 @08:49AM (#1475444)

"I want to die" turns up 6
"Grits" turns up 12
"Sex with animals" turns up 5.
"Your mother" turns up 200.

My conclusion: "Your mother is still almost ten times as important as suicide, sex with animals, and grits combined."

Remember that, always.

Reads like bad poetry. (Score:2)

by Dast ( 10275 ) writes:

Check out the results of "Show me more". The transcript reads like bad poetry:

settling rata at and legs
to the team network concept
fighting ends
the single monolithic entity
of those are all things
that the challenger are undermined
neither side I think
what I look at the to the the marcie katz
the respective committees

(I didn't change anything but adding line breaks.)

Good for a chuckle. :)
Re:Processing power and time? (Score:1)

by anl ( 9070 ) writes:

Forgot to add my signature information to that post, so you have some idea where I'm coming from:

Andrew Leonard
Webmaster
SpeechBot
Re:filtering out celine dion (Score:1)

by quadong ( 52475 ) writes:

Is f**k rich? or fuck for that matter?
Buffy (Score:2)

by quadong ( 52475 ) writes:

The vampire Slayer.

Mind you, I didn't know this until I did a web search for her.
Re:Nice... (Score:1)

by TheSync ( 5291 ) writes:

Rant all you want about RealNetworks [real.com], but at least they have a Linux version of their player, more than I can say for the next leading competitor in low bitrate streaming media (i.e. Windows Media Technology). Linux is also the base OS for the development of the RealServer.

I do understand Real's need to "encourage" people to purchase the RealPlayerPlus, because they need to make money to keep up the excellent low-bitrate R&D they've been doing. Unlike Microsoft, they can't just chalk it up to selling more NT Server software.
babelfish (Score:3)

by quadong ( 52475 ) writes: on Wednesday December 08, 1999 @09:20AM (#1475452) Homepage

Someone should take these pseudo-transcripts and run them thru babelfish. Think of the gibberish level we could achieve!

ViaVoice (Score:1)

by brigand ( 62956 ) writes:

The problem with ViaVoice, according to IBM's web pages, is that the Linux Beta does not have the training software the Windows version has. This is very aggravating because without any training, I'm getting only 30-50% recognition when I speak slowly and clearly. When trained, this should be much higher. Has any third party written software to train ViaVoice under Linux?
spooky (Score:2)

by ubermuffin ( 39292 ) writes:

Speech recognition is really neat and stands to greatly improve indexing and organization of non-text media. It looks like this is a pretty cool application of it, too.

That being said, let me say that something like this scares the crap out of me. This sort of technology is exactly what the FBI had in mind when it began to pressure telecommunications companies to make their phone lines more tappable. Now I don't remember the exact figure, but they wanted something like 1% of the phones in any metropolitan area tappable at once. 1% of the phones in New York City is something on the order of 50,000 phones. Tell me how you're going to keep track of all of that without a computer monitoring 50,000 conversations and looking for key words. You can't.

Monitor 1% of the population's conversations for some suspect keywords like 'bomb', 'assassinate', 'cocaine' or perhaps 'open source' and you've got one scary computer-assisted big brother watching over everything. If you don't hear anything juicy, shift to another 1%. I suppose people have had the technology to do realtime speech recognition and filtering for some time now, but the idea of maintaining searchable archives of phone conversations (enter Speechbot) is a genuinely spooky privacy violation.

Now, any technology is only as good or as evil as the people who use it. I will be cautiously interested to watch what Speechbot evolves into.
Re:Linux = iMacs? (Score:1)

by JabberWokky ( 19442 ) writes:

> Better patent it quick, or put it in the public domain for all to use. :-)
I wonder if, years from now when AT&T or some such august body patents it, I can claim a Slashdot post as "prior art".
Hurm....
--
Evan
Re:problem (Score:1)

by xyzzy ( 10685 ) writes:

...which would be...?
Re:Echelon!!! (Score:2)

by jd ( 1658 ) writes:

At the very latest, I'd say. The late 1980s sound more likely, but (as you say), that's a convenient round number.
This would also be about the time UK politicians were banned from Menwith Hill, an NSA listening post in the UK, which would have been a likely candidate for early deployment of such technology.
Processing power and time? (Score:2)

by dr ( 93364 ) writes:

The FAQ [compaq.com] is incredibly vague and the About [compaq.com] page doesn't say much either in terms of the actual technology used. It says that they index 20 shows and index daily. Does anyone know what the time to actually do an index is and what kind of processing power these guys are using?
On an unrelated note, the about page says that Compaq has a research lab in Australia... sweet.
-dr
Re:SPEECH TO TEXT rather! (Score:1)

by technos ( 73414 ) writes:

If I had a T2S that could reproduce that garbled mess as understandable speech, I could make a pile of cash turning Marketing Speak into English.

Might make IBM manuals more friendly too!
Nice... (Score:2)

by Goner ( 5704 ) writes:

\rant{But where is RealAudio at? The company itself is worse (in its smaller domain) than Microsoft. I mean, their version numbering (5, G2, 7) is absurd, their website pushes you to download the Plus version of their player (i.e. the one you have to pay for), and their monopoly on video (and most of the sound) on the web lets them get away with it. I believe they have an OK product, but their marketing schemes are stuck in the mid-nineties: "pay-for-this-better-version-now-even-though-a-better-free-one-will-be-out-in-three-months."
Things like PointCast have died due to this type of scheme, but Real is still staying strong. The Linux (Unix) install scenario and HTML documentation are absurd as well, and to be honest the reason for this rant. We'll just have to wait until some sort of disruptive technology forces Real to compete instead of stagnate.}
As far as the implications of this technology, echelon, etc. I just can't wait until I can do boolean searches through my old phone calls. Not like they're listening anyway...

-Rich
Some interesting bytes. (Score:3)

by kaniff ( 63108 ) writes: on Wednesday December 08, 1999 @07:58AM (#1475462) Homepage

When the guys introduce themselves, the translator has a fun time with their names and nicknames.

Rob "CmdrTaco" Malda -- rob commander topple mall
Jeff "Hemos" Bates -- jeff in both states
Nate Ostendorf -- the husband or the smoke

I also searched for linux and I'll bet that it can't find any instances because it doesn't translate it right, with all the different pronunciation possibilities.

It's a cool idea, but has a ways to go. Go Compaq.
yay.

Re:Reads like bad poetry. (Score:2)

by dr ( 93364 ) writes:

The transcript reads like bad poetry:
The site's FAQ [compaq.com] admits to that (in not so many words)...

Warning: The "transcript" that is output by the speech recognition software (and shown in small extracts on the Results and Details pages) rarely matches what was spoken exactly, and often does not read very well. Because different people speak at different rates and with different degrees of clarity, speech recognition software does not correctly interpret every word. However, research has shown that meaningful words are recognized with a high degree of accuracy, and that even when a word is missed, it will most likely be recognized when it is spoken somewhere else in the program.

And in all fairness, they are not claiming to be a "transcript service" per se, though I can certainly see a lot of transcript writers losing their jobs in the future as the technology advances.
-dr
Re:The speech is out there... (Score:1)

by susano_otter ( 123650 ) writes:

Hrm. I can't help thinking of the so-called "stealth fighter", which was apparently fully operational in the late 70s/early 80s (erm, I haven't checked the exact dates on that).

Anyway, certain gov't TLAs always seem to be about 10 years ahead of what they're telling the rest of us. I wouldn't be surprised if the NSA [nsa.gov] has had open source drew barrymore since the E.T. days.

__________
filtering out celine dion (Score:2)

by Travoltus ( 110240 ) writes:

I'm gonna patent that.

That'll make me rich as f**k!
Re:Echelon, anyone? (Score:1)

by penguinicide ( 73759 ) writes:

Keep in mind the NSA does not need to produce transcripts, just scan for keywords. When a keyword is found, the last couple of minutes and the rest of the conversation would be recorded, logged, and shipped off to the agents on duty at the time.
Without the need to have thousands of words in the recognition engine it would definitely be much faster, and possibly much more accurate than current state of the art systems.
What's worse, we might never find out!! (Score:1)

by Psychotic Illusion ( 122381 ) writes:

If the government or the cops or some agency decided they wanted to act as "Big Brother", then we might not ever find out. The F-117 stealth military aircraft was only a legend for 15 years, the best-kept secret in the military history of the USA. (Or was it? There could be others that we have never heard of, of course...) The point is that the government and military always seem to be one step ahead. Why should there not be someone monitoring us normal people and the things we do? If they have got one of these things organized in your neighborhood, you might never even hear of it. What would then happen if a guy was unlucky enough to tell a friend a joke about Saddam Hussein and a revolution? You never know...
Re:Indexing TV via closed captioning (Score:1)

by xyzzy ( 10685 ) writes:

Yes, there are quite a few places that do that (CMU has a system called "Informedia" that does), but there are a few problems:

a) Closed captioning is not an exact representation of what was said. Quite often it is a paraphrase (but speech recognition is error-prone, of course);

b) There are many many many useful sources of audio that don't have closed captioning. A meeting, for instance. Or a foreign broadcast (closed captioning is much more prevalent in the US than in other nations).
Re:Impressive...? (Score:1)

by xyzzy ( 10685 ) writes:

Unfortunately, one of the downsides is that we don't have a whizzy on-line demo for people to play with. I suspect that will change in a year, but until then you'll have to make do with the screen shots.

HOWEVER, I do want to add that the system does run on standard, ordinary PC hardware. The indexer currently runs in 3x realtime (so a half-hour wavefile takes 1.5 hours to index) on a P3-500 running Red Hat 6.0 with 512 MB of memory. It deposits its product in a SQL Server database on an equivalent PC running NT (no jeers, please). So the analysis and querying/browsing are decoupled.

We plan on having this down to realtime this year, both through algorithmic improvements and some additional hardware.
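For scale, the arithmetic behind the 3x figure can be sketched like this. The function names are made up for illustration, and the continuous 24-hour stream is an assumed example, not something stated in the post.

```python
import math

def indexing_hours(audio_hours, realtime_factor=3.0):
    """Hours of processing needed for a given amount of audio."""
    return audio_hours * realtime_factor

def machines_needed(audio_hours_per_day, realtime_factor=3.0):
    """Machines required to keep up with a daily volume of audio,
    assuming each machine can work 24 hours a day."""
    return math.ceil(audio_hours_per_day * realtime_factor / 24)

# The half-hour wavefile from the post: 0.5 h of audio at 3x realtime.
print(indexing_hours(0.5))        # 1.5
# Assumed example: one station broadcasting around the clock.
print(machines_needed(24))        # 3 at 3x realtime
print(machines_needed(24, 1.0))   # 1 once indexing reaches realtime
```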
Re:Reads like bad poetry. (Score:2)

by gorilla ( 36491 ) writes:

What would be interesting would be to link up to a machine translation (such as babelfish), and then finally text to speech.
Remember (Score:1)

by Shotgun ( 30919 ) writes:

This technology can be tuned for specific words/pronunciations/dialects/languages/etc. I used IBM's VoiceType that came with OS/2. It actually performed fairly well with training, wasn't worth a damn before that though (I have a rather hard Southern accent and a deep voice.)

The Echelon folks wouldn't use this to transcribe phone conversations. They would use it to filter for interesting conversations and then appoint an agent to listen to the real thing. Save a lot of manpower that way.
Linux = iMacs? (Score:1)

by JabberWokky ( 19442 ) writes:

Just did a search for "line next" (exact phrase), and got quite a bit of "and on the line next we have...", a la talk show introductions.
The only computer related ones that I found without searching too hard were two instances of "iMacs" being translated to "line next".
Incidentally, as we all searched for our favorite phrases, I got a pretty good accuracy on finding "Rocky Horror", but got *many* spurious returns on "Tim Curry". It seems to like to put the phrase "Tim Curry" into people's speech, which might mean that it's using a list of names of famous people within its dictionary. It's a good idea, and makes sense; I wish they had more info on the guts.
Oh, and speaking about the engine... I thought that it might store the transcripts in some sort of phonetic format, and then match your search phrase to it (i.e., "your not hyped" and "journal typed" both match the stored, transcribed phonetics). So I did some searches to match the same sections extracted, and they all seem to match. Conclusion: the transcripts appear to be stored as text, not phonetic symbols.
--
Evan
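The text-versus-phonetic distinction in that test can be sketched as below. This is only an illustration: the pronunciation table is a tiny hand-written toy in ARPAbet-style symbols, the function names are invented, and nothing here is known about how SpeechBot actually stores or matches transcripts.

```python
# Toy, hand-written pronunciations; a real system would use a full lexicon.
PRONUNCIATIONS = {
    "red": "R EH D",
    "read": "R EH D",   # past tense, a homophone of "red"
    "hat": "HH AE T",
}

def phonemes(phrase):
    """Turn a phrase into a flat list of phoneme tokens.
    Words missing from the toy table fall back to the word itself."""
    tokens = []
    for word in phrase.lower().split():
        tokens.extend(PRONUNCIATIONS.get(word, word).split())
    return tokens

def contains(seq, sub):
    """True if list `sub` appears as a contiguous run inside list `seq`."""
    n = len(sub)
    return any(seq[i:i + n] == sub for i in range(len(seq) - n + 1))

def text_match(query, transcript):
    """Match on the stored words themselves."""
    return contains(transcript.lower().split(), query.lower().split())

def phonetic_match(query, transcript):
    """Match on how the words sound rather than how they are spelled."""
    return contains(phonemes(transcript), phonemes(query))
```

With a transcript of "we installed red hat", a search for "read hat" fails under text_match but succeeds under phonetic_match. Probing an index with sound-alike queries this way hints at which representation it uses.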
Re:Impressive...? (Score:1)

by Zurk ( 37028 ) writes:

Any chance of GPLising and releasing it? Or is this going to be locked down proprietary forever?
re: Online Speech Indexing (Score:1)

by Rocketboy ( 32971 ) writes:

Ok, so I live in a box. Who's Sarah Michelle Gellar?
Speech processing (Score:1)

by Anonymous Coward writes:

CmdrTaco said "What if I could find every instance of someone saying Sarah Michelle Gellar on TV?" You can. There are TV cards for the PC that can scan all of the closed captioning data on all of the channels at once for a word or phrase. I remember reading about this about three years ago. I think you can switch to that channel as soon as it is said. Anyone know more? -by WGS
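A minimal sketch of that kind of caption scan, assuming the card's software already hands over decoded caption text as (channel, timestamp, text) tuples. That input format and the function name are hypothetical; no real TV card API is being described.

```python
def find_mentions(captions, phrase):
    """Return (channel, timestamp_seconds) for every caption line
    that contains the phrase, ignoring case.

    captions: iterable of (channel, timestamp_seconds, text) tuples
              from a hypothetical caption decoder.
    """
    needle = phrase.lower()
    return [(channel, ts)
            for channel, ts, text in captions
            if needle in text.lower()]

# Made-up sample data for illustration only.
sample = [
    (12, 30.0, "Tonight, an interview with Sarah Michelle Gellar."),
    (40, 31.5, "In other news, quantum computing takes a step forward."),
    (12, 95.0, "SARAH MICHELLE GELLAR joins us after the break."),
]
print(find_mentions(sample, "sarah michelle gellar"))
```

A phrase that the captioner splits across two caption lines would be missed by this line-at-a-time check; joining consecutive lines per channel before searching is one way around that.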
Hmm needs a little work (Score:1)

by FoulBeard ( 112622 ) writes:

My results varied wildly depending on the input. For instance, I searched "Geeks in Space" for the word "slashdot" and came up with no hits.
Are these algorithms open source by any chance? I looked into voice algorithms at one point, and it made me run back for my Knuth book. I never found any real source for it. It is a fascinating subject and I am sure the open source community could do a fantastic implementation of it. :)
I'll stop rambling now.
Re:Some interesting bytes. (Score:1)

by UnclPedro ( 67702 ) writes:

"Linux Underground" (from GiS 3.1) came across as "limits underground", so try searching on "limits".

------
Impressive...? (Score:1)

by Midnight Ryder ( 116189 ) writes:

Wow - I looked at it, and well, it looks damned impressive to me! Of course - there's nothing to really see but the screen shots, and of course information about what it can do.

I'd love to see demos of this stuff - I could have this stuff filter the TV for news for me :-) Or even better - I'd never have to take down notes after GMing a gaming session - I'd just let this thing transcribe the session on the fly!

Of course, I'm figuring the software that does the speech to text and indexing probably doesn't run on plain ol' PC hardware with Windows 98 or NT loaded
