Google's One Step Closer To Building Its 1,000-Language AI Model
Google's progressing toward its goal of building an AI language model that supports 1,000 different languages. The Verge reports: In an update posted on Monday, Google shared more information about the Universal Speech Model (USM), a system Google describes as a "critical first step" in realizing its goals. Last November, the company announced its plans to create a language model supporting 1,000 of the world's most-spoken languages while also revealing its USM model. Google describes USM as "a family of state-of-the-art speech models" with 2 billion parameters trained on 12 million hours of speech and 28 billion sentences across over 300 languages.
USM, which YouTube already uses to generate closed captions, also supports automatic speech recognition (ASR). This automatically detects and translates languages, including English, Mandarin, Amharic, Cebuano, Assamese, and more. Right now, Google says USM supports over 100 languages and will serve as the "foundation" to build an even more expansive system. You can read more about USM and how it works in the research paper Google posted here.
Finally (Score:2)
Finally we can get closed captions in Tetawo.
Mumbling like a drunk in 1000 languages (Score:2)
life couldn't get better.
All those parameters! (Score:2)
2 billion parameters, all to deliver the worst closed captioning I've ever seen. Impressive!
Re: (Score:2)
2 billion parameters isn't much. That'll easily run on a rather low-end modern consumer-grade GPU.
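For a rough sense of scale, here is a back-of-the-envelope sketch of how much memory the weights alone would take. It assumes common storage widths (4, 2, or 1 bytes per parameter) and ignores activations and runtime overhead, none of which the summary specifies, so treat the numbers as a lower bound rather than a real requirement.

```python
# Back-of-the-envelope weight memory for a 2-billion-parameter model.
# Assumption: weights only; activations, caches and framework overhead are ignored.
PARAMS = 2_000_000_000  # parameter count quoted in the summary

# Assumed storage widths in bytes per parameter (not stated in the source).
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_gib(params: int, bytes_per_param: int) -> float:
    """Return the weight footprint in GiB (2**30 bytes)."""
    return params * bytes_per_param / 2**30


for dtype, width in BYTES_PER_PARAM.items():
    print(f"{dtype}: {weight_gib(PARAMS, width):.2f} GiB")
```

Under these assumptions the weights come to about 7.45 GiB at 32-bit, 3.73 GiB at 16-bit, and 1.86 GiB at 8-bit, which is within reach of many consumer cards at reduced precision.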
Re: (Score:3)
Really?
I turn it on when a Brit is slurring and I can't make out a damn word he's saying.
Results are impressive.
At the same time my "Google Assistant" is damn-near useless for dictation.
Count me impressed (Score:3)
Re: (Score:2)
>++++++++[-]++++[-]>++++++[-]++++++[-]>>++++[-]+.
Re: (Score:2)
Languages maybe, but accents and dialects? (Score:2)
For various, unrelated reasons I have seen a variety of Google training videos in the past several weeks. Two problems are apparent:
First, Google has a wide variety of people presenting videos, even in the same series. One time, you get a thick Indian accent, the next time Chinese, the next time something else. You spend so much time struggling to understand the words that the meaning behind them gets lost.
Second, getting back to TFA: their automa
Re: Languages maybe, but accents and dialects? (Score:2)
Re: (Score:2)
Re: (Score:1)
The dividing line between how different two speech patterns have to be, to be different dialects as opposed to merely different accents, is quite blurry, and so for that matter is the dividing line between being different enough to count as distinct languages, vs merely different dialects of the same language. In principle, accents are mostly about pronunciation and word choi
Not news yet? (Score:2)
What would be news is if it outperforms existing solutions.
Simon Munnery (Score:2)
So, the best source of puns... (Score:2)
... ever.
not impressed (Score:2)
If YouTube closed captions are any indication, this technology is not very impressive. There are constant mistakes, to the point where I have to turn them off because of the distraction, even though I generally prefer them to be on due to the inconsistent audio levels on YT.
Re: (Score:1)