Nvidia's Riva Custom Voice Lets Companies Create Custom Voices Powered by AI (venturebeat.com)
At its fall 2021 GPU Technology Conference (GTC), Nvidia unveiled Riva Custom Voice, a new toolkit that the company claims can enable customers to create custom, "human-like" voices with only 30 minutes of speech recording data. From a report: According to Nvidia, businesses can use Riva Custom Voice to develop a virtual assistant with a unique voice, while call centers and developers can leverage it to launch brand voices and apps to support people with speech and language disabilities. Brand voices like Progressive's Flo are often tasked with recording phone trees and elearning scripts in corporate training video series. For companies, the costs can add up -- one source pegs the average hourly rate for voice actors at $39.63, plus additional fees for interactive voice response (IVR) prompts. Synthesization could boost actors' productivity by cutting down on the need for additional recordings, potentially freeing the actors up to pursue more creative work -- and saving businesses money in the process. For example, Progressive used AI to create a Facebook Messenger chatbot with the voice of Stephanie Courtney, who plays Flo. KFC in Canada built a voice in a Southern U.S. English accent for the chain's ambassador, Colonel Sanders, in the company's Amazon Alexa app. Duolingo is employing AI to create voices for characters in its language learning apps. And National Australia Bank has deployed an AI-powered Australian English voice for the customers who call into its contact centers.
Re: AI is not real (Score:3)
Maybe it's the concept you file under "AI" that's not real and you need to change your understanding.
Also, animated characters with good lip sync (Score:2)
About that... (Score:3)
"potentially freeing the actors up to pursue more creative work"
In other words, only paying an actor for a short session instead of a full day, with no retakes.
Never mind that other companies will also be "freeing up" actors at the same time, creating much less overall work for those actors.
Re: About that... (Score:2)
Or the actors could record a standard set of voice recordings, then license their "voice". Passive income, as they won't need to be present to do future recordings.
Of course, they're going to want a clause in the license about script approval. Otherwise someone will create racist screeds in the actor's voice, etc.
You're fired! (Score:1)
My. Voice. Is. My. Passport? (Score:4, Funny)
The AFLAC Duck ... (Score:2)
Re: (Score:1)
No, that's Joan Rivers.
voice source? (Score:2)
I wonder if someone can take 30 minutes of a famous actor's speech by just poaching from movie scenes and use that as a source without permission. Rips of people like Samuel L. Jackson are going to be popular.
Re: (Score:2)
Yes.
A more interesting question is whether you'll be able to take a famous actor's voice and mix it with something else juuuust enough so that it's different enough not to get sued.
Orwellian (Score:2)
Re: (Score:2)
The thing you have to keep in mind when it comes to articles like this is that we're typically getting the perspective of whatever business is going to be able to use this tech to save money. In their view, this will be a utopia. They stop losing a massive chunk of profit to paying actual actors, and literally get to pocket that money from here on out. That's a huge managerial / accounting win.
It's the same way any management feels about being able to still get the job done while paying fewer people to do it. They
What property protections does your voice hold? (Score:2)
Can anyone record me publicly, create a voice print, then use it to produce works with my voice, all without my approval?
The answer to this question has an enormous impact on what this technology means for society, for consumers, and for artists.
My great hope for this technology is to support dynamic role-playing games. The ability for a modder to create a new module using a game's existing characters and voices is a "game changer". Or - even better - procedurally generated NPCs and dialogue the likes of wh
Re: (Score:2)
Yes and No.
The easiest way to prevent your voice from being "deep-fake"-able is to never have a 1:1 conversation without background noise or music. Deepfakes are only possible because a high-quality recording of the voice exists somewhere. Resemble AI only needs a 5-second clip of a voice to make something "reasonable" sounding like someone else, but it will not copy their accent or voice pitch. That requires additional work.
Neural voices are all trained on LJSpeech, so they will all sound somewhat English
Already got mine... (Score:1)
"I have the best voice, and I know voices, believe me! Billions love to hear it every day, and play it over and over. Even CNN likes it; they put me on all the time to boost their loser ratings. There's even a MyPillow speaker with my voice so you can hear me in your dreams, wonderful wonderful dreams, like orange feathery silk, not unlike my terrific hair. Bigly Smoooth. #MVGA!"
Majel Barrett or GTFO (Score:2)
Sadly, the residuals would make this cost-prohibitive.
I'm going to wait... (Score:3)
For their RivaTNT variant.
Re: (Score:2)
Don't get excited (Score:2)
I've been playing with TTS and ASR stuff for over a year (well much longer if you count certain things, but have only made practical use of it in the last year.)
The ability to "quick train" using samples of someone's voice has been possible for around two years. This is what "Resemble"'s stuff is, or at least how it started out. You can find their original project on GitHub; it works, but it's exceptionally low-quality. Basically the style transfer works, but it transfers gaps in audio as well, so it tends to sound
And in a parallel universe... (Score:2)
3Dfx Voodoo Custom Voice Lets Companies Create Custom Voices Powered by AI
Does anyone think this will be beneficial? (Score:2)