AI Technology

Nvidia's Riva Custom Voice Lets Companies Create Custom Voices Powered by AI (venturebeat.com) 26

At its fall 2021 GPU Technology Conference (GTC), Nvidia unveiled Riva Custom Voice, a new toolkit that the company claims can enable customers to create custom, "human-like" voices with only 30 minutes of speech recording data. From a report: According to Nvidia, businesses can use Riva Custom Voice to develop a virtual assistant with a unique voice, while call centers and developers can leverage it to launch brand voices and apps to support people with speech and language disabilities. Brand voices like Progressive's Flo are often tasked with recording phone trees and elearning scripts in corporate training video series. For companies, the costs can add up -- one source pegs the average hourly rate for voice actors at $39.63, plus additional fees for interactive voice response (IVR) prompts. Synthesization could boost actors' productivity by cutting down on the need for additional recordings, potentially freeing the actors up to pursue more creative work -- and saving businesses money in the process. For example, Progressive used AI to create a Facebook Messenger chatbot with the voice of Stephanie Courtney, who plays Flo. KFC in Canada built a voice in a Southern U.S. English accent for the chain's ambassador, Colonel Sanders, in the company's Amazon Alexa app. Duolingo is employing AI to create voices for characters in its language learning apps. And National Australia Bank has deployed an AI-powered Australian English voice for the customers who call into its contact centers.
This discussion has been archived. No new comments can be posted.

  • by cirby ( 2599 ) on Tuesday November 09, 2021 @09:28AM (#61971069)

    "potentially freeing the actors up to pursue more creative work"

    In other words, only paying an actor for a short session instead of a full day, with no retakes.

    Never mind that other companies will also be "freeing up" actors at the same time, creating much less overall work for those actors.

    • Or the actors could record a standard set of voice recordings, then license their "voice". Passive income, as they won't need to be present to do future recordings.

      Of course, they're going to want a clause in the license about script approval. Otherwise someone will create racist screeds in the actor's voice, etc.

  • This "potentially freeing the actors up to pursue more creative work" sounds so much better than "sorry we don't need you anymore".
  • by Arnonyrnous Covvard ( 7286638 ) on Tuesday November 09, 2021 @09:33AM (#61971089)
    Verify Me.
  • ... is back!

  • I wonder if someone can take 30 minutes of a famous actor by just poaching from movie scenes and using that as a source without permission. Rips of people like Samuel L Jackson are going to be popular.

    • by ceoyoyo ( 59147 )

      Yes.

      A more interesting question is whether you'll be able to take a famous actor's voice and mix it with something else juuuust enough so that it's different enough not to get sued.

  • There's a line about how this will free up voice actors for more opportunities. It then mentions that voice actors are paid hourly. That's not freeing you up for more opportunities; that's cutting your hours and your pay. Instead of being paid to come in and record lines periodically, you'll get a paycheck for 30 minutes and they'll never call you again, because they don't need you now that they've got the data from your voice. But to read this article you'd think we'd entered some sort of utopia.
    • The thing you have to keep in mind when it comes to articles like this is that we're typically getting the perspective of whatever business is going to be able to use this tech to save money. In their view, this will be a utopia. They cut the massive cost of paying actual actors, and literally get to pocket that money from here on out. That's a huge managerial / accounting win.

      It's the same way any management feels about being able to still get the job done while paying fewer people to do it. They

  • Can anyone record me publicly, create a voice print, then use it to produce works with my voice, all without my approval?

    The answer to this question has an enormous impact on what this technology means for society, for consumers, and for artists.

    My great hope for this technology is to support dynamic role-playing games. The ability for a modder to create a new module using a game's existing characters and voices is a "game changer". Or - even better - procedurally generated NPCs and dialogue the likes of wh

    • by Kisai ( 213879 )

      Yes and No.

      The easiest way to prevent your voice from being "deep-fake"-able is to never have a 1:1 conversation without background noise or music. Deepfakes are only possible because a high-quality recording of the voice exists somewhere. Resemble AI only needs a 5-second clip of a voice to make something that sounds "reasonably" like someone else, but it will not copy their accent or voice pitch. That requires additional work.

      Neural voices are all trained on LJSpeech, so they will all sound somewhat English

  • "I have the best voice, and I know voices, believe me! Billions love to hear it every day, and play it over and over. Even CNN likes it; they put me on all the time to boost their loser ratings. There's even a MyPillow speaker with my voice so you can hear me in your dreams, wonderful wonderful dreams, like orange feathery silk, not unlike my terrific hair. Bigly Smoooth. #MVGA!"

  • Sadly, the residuals would make this cost-prohibitive.

  • by Junta ( 36770 ) on Tuesday November 09, 2021 @12:20PM (#61971487)

    For their RivaTNT variant.

  • I've been playing with TTS and ASR stuff for over a year (well much longer if you count certain things, but have only made practical use of it in the last year.)

    The ability to "quick train" using samples of someone's voice has been possible for around two years. This is what "Resemble"'s stuff is, or at least how it started out. You can find their original project on GitHub; it works, but it's exceptionally low-quality. Basically the style transfer works, but it transfers gaps in audio as well, so it tends to sound

  • 3Dfx Voodoo Custom Voice Lets Companies Create Custom Voices Powered by AI
