'Inaudible' Watermark Could Identify AI-Generated Voices (techcrunch.com) 39

Posted by msmash on Wednesday February 08, 2023 @12:01PM from the how-about-that dept.

The growing ease with which anyone can create convincing audio in someone else's voice has a lot of people on edge, and rightly so. Resemble AI's proposal for watermarking generated speech may not fix it in one go, but it's a step in the right direction. From a report: AI-generated speech is being used for all kinds of legitimate purposes, from screen readers to replacing voice actors (with their permission, of course). But as with nearly any technology, speech generation can be turned to malicious ends as well, producing fake quotes by politicians or celebrities. It's highly desirable to find a way to tell real from fake that doesn't rely on a publicist or close listening.

[...] Resemble AI is among a new cohort of generative AI startups aiming to use finely tuned speech models to produce dubs, audiobooks, and other media ordinarily produced by regular human voices. But if such models, perhaps trained on hours of audio provided by actors, were to fall into malicious hands, these companies may find themselves at the center of a PR disaster and perhaps serious liability. So it's very much in their interest to find a way to make their recordings both as realistic as possible and easily verifiable as being generated by AI.

'Inaudible' Watermark Could Identify AI-Generated Voices

This discussion has been archived. No new comments can be posted.

Load All Comments

Search 39 Comments Log In/Create an Account

Comments Filter:

- Re: (Score:3)
  
  by darkpixel2k ( 623900 ) writes:
  
  ...we're going to design a device that's sorta like an Electrolarynx that you will fit over your throat or embed under the skin.
  When you speak "on the record", it will activate and emit a tone just above/below the range of human speech that contains a PKI encrypted digital version of whatever the fuck you just said. This will allow all human speech to be verified by the blockchain. Just make sure you don't give a computer a copy of your voice certificate, or anyone can impersonate you.
  
  A market will ope
  - Re: (Score:2)
    
    by narcc ( 412956 ) writes:
    
    This will allow all human speech to be verified by the blockchain.
    Even in this future-tech fantasy, "the blockchain" is still a pointless bolt-on. In the future, we'll all wonder why anyone thought it was a good idea.
- Re: (Score:2)
  
  by ewibble ( 1655195 ) writes:
  
  I don't see how a watermark would fix anything. All someone has to do is release/alter the program so it doesn't insert a watermark.
  Even a watermark that is impossible to remove is useless if I can generate my own without a watermark.
  Isn't that the problem we are trying to solve, people producing fakes so now voice or video is no longer believable proof.
  Watermarking will only stop people selling other peoples generated content, not making their own.
  - Re: (Score:2)
    
    by suutar ( 1860506 ) writes:
    
    If you're generating your own, then you have your own model that you're training yourself, no? Seems like enough work to be a deterrent.
    - Re: (Score:2)
      
      by suutar ( 1860506 ) writes:
      
      (Note: "doing it yourself" is only a deterrent for individuals, not corporations or nations. If Russia wants to fake up some Biden speeches and distribute them, they're going to.)
- Not possible (Score:2)
  
  by Comboman ( 895500 ) writes:
  
  If irremovable inaudible and/or invisible watermarks were possible, piracy would have been finished long ago. See also "Analog Hole".
  - Re: (Score:3)
    
    by TheRealMindChild ( 743925 ) writes:
    
    Bluray discs *DO* have a yet-to-be-cracked audio watermark (Cinavia). Try to play a a copy on a standard consumer bluray player and about 20 minutes it, it pops up a message telling you that are playing a bootlegged copy and disables the audio. Right now there are two solutions for it... use an audio source that isn't from the bluray (dvd release for instance, but they have added the watermark there in recent years too), or compress and mangle the audio *so much* that it passes, but it sounds like complete
    - - Re: (Score:2)
        
        by aRTeeNLCH ( 6256058 ) writes:
        
        If the bit rate is that low, just cut off anything below 20Hz. Most people don't have sound systems that can output in that low range anyway...
- Re: (Score:2)
  
  by Kisai ( 213879 ) writes:
  
  Audio watermarks are in inaudible portions of audio. You know, portions that are thrown away by lossy encoding.
  This is why when people cite nyquist as a reason to keep sampling rates at 22khz, they are #1, wrong, and #2, not understanding that there is more to audio than just the part you can hear inside your head.
  People with poor hearing, who destroyed their hearing in their teens by listening to loud music, don't understand all these other sounds other people hear or even feel.
  So to detect AI, what you ha
- Re: (Score:2)
  
  by tragedy ( 27079 ) writes:
  
  Not sure how they could guarantee it in the first place. Currently, the kind of software they're talking about is typically provided as a service from various providers, but there is no reason to think that will always be the case. There's absolutely no reason to think that software that can do this won't proliferate or that there won't be versions of it that don't put in a watermark or where you can stop it from being put in to start with. Or maybe even compromise it somehow.
Watermarks won't help (Score:4, Insightful)

by eth1 ( 94901 ) writes: on Wednesday February 08, 2023 @12:08PM (#63275523)

Problem is, while the presence of a watermark might be useful for proving that audio DID come from an AI, the absence of a watermark is not proof that it didn't. And proving that audio didn't come from an AI is the more critical use case.

- Re: (Score:1)
  
  by twitch1982 ( 1833278 ) writes:
  
  For things like dubs and audio books, it could be useful for consumers to verify that their materials are authentic depending on how difficult it is to counterfeit the watermark, But I agree, it won't do anything to stop people who are trying to deep fake someone's voice.
- Re: (Score:2)
  
  by skinfaxi ( 212627 ) writes:
  
  What's to stop people from reverse-engineering the watermark and applying to non-AI voice recordings?
  - Re: (Score:2)
    
    by eth1 ( 94901 ) writes:
    
    What's to stop people from reverse-engineering the watermark and applying to non-AI voice recordings?
    Presumably, the watermarks would be some kind of cryptographic signature that could be verified.
This will only make things worse (Score:5, Insightful)

by S_Stout ( 2725099 ) writes: on Wednesday February 08, 2023 @12:09PM (#63275525)

If the watermark is not there, people will say that it is not AI when it is. It will only embolden people to say a fake is real.

Everything old is new again... (Score:5, Insightful)

by Will_Malverson ( 105796 ) writes: on Wednesday February 08, 2023 @12:10PM (#63275529) Journal

In 1998, the Secure Digital Music Initiative [wikipedia.org] was formed. One of their goals was to inaudibly watermark music so that SDMI-compliant players would recognize copyrighted music and refuse to play it if the user didn't have the appropriate permission.
Hackers almost immediately developed methods to remove the watermark, and SDMI folded a couple of years later.

- Re: (Score:2)
  
  by Snotnose ( 212196 ) writes:
  
  Is that the one where they made a "notch" in the sound, supposedly inaudible? Turns out anyone with a half decent stereo could easily hear it and it made it sound like shit.
Doesn't need a watermark, yet (Score:2)

by DavenH ( 1065780 ) writes:

They all sound easily identifiable at present, due to the fact that ML models disgard phase information of the audio in their training datasets. They train only on MEL spectrograms, at least all the papers I've seen from Facebook or Google etc; these use the power-spectrum only, when training. Disregarding half of the information in the phase spectrum is not for free. It's just that they cannot easily train with the phase spectrum information, because they're chaotic and any averaging over training batches
Frequency cuts? (Score:1)

by Quake1v1 ( 5328633 ) writes:

Just a thought, but I think all you'd need to remove any audio watermark is to know the frequencies in which the watermark is recorded, then determine the frequency response Q curve needed to cut that frequency (this is simple to do with a Floating-Band Dynamic EQ). Then render the new audio file. /source - audio engineer for 3 decades.
- Re: (Score:2)
  
  by DavenH ( 1065780 ) writes:
  
  Any watermark worth its chebyshev filters isn't going to be a few static frequencies. More likely it'd be an image or QR code periodically overlaid onto the power spectrum. Like the "Aphex face" in Aphex Twin - Windowlicker.
  - Re: (Score:1)
    
    by Quake1v1 ( 5328633 ) writes:
    
    that makes sense. I was thinking purely audio.
Is that like the metal in plastic explosives? (Score:2)

by Opportunist ( 166417 ) writes:

As in, you'll only have it in those cases where it's not going to be used for shady purposes anyway?
only as trustworthy as the actors (Score:2)

by argStyopa ( 232550 ) writes:

They did the same thing to try to dope synthetic diamonds.
The moment there is a value to be gained by faking something, the 'rules' will be broken.
How exactly will this work? (Score:1)

by AlexSledge ( 10102306 ) writes:

Will I have to disable my watermark filtering AI when creating my artificial audio production?
excellent plan; i expect this will solve the deepf (Score:2)

by karlandtanya ( 601084 ) writes:

Just as easily as the malicious [ietf.org] Network actor problems were solved
Naive.. (Score:2)

by SuperDre ( 982372 ) writes:

And then they'll just use a version of the AI that doesn't add the watermark.
AI (Score:2)

by cstacy ( 534252 ) writes:

Someone will release another AI that figures out the watermark and removes it. After about a week, this will be a plugin for Audacity.
Or maybe a filter called "Analog" will remove any inaudible sounds.
In any event, the point of the watermark is to say that something is FAKE. So, absent a watermark in the first place, the recording is TRUE. Like everything is now. The watermark is not a digitally signed certificate from the content creator. The watermark does not authenticate things; it de-authenticates the
- Re: (Score:2)
  
  by cstacy ( 534252 ) writes:
  
  What was the point of this again?
  Oh, right.
  Blockchain.

There may be more comments in this discussion. Without JavaScript enabled, you might want to turn on Classic Discussion System in your preferences instead.

'Inaudible' Watermark Could Identify AI-Generated Voices (techcrunch.com) 39

'Inaudible' Watermark Could Identify AI-Generated Voices More Login

'Inaudible' Watermark Could Identify AI-Generated Voices

Re: (Score:3)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Not possible (Score:2)

Re: (Score:3)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Watermarks won't help (Score:4, Insightful)

Re: (Score:1)

Re: (Score:2)

Re: (Score:2)

This will only make things worse (Score:5, Insightful)

Everything old is new again... (Score:5, Insightful)

Re: (Score:2)

Doesn't need a watermark, yet (Score:2)

Frequency cuts? (Score:1)

Re: (Score:2)

Re: (Score:1)

Is that like the metal in plastic explosives? (Score:2)

only as trustworthy as the actors (Score:2)

How exactly will this work? (Score:1)

excellent plan; i expect this will solve the deepf (Score:2)

Naive.. (Score:2)

AI (Score:2)

Re: (Score:2)

Related Links Top of the: day, week, month.

Slashdot Top Deals

Slashdot