'Inaudible' Watermark Could Identify AI-Generated Voices (techcrunch.com) 39

The growing ease with which anyone can create convincing audio in someone else's voice has a lot of people on edge, and rightly so. Resemble AI's proposal for watermarking generated speech may not fix it in one go, but it's a step in the right direction. From a report: AI-generated speech is being used for all kinds of legitimate purposes, from screen readers to replacing voice actors (with their permission, of course). But as with nearly any technology, speech generation can be turned to malicious ends as well, producing fake quotes by politicians or celebrities. It's highly desirable to find a way to tell real from fake that doesn't rely on a publicist or close listening.

[...] Resemble AI is among a new cohort of generative AI startups aiming to use finely tuned speech models to produce dubs, audiobooks, and other media ordinarily produced by regular human voices. But if such models, perhaps trained on hours of audio provided by actors, were to fall into malicious hands, these companies may find themselves at the center of a PR disaster and perhaps serious liability. So it's very much in their interest to find a way to make their recordings both as realistic as possible and easily verifiable as being generated by AI.

This discussion has been archived. No new comments can be posted.

  • by eth1 ( 94901 ) on Wednesday February 08, 2023 @12:08PM (#63275523)

    Problem is, while the presence of a watermark might be useful for proving that audio DID come from an AI, the absence of a watermark is not proof that it didn't. And proving that audio didn't come from an AI is the more critical use case.

    • For things like dubs and audiobooks, it could be useful for consumers to verify that their materials are authentic, depending on how difficult it is to counterfeit the watermark. But I agree, it won't do anything to stop people who are trying to deepfake someone's voice.
    • What's to stop people from reverse-engineering the watermark and applying it to non-AI voice recordings?
      • by eth1 ( 94901 )

        What's to stop people from reverse-engineering the watermark and applying it to non-AI voice recordings?

        Presumably, the watermarks would be some kind of cryptographic signature that could be verified.

  • by S_Stout ( 2725099 ) on Wednesday February 08, 2023 @12:09PM (#63275525)
    If the watermark is not there, people will say that it is not AI when it is. It will only embolden people to say a fake is real.
  • by Will_Malverson ( 105796 ) on Wednesday February 08, 2023 @12:10PM (#63275529) Journal

    In 1998, the Secure Digital Music Initiative [wikipedia.org] was formed. One of their goals was to inaudibly watermark music so that SDMI-compliant players would recognize copyrighted music and refuse to play it if the user didn't have the appropriate permission.

    Hackers almost immediately developed methods to remove the watermark, and SDMI folded a couple of years later.

    • Is that the one where they made a "notch" in the sound, supposedly inaudible? Turns out anyone with a half decent stereo could easily hear it and it made it sound like shit.
  • They all sound easily identifiable at present, because ML models discard the phase information of the audio in their training datasets. They train only on mel spectrograms, at least in all the papers I've seen from Facebook, Google, etc.; these use only the power spectrum when training. Disregarding the half of the information carried in the phase spectrum is not free. It's just that they cannot easily train with the phase spectrum information, because it is chaotic and any averaging over training batches
  • Just a thought, but I think all you'd need to remove any audio watermark is to know the frequencies in which the watermark is recorded, then determine the frequency response Q curve needed to cut that frequency (this is simple to do with a Floating-Band Dynamic EQ). Then render the new audio file. /source - audio engineer for 3 decades.

    • by DavenH ( 1065780 )
      Any watermark worth its Chebyshev filters isn't going to be a few static frequencies. More likely it'd be an image or QR code periodically overlaid onto the power spectrum. Like the "Aphex face" in Aphex Twin - Windowlicker.
  • As in, you'll only have it in those cases where it's not going to be used for shady purposes anyway?

  • They did the same thing to try to dope synthetic diamonds.

    The moment there is a value to be gained by faking something, the 'rules' will be broken.

  • Will I have to disable my watermark filtering AI when creating my artificial audio production?
  • Just as easily as the malicious [ietf.org] Network actor problems were solved

  • And then they'll just use a version of the AI that doesn't add the watermark.
  • by cstacy ( 534252 )

    Someone will release another AI that figures out the watermark and removes it. After about a week, this will be a plugin for Audacity.

    Or maybe a filter called "Analog" will remove any inaudible sounds.

    In any event, the point of the watermark is to say that something is FAKE. So, absent a watermark in the first place, the recording is TRUE. Like everything is now. The watermark is not a digitally signed certificate from the content creator. The watermark does not authenticate things; it de-authenticates the
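The asymmetry several commenters point out (a valid mark shows a clip came from the generator, while a missing mark shows nothing) can be sketched in a few lines. This is only an illustration under loud assumptions: it uses a keyed HMAC tag over the raw audio bytes as a stand-in for a watermark, and the key, function names, and result strings are all hypothetical. It is not Resemble AI's scheme. A real audio watermark is embedded in the signal itself and has to survive re-encoding, which a byte-level tag does not.

```python
import hashlib
import hmac
from typing import Optional

# Hypothetical secret held only by the speech generator (an assumption for this sketch).
GENERATOR_KEY = b"demo-key-held-by-the-generator"


def make_tag(audio: bytes, key: bytes = GENERATOR_KEY) -> bytes:
    """Return an HMAC-SHA256 tag bound to the exact audio bytes."""
    return hmac.new(key, audio, hashlib.sha256).digest()


def classify(audio: bytes, tag: Optional[bytes], key: bytes = GENERATOR_KEY) -> str:
    """Report what a verifier can actually conclude from a clip and its tag."""
    if tag is not None and hmac.compare_digest(tag, make_tag(audio, key)):
        # Only this branch is evidence of anything.
        return "generated by the key holder"
    # Missing or non-matching tag: the clip may be a human recording,
    # a stripped fake, or output from a generator that never tags.
    return "unknown origin"


if __name__ == "__main__":
    generated = b"\x00\x01\x02\x03" * 100   # placeholder bytes, not real audio
    tag = make_tag(generated)

    print(classify(generated, tag))                  # tag matches the clip
    print(classify(generated, None))                 # tag stripped
    print(classify(b"some other recording", tag))    # tag copied onto another clip
```

Because the tag depends on both the secret key and the content, copying it onto a different recording fails verification, which is the property eth1's "cryptographic signature" reply assumes. Stripping it, as cstacy and others predict, still leaves the verifier with "unknown origin" rather than "real".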
