'Inaudible' Watermark Could Identify AI-Generated Voices (techcrunch.com) 39
The growing ease with which anyone can create convincing audio in someone else's voice has a lot of people on edge, and rightly so. Resemble AI's proposal for watermarking generated speech may not fix it in one go, but it's a step in the right direction. From a report: AI-generated speech is being used for all kinds of legitimate purposes, from screen readers to replacing voice actors (with their permission, of course). But as with nearly any technology, speech generation can be turned to malicious ends as well, producing fake quotes by politicians or celebrities. It's highly desirable to find a way to tell real from fake that doesn't rely on a publicist or close listening.
[...] Resemble AI is among a new cohort of generative AI startups aiming to use finely tuned speech models to produce dubs, audiobooks, and other media ordinarily produced by regular human voices. But if such models, perhaps trained on hours of audio provided by actors, were to fall into malicious hands, these companies may find themselves at the center of a PR disaster and perhaps serious liability. So it's very much in their interest to find a way to make their recordings both as realistic as possible and easily verifiable as being generated by AI.
Re: (Score:3)
When you speak "on the record", it will activate and emit a tone just above/below the range of human hearing that contains a PKI-encrypted digital version of whatever the fuck you just said. This will allow all human speech to be verified by the blockchain. Just make sure you don't give a computer a copy of your voice certificate, or anyone can impersonate you.
A market will ope
Re: (Score:2)
This will allow all human speech to be verified by the blockchain.
Even in this future-tech fantasy, "the blockchain" is still a pointless bolt-on. In the future, we'll all wonder why anyone thought it was a good idea.
Re: (Score:2)
I don't see how a watermark would fix anything. All someone has to do is release or alter the program so it doesn't insert a watermark.
Even a watermark that is impossible to remove is useless if I can generate my own audio without one.
Isn't that the problem we are trying to solve: people producing fakes, so voice or video is no longer believable proof?
Watermarking will only stop people from selling other people's generated content, not from making their own.
Re: (Score:2)
If you're generating your own, then you have your own model that you're training yourself, no? Seems like enough work to be a deterrent.
Re: (Score:2)
(Note: "doing it yourself" is only a deterrent for individuals, not corporations or nations. If Russia wants to fake up some Biden speeches and distribute them, they're going to.)
Not possible (Score:2)
If irremovable, inaudible, and/or invisible watermarks were possible, piracy would have been finished long ago. See also the "Analog Hole".
Re: (Score:3)
Blu-ray discs *DO* have a yet-to-be-cracked audio watermark (Cinavia). Try to play a copy on a standard consumer Blu-ray player and about 20 minutes in, it pops up a message telling you that you are playing a bootlegged copy and disables the audio. Right now there are two workarounds... use an audio source that isn't from the Blu-ray (the DVD release, for instance, though they have added the watermark there in recent years too), or compress and mangle the audio *so much* that it passes, but then it sounds like complete
Re: (Score:2)
Re: (Score:2)
Audio watermarks are in inaudible portions of audio. You know, portions that are thrown away by lossy encoding.
This is why, when people cite Nyquist as a reason to keep sampling rates at 22 kHz, they are #1, wrong, and #2, not understanding that there is more to audio than just the part you can hear inside your head.
People with poor hearing, who destroyed their hearing in their teens by listening to loud music, don't understand all these other sounds other people hear or even feel.
So to detect AI, what you ha
Re: (Score:2)
Not sure how they could guarantee it in the first place. Currently, this kind of software is typically provided as a service by various providers, but there's no reason to think that will always be the case. There's no reason to think software that can do this won't proliferate, or that there won't be versions that don't insert the watermark, let you disable it, or can simply be compromised somehow.
Watermarks won't help (Score:4, Insightful)
Problem is, while the presence of a watermark might be useful for proving that audio DID come from an AI, the absence of a watermark is not proof that it didn't. And proving that audio didn't come from an AI is the more critical use case.
Re: (Score:1)
Re: (Score:2)
Re: (Score:2)
What's to stop people from reverse-engineering the watermark and applying to non-AI voice recordings?
Presumably, the watermarks would be some kind of cryptographic signature that could be verified.
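If that's the design, the verification step could look something like this rough Python sketch. Everything here is hypothetical (the article doesn't describe Resemble AI's actual scheme): a real system would presumably use an asymmetric signature embedded in the audio itself, but an HMAC over the raw samples shows the basic shape of a tag that can be checked but not forged without the key.

```python
import hashlib
import hmac

# Hypothetical vendor-held secret. A real deployment would use an
# asymmetric key pair so that verifiers never hold the signing key.
KEY = b"vendor-secret"

def tag_audio(samples: bytes) -> bytes:
    """MAC over the raw samples; a real watermark would hide this tag
    (or a signature) inside the audio rather than alongside it."""
    return hmac.new(KEY, samples, hashlib.sha256).digest()

def verify(samples: bytes, tag: bytes) -> bool:
    """True only if the tag matches these exact samples."""
    return hmac.compare_digest(tag_audio(samples), tag)

audio = b"\x01\x02\x03"  # stand-in for PCM sample data
tag = tag_audio(audio)
```

Which also answers the parent's question: without the key, re-applying the mark to a non-AI recording shouldn't verify. The catch, as other comments note, is that a tag only proves something *was* generated; its absence proves nothing.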
This will only make things worse (Score:5, Insightful)
Everything old is new again... (Score:5, Insightful)
In 1998, the Secure Digital Music Initiative [wikipedia.org] was formed. One of their goals was to inaudibly watermark music so that SDMI-compliant players would recognize copyrighted music and refuse to play it if the user didn't have the appropriate permission.
Hackers almost immediately developed methods to remove the watermark, and SDMI folded a couple of years later.
Re: (Score:2)
Doesn't need a watermark, yet (Score:2)
Frequency cuts? (Score:1)
Just a thought, but I think all you'd need to remove any audio watermark is to know the frequencies in which the watermark is embedded, then cut those frequencies with a notch (simple to do with a floating-band dynamic EQ), and render a new audio file. /source - audio engineer for 3 decades.
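As a toy illustration of that idea (not the poster's actual workflow), here's a minimal pure-Python sketch that "notches out" a hypothetical watermark tone by zeroing the DFT bins around its frequency. The sample rate, tone frequencies, and notch width are all made up for the demo.

```python
import cmath
import math

def dft(x):
    """Naive O(N^2) discrete Fourier transform."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N)
                for n in range(N)) for k in range(N)]

def idft(X):
    """Inverse DFT, keeping only the real part."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * n / N)
                for k in range(N)).real / N for n in range(N)]

def notch(x, sr, f0, width):
    """Zero all DFT bins within `width` Hz of f0 (and its mirror image)."""
    N = len(x)
    X = dft(x)
    for k in range(N):
        f = k * sr / N
        if abs(f - f0) <= width or abs((sr - f) - f0) <= width:
            X[k] = 0
    return idft(X)

sr, N = 1000, 200            # toy sample rate and length
speech_f, mark_f = 100, 400  # "speech" tone and hypothetical watermark tone
x = [math.sin(2 * math.pi * speech_f * n / sr)
     + 0.1 * math.sin(2 * math.pi * mark_f * n / sr) for n in range(N)]
y = notch(x, sr, mark_f, 10)  # watermark tone gone, speech tone intact
```

A real watermark is presumably spread across many bands and modulated over time rather than parked at one frequency, which is why the poster reaches for a floating-band dynamic EQ rather than a static notch.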
Re: (Score:2)
Re: (Score:1)
that makes sense. I was thinking purely audio.
Is that like the metal in plastic explosives? (Score:2)
As in, you'll only have it in those cases where it's not going to be used for shady purposes anyway?
only as trustworthy as the actors (Score:2)
They did the same thing to try to dope synthetic diamonds.
The moment there is a value to be gained by faking something, the 'rules' will be broken.
How exactly will this work? (Score:1)
excellent plan; i expect this will solve the deepf (Score:2)
Just as easily as the malicious [ietf.org] Network actor problems were solved
Naive.. (Score:2)
AI (Score:2)
Someone will release another AI that figures out the watermark and removes it. After about a week, this will be a plugin for Audacity.
Or maybe a filter called "Analog" will remove any inaudible sounds.
In any event, the point of the watermark is to say that something is FAKE. So, absent a watermark in the first place, the recording is TRUE. Like everything is now. The watermark is not a digitally signed certificate from the content creator. The watermark does not authenticate things; it de-authenticates the
Re: (Score:2)
What was the point of this again?
Oh, right.
Blockchain.