Alexa Scientists Claim Audio Watermarking Technique Nearing 100% Accuracy (venturebeat.com) 85

Posted by EditorDavid on Sunday March 31, 2019 @10:34AM from the audio-identifiers dept.

georgecarlyle76 brought our attention to Amazon's claim of an algorithm that "solves the 'second-screen problem' in real-time."

"Ever hear (no pun intended) of audio watermarking?" asks VentureBeat. It's the process of adding distinctive sound patterns identifiable to PCs, and it's a major way web video hosts, set-top boxes, and media players spot copyrighted tracks. But watermarking schemes aren't particularly reliable in noisy environments, like when the audio in question is broadcasted over a loudspeaker. The resulting noise and interference -- referred to in academic literature as the "second-screen" problem -- severely distorts watermarks, and introduces delays that detectors often struggle to reconcile. Researchers at Amazon, though, believe they've pioneered a novel workaround, which they describe in a paper newly published on the preprint server Arxiv ("Audio Watermarking over the Air with Modulated Self-Correlation") and an accompanying blog post. The team claims their method -- which they'll detail at the International Conference on Acoustics, Speech, and Signal Processing in May -- can detect watermarks added to about two seconds of audio with "almost perfect accuracy," even when the distance between the speaker and detector is greater than 20 feet...

So how's it work? As Tai explains, the model employs a "spread-spectrum" technique in which watermark energy is spread across time and frequency, rendering it inaudible to human ears while robustifying it against postprocessing (like compression). And it generates watermarks from noise blocks of a fixed duration, each of which introduces its own distinct pattern to selected frequency components in the host audio signal. Conventional detectors would compare the resulting sequence of noise blocks -- the decoding key -- with a reference copy. But Tai and colleagues take a different approach: Their algorithm embeds the noise pattern in the audio signal multiple times and compares it to itself. Because said signal passes through the same acoustic environment, Tai explains, instances of the pattern are distorted in similar ways, enabling them to be compared directly. "The detector takes advantage of the distortion due to the acoustic channel, rather than combatting it," he added.
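The self-comparison idea in the summary can be illustrated with a toy model. This is a minimal sketch loosely inspired by the description above, not the method from the Amazon paper: the block size, amplitude, threshold, sign sequence, and the made-up `echo` impulse response are all assumptions chosen for the demo, and every function name is hypothetical. The same noise block is added to each stretch of audio with a secret sign, the result passes through a simulated room, and the detector correlates adjacent blocks of the received signal with each other without ever using a reference copy of the pattern.

```python
import random

BLOCK = 512          # samples per noise block (illustrative value)
NUM_BLOCKS = 8       # how many times the pattern is repeated
AMPLITUDE = 0.7      # exaggerated for the demo; a real watermark is far quieter
THRESHOLD = 0.15     # hypothetical decision threshold


def make_key(seed):
    """Return (pattern, signs): one +/-1 noise block and a +/-1 sign per repetition."""
    rng = random.Random(seed)
    pattern = [rng.choice((-1.0, 1.0)) for _ in range(BLOCK)]
    signs = [rng.choice((-1.0, 1.0)) for _ in range(NUM_BLOCKS)]
    return pattern, signs


def embed(host, pattern, signs):
    """Add the same noise pattern to every block, flipped by that block's sign."""
    out = list(host)
    for k, sign in enumerate(signs):
        for i, p in enumerate(pattern):
            out[k * BLOCK + i] += AMPLITUDE * sign * p
    return out


def room(signal, impulse_response):
    """Crude stand-in for the acoustic channel: convolve with a short echo tail."""
    out = [0.0] * len(signal)
    for n in range(len(signal)):
        acc = 0.0
        for d, h in enumerate(impulse_response):
            if n - d >= 0:
                acc += h * signal[n - d]
        out[n] = acc
    return out


def self_correlation_score(received, signs):
    """Average normalized correlation between adjacent blocks, with signs undone."""
    blocks = [received[k * BLOCK:(k + 1) * BLOCK] for k in range(NUM_BLOCKS)]
    total = 0.0
    for k in range(NUM_BLOCKS - 1):
        a, b = blocks[k], blocks[k + 1]
        dot = sum(x * y for x, y in zip(a, b))
        norm = (sum(x * x for x in a) * sum(y * y for y in b)) ** 0.5
        total += signs[k] * signs[k + 1] * dot / norm
    return total / (NUM_BLOCKS - 1)


def detect(received, signs):
    """True when adjacent blocks resemble each other more than chance would allow."""
    return self_correlation_score(received, signs) > THRESHOLD


rng = random.Random(1)
host = [rng.gauss(0.0, 1.0) for _ in range(BLOCK * NUM_BLOCKS)]  # stand-in for audio
pattern, signs = make_key(seed=42)
echo = [1.0, 0.6, 0.3, -0.2, 0.1]   # made-up impulse response

marked = room(embed(host, pattern, signs), echo)
clean = room(host, echo)
print(detect(marked, signs), detect(clean, signs))
```

Because every copy of the pattern is smeared by the same simulated echo, the copies still match each other after the channel, which is the property the quoted researcher describes. The detector here only needs the sign sequence, never the pattern itself.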
"Audio content that Alexa plays -- music, audiobooks, podcasts, radio broadcasts, movies -- could be watermarked on the fly," explains Amazon's blog post. It argues that this could be useful "so that Alexa-enabled devices can better gauge room reverberation and filter out echoes."

- - Re: Bezos says theres no air shortage (Score:1)
    
    by Anonymous Coward writes:
    
    You misgendered her. Misgendering a normal person isn't a crime, but because she is a trannie, you have committed high treason against the internet. I hereby sentence you to Barbara Hudson for life, without the possibility of escape.
- priorities (Score:2)
  
  by astrofurter ( 5464356 ) writes:
  
  We could have devoted our research efforts to curing disease. Or reducing pollution. Or improving energy efficiency. Or even building better killer robots.
  But no. We spent millions in research money on... Preventing unauthorized enjoyment of music.
  Capitalism FTW! Fuck you, plebs, that's why!
- Re: (Score:1)
  
  by Anonymous Coward writes:
  
  It is a perfectly cromulent word.
  - Re: Robustifying? (Score:1)
    
    by Anonymous Coward writes:
    
    Does it embiggen even the smallest man?
- Re: (Score:1)
  
  by Tablizer ( 95088 ) writes:
  
  It's a Bush-ism. Seems people miss semi-normal Presidents a lot these days.
And I give it ten minutes till its "hacked" (Score:1)

by Joviex ( 976416 ) writes:

Add low-intensity, imperceptible white noise to stream, destroy any chance of any detection. Profit.
- Re: (Score:2, Informative)
  
  by Anonymous Coward writes:
  
  Audio watermarking in the video industry (Cinavia) is still not cracked. It can be partially defeated by smearing the audio with pitch shifting, but you lose quite a large amount of signal doing so. Now with audio this can be semi-acceptable.
  It has yet to be properly defeated by finding the signal and correcting just that signal.
  https://en.wikipedia.org/wiki/... [wikipedia.org]
  - Re:And I give it ten minutes till its "hacked" (Score:4, Interesting)
    
    by KiloByte ( 825081 ) writes: on Sunday March 31, 2019 @12:40PM (#58361646)
    
    Obtain two distinct copies of the audio, diff them. Anything not common to both copies is either watermarking, noise, or compression artifacts -- and you want none of that.
    
    - - Re: And I give it ten minutes till its "hacked" (Score:2)
        
        by K. S. Kyosuke ( 729550 ) writes:
        
        That only works if one of the copies is the original unwatermarked one. But usually, you'll only have access to watermarked copies.
        I think the point was precisely that you'd compare two different watermarked versions.
  - Re: (Score:2)
    
    by silverkniveshotmail. ( 713965 ) writes:
    
    I'm confused about cinavia. What does it actually do? Why can I rip blu-rays without worrying about it?
    - Non-BD-certified players ignore Cinavia (Score:2)
      
      by WoodstockJeff ( 568111 ) writes:
      
      If a player is certified BD-compliant, it will look for, and respect, Cinavia.
      The rest of the world doesn't. So, if you rip a BD disk and play it in, say, Kodi or VLC, Cinavia doesn't matter. Play it on an Oppo player, though, and you may get a surprise.
  - Re:And I give it ten minutes till its "hacked" (Score:5, Interesting)
    
    by TheRaven64 ( 641858 ) writes: on Sunday March 31, 2019 @01:05PM (#58361722) Journal
    
    Your claim, precisely as stated, appears to be true but, per your link, that doesn't mean that the watermarking hasn't been broken in other ways. In fact, citation 16 regarding DVD-Ranger CinEx appears to do precisely that: detect the signal and then remove it.
    The Amazon technique sounds like exactly the same crap that you get from a lot of machine-learning researchers doing security work: they don't think about an adaptive adversary. There's an entire field of adversarial machine learning that works by training a machine-learning system on the inputs and outputs of another: if you can train a neural network to insert and recognise these watermarks, can you train another one to recognise and remove them? If you haven't even tried that, it's likely that an attacker will be able to.
    
  - Re: (Score:2)
    
    by K. S. Kyosuke ( 729550 ) writes:
    
    If the watermarks are different in different streams, for example (for identification of origin?), how about combining multiple streams?
  - Re: (Score:2)
    
    by swilver ( 617741 ) writes:
    
    Add a few thousand watermarks of the same type, and see if they can still detect it.
- Re: And I give it ten minutes till its "hacked" (Score:1)
  
  by Cmdln Daco ( 1183119 ) writes:
  
  I would say spread low cost 'watermark' emitters widely, so the watermark sound is present everywhere. A low cost emitter could be a common accessory, so that every concert, bar, and public gathering place would have a few humming along. Maybe even an emitter app people could run on their mobile device.
I see the problem... (Score:1)

by Anonymous Coward writes:

Their algorithm embeds the noise pattern in the audio signal multiple times and compares it to itself.
Somehow I doubt (Score:5, Informative)

by Kernel Kurtz ( 182424 ) writes: on Sunday March 31, 2019 @11:00AM (#58361282)

they are going to make the watermark "more detectable in noisy environments" without making it more detectable to the listener. There is already much discussion of how current watermarks are commonly audible in otherwise high fidelity music files.
https://www.mattmontag.com/mus... [mattmontag.com]
Of course, nobody is ever going to use Alexa for anything remotely related to hifi, but this is certainly not something we would want to see spread anywhere else.

- - Re: (Score:2)
    
    by CaseCrash ( 1120869 ) writes:
    
    Who uses Alexa period? End thread here...
    Shit tons of people? We don't all wear tinfoil hats.
- Why do you need to ID music in noisy environments? (Score:2)
  
  by Solandri ( 704621 ) writes:
  
  I don't see this as a problem that needs solving. If there's a copyrighted soundtrack playing in a noisy environment, then quite obviously the music (1) is secondary, tertiary, or non-essential to whatever else is going on in the video and thus not a copyright violation, and (2) is not a reproduction someone wanting an illegal copy of the copyrighted work would be interested in. So there's zero reason for a copyright holder to even want to detect it. It would result in stupid things like people getting a
- Re: (Score:2)
  
  by Kogun ( 170504 ) writes:
  
  Of course, nobody is ever going to use Alexa for anything remotely related to hifi, but this is certainly not something we would want to see spread anywhere else.
  Allow me to introduce you to the phrase "works with Alexa", already featured in TVs and A/V receivers.
  - Re: (Score:2)
    
    by Kernel Kurtz ( 182424 ) writes:
    
    Allow me to introduce you to the phrase "works with Alexa", already featured in TVs and A/V receivers.
    I don't think that really changes my premise at all.
Why? (Score:3)

by fluffernutter ( 1411889 ) writes: on Sunday March 31, 2019 @12:09PM (#58361530)

Why would anyone buy a device that might refuse to play any damn sound file you point it at? As if I needed another reason not to buy one of these things.

- - Re: (Score:2)
    
    by fluffernutter ( 1411889 ) writes:
    
    2004 called and wants to sell you an MP3 player with a headphone jack.
You keep using this word "Problem" (Score:5, Insightful)

by UnknownSoldier ( 67820 ) writes: on Sunday March 31, 2019 @12:12PM (#58361544)

Reading the article title ... Audio Watermarking Algorithm Is First to Solve "Second-Screen Problem" in Real Time
... I immediately see what the problem is. They keep using the word "problem" when it isn't one.
/sarcasm Ah, good old greed to start labeling everything as a "problem"! It must be the same idiots who think "Piracy is a Problem".
Here's a clue stick. Instead of treating symptoms how about addressing the cause, namely:
a) availability (lack of legal availability), and
b) price (due to expensive licensing)
because Piracy "solves" those two problems. Treating the symptom, audio watermarking, is not going to stop people from sharing music. Content sharing is called free advertising -- or am I in "violation" because the rest of my family can listen to my music even though only I paid for it? If you don't want people to share it, then don't release it. Real simple.
Maybe it is time to bring back Sneaker Net ?

- - Re: (Score:1)
    
    by Anonymous Coward writes:
    
    fuck you. Stop spreading lies. It's not theft since nothing is stolen, it's copyright infringement.
  - Re: (Score:2)
    
    by Okind ( 556066 ) writes:
    
    1) if I own the song, video, book, etc, and I choose not to sell it or make it available, it is my personal property and fuck you.
    2) I am legally and ethically allowed to set any price I want. Your stolen sound track to the Harry Potter movies has yet to be determined to cure cancer. If I own the soundtrack and want to charge you a gazillion dollars to listen to it then fuck you. You do not have any rights to my Harry Potter movie sound track. If you dont have a gazillion dollars then you dont get to listen to it. Fuck you.
    [...]
    You have no ethical grounds or argument.
    Yes we do: it's called the Universal Declaration of Human Rights. Specifically article 27, which states that “everyone has the right freely to participate in the cultural life of the community, to enjoy the arts and to share in scientific advancement and its benefits.”
    Whether you like it or not, you and everyone else are not allowed to abuse copyrights to deny someone the option to "enjoy the arts and to share in scientific advancement and its benefits". So if you ever publish your copyrighted w
    - Re: (Score:2)
      
      by piojo ( 995934 ) writes:
      
      Yes we do: it's called the Universal Declaration of Human Rights. Specifically article 27, which states that “everyone has the right freely to participate in the cultural life of the community, to enjoy the arts and to share in scientific advancement and its benefits...
      But it does mean that rights holders, once they publish their work, are required to make it available at a price all members of the community can pay.
      That seems like a big leap. First, one may participate in culture without being able to participate in all aspects. Would you advocate that a concert in a small venue should be priced so everyone can attend? But more importantly, having the right to something doesn't obligate others to provide it to you. Everyone has the right to a happy love life, but that doesn't mean anybody has to sleep with you!
  - Re: (Score:2)
    
    by WolfgangVL ( 3494585 ) writes:
    
    Actually, we DO have a right to what is supposed to be ALL PEOPLES content.
    We had a deal, and for a long time both sides walked away happy. Then in 1976, the deal changed, and many works of culture that had been earmarked for public domain instead remained locked behind a price tag. Then in 1998, the same thing happened, and another 20 years of works was STOLEN from the public.
    (STOLEN, as in we don't have it anymore)
    Did you know that this year is the first year that any works have fallen into the public dom
- Re: You keep using this word "Problem" (Score:2)
  
  by sound+vision ( 884283 ) writes:
  
  "Availability and price" have been taken care of already, your critiques are a few years behind. What the industry didn't understand a decade ago, they do now. Whatever song you desire, you can stream it on YouTube for free with an ad. Or you can pay $10/mo or whatever Spotify is and get it without ads. The anti-piracy efforts are about wrangling up the stragglers who haven't fully moved over to this system yet. The remaining problem of ownership is easily ignored. People don't own their homes, cars, or cel
Working in a noisy environment (Score:4, Interesting)

by gnasher719 ( 869701 ) writes: on Sunday March 31, 2019 @01:01PM (#58361702)

I read the headline and thought it was about making Alexa work nicely in an environment where it plays loud music.

Apple's HomePod does that very nicely. Instead of adding a watermark, it compares the signal entering its microphones with the signal leaving the speakers, so if you have loud music playing through your HomePod, it can eliminate that music almost completely before it starts speech recognition.

Next, if some person in the room says "Hey, Siri", it analyses the voice of the person saying the words, and eliminates what anyone else in the room is saying. Apple published a paper about this, and has some demos somewhere. One is very loud music in a room with many people talking. Phase 1 eliminates music, leaving many people talking and a bit of white noise. Phase 2 eliminates the voices of anyone except the person saying "Hey, Siri" and what's left is one perfectly recognisable voice, plus a bit more white noise. So "Hey Siri" works with loud music as long as it is played by the HomePod, and lots of people talking. What Amazon is planning here, on the other hand, doesn't seem to be something that any of the customers buying Alexa is asking for.
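
The first phase described above, removing the device's own playback from the microphone signal, is commonly done with an adaptive echo canceller. The following is a minimal sketch using a textbook normalized LMS (NLMS) filter. It is an assumption for illustration, not Apple's published method, and the function name `nlms_cancel`, the tap count, the step size, and the `echo_path` values are all made up for the demo.

```python
import math
import random


def nlms_cancel(reference, microphone, taps=8, step=0.5, eps=1e-6):
    """Subtract an adaptively filtered copy of `reference` from `microphone`.

    Returns the residual signal (what is left after the estimated echo is removed).
    """
    weights = [0.0] * taps
    history = [0.0] * taps          # most recent reference samples, newest first
    residual = []
    for ref, mic in zip(reference, microphone):
        history = [ref] + history[:-1]
        estimate = sum(w * x for w, x in zip(weights, history))
        error = mic - estimate      # microphone minus predicted echo
        power = sum(x * x for x in history) + eps
        weights = [w + step * error * x / power for w, x in zip(weights, history)]
        residual.append(error)
    return residual


def power(xs):
    """Mean squared value of a sequence."""
    return sum(x * x for x in xs) / len(xs)


rng = random.Random(7)
n = 4000
music = [rng.gauss(0.0, 1.0) for _ in range(n)]       # what the speaker plays
echo_path = [0.8, 0.4, -0.2, 0.1]                      # made-up room response
echo = [sum(h * music[t - d] for d, h in enumerate(echo_path) if t - d >= 0)
        for t in range(n)]
voice = [0.3 * math.sin(2 * math.pi * 0.01 * t) for t in range(n)]  # stand-in talker
mic = [e + v for e, v in zip(echo, voice)]

cleaned = nlms_cancel(music, mic, step=0.2)

tail = slice(n // 2, n)                                # judge only after convergence
leftover = power([c - v for c, v in zip(cleaned[tail], voice[tail])])
print(power(echo[tail]), leftover)
```

The two printed numbers compare the echo power at the microphone with the error that remains once the filter has adapted. This approach only works because the device knows exactly what it is playing, which is why it cannot help with audio coming from a separate TV.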

It's about making media NOT trigger 'Alexa' (Score:4, Insightful)

by Miamicanes ( 730264 ) writes: on Sunday March 31, 2019 @01:40PM (#58361832)

I think Amazon's motive is to come up with a way for a home theater amp, TV, media device, etc. to watermark its audio output in a way that tells a listening Alexa-implementing device, "don't be triggered by THIS SPECIFIC audio" (so every time someone on TV says, "Alexa" the device WON'T be triggered).
The catch is, the device needs a way to distinguish between "media audio" (that should NOT trigger it) and people in the room (who should ALWAYS be able to trigger it, even while watching a TV show or movie with the 'ignore me' watermark).
It has to be something that a device on the consumer end can add, because remastering a century's worth of media to add it at the content-producer's end just plain isn't going to happen.
Amazon is painfully aware of the "TV triggered Alexa" problem. It's not just annoying, it's a real potential vulnerability (mitigated mostly by the fact that buying radio & TV ads is both expensive & non-anonymous, so an ad that INTENTIONALLY tried to exploit it would get the advertiser sued). They don't want to just overlay a "dumb" "ignore everything for {n}ms" ultrasonic tone burst, because THAT could be abused as well (say, by advertisers who wanted to prevent an Alexa-controlled device from accepting commands from ANYONE during the ad). So... it needs to be:
* specific to media being played in the presence of an Alexa-implementing device
* able to be injected at the consumer end, and something that could cheaply be added to something like a blu-ray player (ideally, lightweight enough to implement as a firmware update to existing players).
* NOT affect verbal commands from humans in the room.
Incidentally, I believe Amazon initially considered trying to use Cinavia for this purpose (since it's already present in many movies), but quickly realized it would cause more problems than it solved. Cinavia was designed to robustly (and indiscriminately) scream, "stop recording!", not "ahem... please don't attempt speech-recognition on THIS SPECIFIC audio". If Echo ignored 'Alexa' for {n} seconds after recognizing a Cinavia watermark, mere playback of Cinavia-watermarked content within listening range would effectively disable the use of 'Alexa' entirely for those {n} seconds. Ergo, Amazon had to come up with something better.
To wit, this is NOT about imposing DRM. It's about preventing media content from triggering the device by having someone on-screen say 'Alexa', by giving the device a way to distinguish BETWEEN media content and local users.

- What about other humans? (Score:2)
  
  by DogDude ( 805747 ) writes:
  
  Well, that's nice and all for the "TV", but it won't do anything to stop other humans (like me) from doing things with strangers' Alexa gadgets. I have a neighbor that routinely leaves his "smart speaker" on too loudly. I just shut it off through the door. If he continues, I'll just start ordering large tubs of lube and rubbers for him.
  
  Also, when I go into somebody's house, I'll ask them to turn off any of their recording devices. Just to make sure, I will also try to order some escorts via a smart speak
Alexa the copyright cop (Score:1)

by Anonymous Coward writes:

I want Alexa verifying that I've paid for what I'm listening to. I also want Alexa verifying that I'm renting all my music and video from Amazon.
This is yet another reason why I don't have Alexa, Siri, or Google's thing in my house or car.
I suspect that this will be mandatory soon, much like the BBC license in the UK.
- Re: Alexa the copyright cop (Score:2)
  
  by sound+vision ( 884283 ) writes:
  
  At least the BBC licenses gave them Top Gear. This just gives you ads, and possibly invites strangers into your home.
better room signature detection? (Score:2)

by RhettLivingston ( 544140 ) writes:

can better gauge room reverberation and filter out echoes
For some reason, I read that statement and immediately felt that the speaker was more likely to be thinking about how much more accurately they can measure the room for many other reasons. I guess I'm getting more cynical.
Rooms have audio signatures. Those signatures are altered by how many people are in the room and where they are at. How much more information will they now be able to gain about the room and its contents?
- Re: better room signature detection? (Score:1)
  
  by Cmdln Daco ( 1183119 ) writes:
  
  Will Alexa be able to identify that it is in a brown paper bag in a trashbin? How accurate will it be at identifying the sound of a spring-loaded center punch releasing into its charging port? Will it initiate a timer to count down to its battery death?
What they're not saying (Score:2)

by sjames ( 1099 ) writes:

If you're going to listen to or watch anything that may be of "questionable" provenance, better put ear muffs on your Alexa first or she might tattle.
Robistify is not English (Score:2)

by sandbagger ( 654585 ) writes:

Please sue your grade school for malpractice.
this sounds like too good to be true (Score:2)

by mapkinase ( 958129 ) writes:

Even on all other 364 days of the year
