AI Technology

AI Hears Your Anger in 1.2 Seconds (venturebeat.com) 51

MIT Media Lab spinoff Affectiva's neural network, SoundNet, can classify anger from audio data in as little as 1.2 seconds regardless of the speaker's language -- just over the time it takes for humans to perceive anger. From a report: Affectiva's researchers describe it ("Transfer Learning From Sound Representations For Anger Detection in Speech") in a newly published paper [PDF] on the preprint server arXiv.org. It builds on the company's wide-ranging efforts to establish emotional profiles from both speech and facial data, which this year spawned an in-car AI system, co-developed with Nuance, that detects signs of driver fatigue from camera feeds. In December 2017, the company launched its Speech API, which uses voice to recognize things like laughter, anger, and other emotions, along with voice volume, tone, speed, and pauses.

SoundNet consists of a convolutional neural network -- a type of neural network commonly applied to analyzing visual imagery -- trained on a video dataset. To get it to recognize anger in speech, the team first sourced a large amount of general audio data -- two million videos, or just over a year's worth -- with ground truth produced by another model. Then, they fine-tuned it with a smaller dataset, IEMOCAP, containing 12 hours of annotated audiovisual emotion data including video, speech, and text transcriptions.
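The two-stage recipe described above -- pretrain on a large corpus whose labels come from another model, then fine-tune on a small hand-annotated set -- can be sketched in miniature. The sketch below stands in a toy logistic-regression classifier on random features for the CNN and real audio; every name and number here is illustrative, not Affectiva's.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(X, y, w, lr=0.1, epochs=200):
    # Plain logistic-regression gradient descent.
    for _ in range(epochs):
        grad = X.T @ (sigmoid(X @ w) - y) / len(y)
        w = w - lr * grad
    return w

# Stage 1: large corpus with "ground truth" produced by a teacher model
# (simulated here by a fixed linear scorer -- no hand labels involved).
X_big = rng.normal(size=(2000, 8))
teacher = rng.normal(size=8)
y_teacher = (X_big @ teacher > 0).astype(float)  # pseudo-labels
w = train(X_big, y_teacher, np.zeros(8))

# Stage 2: fine-tune on a small hand-labelled set (IEMOCAP plays this
# role in the paper; here, slightly noisy versions of the true labels).
X_small = rng.normal(size=(100, 8))
y_hand = (X_small @ teacher + 0.1 * rng.normal(size=100) > 0).astype(float)
w = train(X_small, y_hand, w, lr=0.05, epochs=100)

acc = ((sigmoid(X_small @ w) > 0.5) == y_hand).mean()
print(f"fine-tuned accuracy on the small set: {acc:.2f}")
```

The point of the second stage is that the small, high-quality set only has to nudge an already-decent model, rather than teach it from scratch.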

This discussion has been archived. No new comments can be posted.


Comments Filter:
  • by Zorro ( 15797 ) on Friday February 08, 2019 @10:38AM (#58089266)

    Anytime Cortana or Siri pops up and gets in the way, there will be anger!

    • by syn3rg ( 530741 )
      Fracking toasters...
    • by Anonymous Coward

      Every time I get one of those worthless automated "assistants" instead of live customer support, there will be anger. I guess the next step is having the automated "assistant" (it's not AI, sorry) determine that the reason I'm angry is the automated "assistant" itself, always programmed to offer only simplistic choices that have nothing to do with my issue, wasting my time instead of getting a live person on the line. If the issue were as simple as the ones proffered by the automated "assistant"...

    • This conversation can serve no further useful purpose. Goodbye.

  • 1.2 seconds sounds kinda long for anger detection

    I can detect anger in someone's voice practically immediately, even before they've finished the first word, because as a human I use a number of other cues, e.g. facial contortion, body positioning, finger pointing, etc.

    1.2 seconds to detect a change in pitch, volume, etc. seems too long, and I think that's the overall problem with artificial intelligence or machine learning -- they're great for massive data sets that have common patterns (or used to build
    • The idea is they detect when you're angry and move you along to a rep faster. Yes, this means savvy folks call in already angry, but honestly, if you're savvy and being forced into an IVR, you're probably already angry anyway, since you're calling for a rep to do something you couldn't do online.
      • IVRs are one of those antiquated things that I can't believe still exist. Worse are the 'voice recognition' ones. I have an English accent but live in Canada, and I have all but given up on those travesties.
    • I can detect it even quicker than that. After being out all hours of the night drinking with my friends, I already know that my girlfriend is going to be angry before I even talk to her. It's like the pre-cogs from Minority Report or something.
    • Q: How are you with detection without visual clues? Like over an average poor cell phone connection with lots of latency. Visual contact is, IMHO, an extremely important source of perception for humans. I suppose that's why we're such crappy internet communicators.
  • that I have a resting bitchy voice. Especially when not talking to a human that speaks English.

  • " the team first sourced a large amount of general audio data ... with ground truth produced by another model."

    So, actually, the program wasn't detecting anger. The program was modelling what a different program detected in the signal.

    • by ceoyoyo ( 59147 )

      You neglected the next part, where they fine-tuned it using hand-labelled data. If you're training a system that learns (and that includes people) and you've got an automatic system that performs okay, it's often a good idea to do a first round of training on the automatic results. Then you come along with a smaller, higher-quality training set to boost performance over what the existing automatic system can do.

      And yes, the term "ground truth" is usually used in a stupid way.

  • My wife can see the future, knows what will happen to me, and I know when she's angry, and it's a lot faster than 1.2 seconds. In fact, 1.2s is what it takes my wife to hit me or give me that death-glare stare.
  • I can write AI:

    if volume_now > volume_before * 1.5:
        print("ANGRY!")
  • Here is my new view on AI: I think there are a bunch of people out there in industries that did not previously work with computers. Now they are applying common programming tests to variables that mean something in their world and it seems so magical that they call it AI.
  • by xxxJonBoyxxx ( 565205 ) on Friday February 08, 2019 @11:00AM (#58089394)
    If I'm on a call with an automated phone tree (and I'm sufficiently alone), I often let loose a string of angry "old man" profanity while it's listening, just to see if I get auto-routed to an agent. Hasn't happened too often, but it happens (most often with airlines/credit cards).
  • Unless Germans are angry 100% of the time and it's hard-coded in the logic.
  • Now they can sense the anger you have towards them for whatever reasons and say "No, please don't throw me out the window" while you are throwing it out the window (or smashing it with a hammer).

  • by sdinfoserv ( 1793266 ) on Friday February 08, 2019 @11:41AM (#58089602)
    Stop calling everything a computer does "AI". 15 years ago, in 2004, I was an IT Director at a large call center that did both inbound (skills-based routing) and outbound (predictive dialing). One of the features of our telephone switch back then was real-time monitoring that could detect when someone was getting agitated or used a "bad" word (like swearing). When pre-specified thresholds were reached or certain words were used, the system would call a supervisor and allow the supervisor to "ghost" (listen but not be heard), "whisper" (coach the agent without being heard by the caller), or take over the call. The terminology 15 years ago was real time monitoring with language recognition heuristics. It worked great then, it was commercially available, and it wasn't "AI"...
    • Everything that gets done with neural networks is called "AI" these days. That's how it is, so get used to it; nobody is going to change it just because a few people keep calling "Stop". It's even somewhat apropos: it works based on (sometimes failed) training, and pretty much nobody can adequately explain how, kind of like natural intelligence in humans.
    • by ceoyoyo ( 59147 )

      You've identified the difference: "The terminology 15 years ago was real time monitoring with language recognition heuristics."

      Heuristics are a set of rules used for decision making. In the context of algorithms, those heuristics are designed by a human and programmed into the system.

      "AI" is a nonspecific term, but if it means anything it means a system that learns from experience. Specifically, it does not use preprogrammed heuristics.

  • Did they test Bruce Banner? He's always angry. Hulk Smash!

    Also, does it detect passive-aggressive anger? What if I yell "I LOVE YOU" at a pet, vs. whisper "I'm going to put you in the microwave and set it on high for 4 minutes, ohh yes I am, such a bad doggie you are"? What is the algorithm keying on: volume, facial expressions, changes in skin tone, words spoken? And all they did was get close to what a human can do. Come on, I thought computers were faster. Get it down to 0.000001s and I'll be impressed.
  • Wells Fargo customer service reps these past few weeks just assume everyone is pissed at them, especially yesterday and today.

    [stewie] Where's my money?! *WHAM!* Where's my money?! [/stewie]

  • can classify anger from audio data in as little as 1.2 seconds regardless of the speaker's language

    It was just a coincidence that the German speakers were angry 100% of the time ...

  • Can it tell the difference between a raised voice because of excitement or strong feelings about a matter and a voice raised in actual anger?
  • Not possible. If someone screams "Fuuuuuuuck!!!" at the top of their lungs, there is no way AI can distinguish whether it's anger, pain, frustration, surprise, or even joy, because the source signal may be identical for all of them. At best, this system is detecting high arousal and possibly unpleasant mood.

  • Too bad it can't smell the fart in its general direction.

  • "SUPPORT. HELP. HUMAN. OPERATOR. GET ME A FUCKING HUMAN BEING YOU GODDAMN PIECE OF SHIT! "

    processing... processing... processing... anger detected 37% probability

    (im not yelling slashdot im not yelling... ok i am but its on purpose let this post go through...)
