AI Technology

AI Hears Your Anger in 1.2 Seconds (venturebeat.com) 51

MIT Media Lab spinoff Affectiva's neural network, SoundNet, can classify anger from audio data in as little as 1.2 seconds regardless of the speaker's language -- just over the time it takes for humans to perceive anger. From a report: Affectiva's researchers describe it ("Transfer Learning From Sound Representations For Anger Detection in Speech") in a newly published paper [PDF] on the preprint server arXiv.org. It builds on the company's wide-ranging efforts to establish emotional profiles from both speech and facial data, which this year spawned an in-car AI system, co-developed with Nuance, that detects signs of driver fatigue from camera feeds. In December 2017, the company launched its Speech API, which uses voice to recognize things like laughter, anger, and other emotions, along with voice volume, tone, speed, and pauses.

SoundNet consists of a convolutional neural network -- a type of neural network commonly applied to analyzing visual imagery -- trained on a video dataset. To get it to recognize anger in speech, the team first sourced a large amount of general audio data -- two million videos, or just over a year's worth -- with ground truth produced by another model. Then, they fine-tuned it with a smaller dataset, IEMOCAP, containing 12 hours of annotated audiovisual emotion data including video, speech, and text transcriptions.
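The two-stage recipe described above -- pretrain on a large corpus whose labels come from another model, then fine-tune on a small hand-annotated set -- can be sketched in miniature. The sketch below stands in a toy logistic-regression classifier on random features for the CNN and real audio; every name and number here is illustrative, not Affectiva's.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(X, y, w, lr=0.1, epochs=200):
    # Plain logistic-regression gradient descent.
    for _ in range(epochs):
        grad = X.T @ (sigmoid(X @ w) - y) / len(y)
        w = w - lr * grad
    return w

# Stage 1: large corpus with "ground truth" produced by a teacher model
# (simulated here by a fixed linear scorer -- no hand labels involved).
X_big = rng.normal(size=(2000, 8))
teacher = rng.normal(size=8)
y_teacher = (X_big @ teacher > 0).astype(float)  # pseudo-labels
w = train(X_big, y_teacher, np.zeros(8))

# Stage 2: fine-tune on a small hand-labelled set (IEMOCAP plays this
# role in the paper; here, slightly noisy versions of the true labels).
X_small = rng.normal(size=(100, 8))
y_hand = (X_small @ teacher + 0.1 * rng.normal(size=100) > 0).astype(float)
w = train(X_small, y_hand, w, lr=0.05, epochs=100)

acc = ((sigmoid(X_small @ w) > 0.5) == y_hand).mean()
print(f"fine-tuned accuracy on the small set: {acc:.2f}")
```

The point of the second stage is that the small, high-quality set only has to nudge an already-decent model, rather than teach it from scratch.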

This discussion has been archived. No new comments can be posted.


Comments Filter:
  • by Zorro ( 15797 ) on Friday February 08, 2019 @10:38AM (#58089266)

    Anytime Cortana or Siri pops up and gets in the way, there will be anger!

    • by syn3rg ( 530741 )
      Fracking toasters...
    • by Anonymous Coward

      Every time I get one of those worthless automated "assistants" instead of live customer support, there will be anger. I guess the next step is having the automated "assistant" (it's not AI, sorry) determine that the reason I'm angry is the automated "assistant" itself, always programmed to offer only simplistic choices that have nothing to do with my issue, wasting my time instead of getting a live person on the line. If the issue were as simple as the ones proffered by the automated "assistant"...

    • This conversation can serve no further useful purpose. Goodbye.

  • 1.2 seconds sounds kinda long for anger detection

    I can detect anger in someone's voice practically immediately, even before they've finished the first word, because as a human I use a number of other cues, e.g. facial contortion, body positioning, finger pointing, etc.

    1.2 seconds to detect a change in pitch, volume, etc. seems too long, and I think that's the overall problem with artificial intelligence or machine learning -- they're great for massive data sets that have common patterns (or used to build
    • The idea is they detect when you're angry and move you along to a rep faster. Yes, this means savvy folks call in already angry, but honestly, if you're savvy and being forced into an IVR, you're probably already angry anyway, since you're calling for a rep to do something you couldn't do online.
      • IVRs are one of those antiquated things that I can't believe still exist. Worse are the 'voice recognition' ones. I have an English accent but live in Canada, and I have all but given up on those travesties.
    • I can detect it even quicker than that. After being out all hours of the night drinking with my friends, I already know that my girlfriend is going to be angry before I even talk to her. It's like the pre-cogs from Minority Report or something.
    • Q: How are you with detection without visual clues? Like over an average poor cell phone connection with lots of latency. Visual contact is, IMHO, an extremely important source of perception for humans. I suppose that's why we're such crappy internet communicators.
  • that I have a resting bitchy voice. Especially when not talking to a human that speaks English.

  • " the team first sourced a large amount of general audio data ... with ground truth produced by another model."

    So, actually, the program wasn't detecting anger. The program was modelling what a different program detected in the signal.

    • by ceoyoyo ( 59147 )

      You neglected the next part, where they fine-tuned it using hand-labelled data. If you're training a system that learns (and that includes people) and you've got an automatic system that performs okay, it's often a good idea to do a first round of training on the automatic results. Then you come along with a smaller, higher-quality training set to boost performance over what the existing automatic system can do.

      And yes, the term "ground truth" is usually used in a stupid way.

  • My wife can see the future, knows what will happen to me, and I know when she's angry, and it's a lot faster than 1.2 seconds. In fact, 1.2s is what it takes my wife to hit me or give me that death-glare stare.
  • I can write AI:

    if volume_now > volume_before * 1.5:
        print("ANGRY!")
  • Here is my new view on AI: I think there are a bunch of people out there in industries that did not previously work with computers. Now they are applying common programming tests to variables that mean something in their world and it seems so magical that they call it AI.
  • by xxxJonBoyxxx ( 565205 ) on Friday February 08, 2019 @11:00AM (#58089394)
    If I'm on a call with an automated phone tree (and I'm sufficiently alone), I often let loose a string of angry "old man" profanity while it's listening, just to see if I get auto-routed to an agent. Hasn't happened too often, but it happens (most often with airlines/credit cards).
  • Unless Germans are angry 100% of the time and it's hard-coded in the logic.
  • Now they can sense the anger you have towards them for whatever reasons and say "No, please don't throw me out the window" while you are throwing it out the window (or smashing it with a hammer).

  • by sdinfoserv ( 1793266 ) on Friday February 08, 2019 @11:41AM (#58089602)
    Stop calling everything a computer does "AI". 15 years ago, in 2004, I was an IT Director at a large call center that did both inbound (skills-based routing) and outbound (predictive dialing). One of the features of our telephone switch back then was real-time monitoring that could detect when someone was getting agitated or used a "bad" word (like swearing). When pre-specified thresholds were reached or certain words were used, the system would call a supervisor and allow the supervisor to "ghost" (listen but not be heard), "whisper" (coach the agent without being heard by the caller), or take over the call. The terminology 15 years ago was real time monitoring with language recognition heuristics. It worked great then, it was commercially available, and it wasn't "AI"...
    • Everything that gets done with neural networks is called "AI" these days. That's how it is, so get used to it; nobody is going to change it just because a few people keep calling "Stop". It's even somewhat apropos: it works based on (sometimes failed) training, and pretty much nobody can adequately explain how, kind of like natural intelligence in humans.
    • by ceoyoyo ( 59147 )

      You've identified the difference: "The terminology 15 years ago was real time monitoring with language recognition heuristics."

      Heuristics are a set of rules used for decision making. In the context of algorithms, those heuristics are designed by a human and programmed into the system.

      "AI" is a nonspecific term, but if it means anything it means a system that learns from experience. Specifically, it does not use preprogrammed heuristics.

  • Did they test Bruce Banner? He's always angry. Hulk Smash!

    Also, does it detect passive-aggressive anger? What if I yell "I LOVE YOU" at a pet, vs. whisper "I'm going to put you in the microwave and set it on high for 4 minutes, ohh yes I am, such a bad doggie you are"? What is the algorithm keying on: volume, facial expressions, changes in skin tone, words spoken? And all they did was get close to what a human can do. Come on, I thought computers were faster. Get it down to 0.000001s and I'll be impressed.
  • Wells Fargo customer service reps these past few weeks just assume everyone is pissed at them, especially yesterday and today.

    [stewie] Where's my money?! *WHAM!* Where's my money?! [/stewie]

  • can classify anger from audio data in as little as 1.2 seconds regardless of the speaker's language

    It was just a coincidence that the German speakers were angry 100% of the time ...

  • Can it tell the difference between a raised voice because of excitement or strong feelings about a matter and a voice raised in actual anger?
  • Not possible. If someone screams "Fuuuuuuuck!!!" at the top of their lungs, there is no way AI can distinguish whether it's anger, pain, frustration, surprise, or even joy, because the source signal may be identical for all of them. At best, this system is detecting high arousal and possibly unpleasant mood.

  • Too bad it can't smell the fart in its general direction.

  • "SUPPORT. HELP. HUMAN. OPERATOR. GET ME A FUCKING HUMAN BEING YOU GODDAMN PIECE OF SHIT! "

    processing... processing... processing... anger detected 37% probability

    (im not yelling slashdot im not yelling... ok i am but its on purpose let this post go through...)
