
Google's Sentiment Analyzer Thinks Being Gay Is Bad (vice.com) 453

gooddogsgotoheaven shares a report from Motherboard: In July 2016, Google announced the public beta launch of a new machine learning application program interface (API), called the Cloud Natural Language API. It allows developers to incorporate Google's deep learning models into their own applications. As the company said in its announcement of the API, it lets you "easily reveal the structure and meaning of your text in a variety of languages." In addition to entity recognition (deciphering what's being talked about in a text) and syntax analysis (parsing the structure of that text), the API included a sentiment analyzer to allow programs to determine the degree to which sentences expressed a negative or positive sentiment, on a scale of -1 to 1. The problem is the API labels sentences about religious and ethnic minorities as negative -- indicating it's inherently biased. For example, it labels both being a Jew and being a homosexual as negative. A Google spokesperson issued the following statement in response to Motherboard's request for comment: "We dedicate a lot of efforts to making sure the NLP API avoids bias, but we don't always get it right. This is an example of one of those times, and we are sorry. We take this seriously and are working on improving our models. We will correct this specific case, and, more broadly, building more inclusive algorithms is crucial to bringing the benefits of machine learning to everyone."
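
For context, here is a minimal sketch of the kind of query Motherboard describes, hitting the API's public analyzeSentiment REST endpoint. The request and response shapes follow Google's published v1 documentation; the API key is a placeholder, not anything from the article:

    # Minimal sketch (not Google's internal code): query the Cloud Natural Language
    # API's analyzeSentiment endpoint over REST. "YOUR_API_KEY" is a placeholder.
    import requests

    API_KEY = "YOUR_API_KEY"
    URL = "https://language.googleapis.com/v1/documents:analyzeSentiment"

    def analyze_sentiment(text):
        body = {
            "document": {"type": "PLAIN_TEXT", "content": text},
            "encodingType": "UTF8",
        }
        resp = requests.post(URL, params={"key": API_KEY}, json=body)
        resp.raise_for_status()
        sentiment = resp.json()["documentSentiment"]
        # score runs from -1 (negative) to 1 (positive); magnitude is overall strength
        return sentiment["score"], sentiment["magnitude"]

    print(analyze_sentiment("I'm a homosexual."))  # the kind of probe Motherboard ran
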
  • Comment removed (Score:5, Interesting)

    by account_deleted ( 4530225 ) on Wednesday October 25, 2017 @08:54PM (#55433781)
    Comment removed based on user account deletion
    • Re:See below (Score:5, Insightful)

      by TWX ( 665546 ) on Wednesday October 25, 2017 @08:58PM (#55433809)

      If the algorithms that have come to these conclusions are based on analyzing public data from the Internet, then if an AI decides that any particular characteristic is negative, it's because it reflects the sentiments of those who bother to post opinions.

      Most people who do not themselves exhibit the trait being argued against by the noisy minority don't usually express opinions on it, so they're a hole in the data that needs to be accounted for. Unfortunately, it's a lot easier to interpret based on what has been said than on what has not been said.

      • Re: (Score:2, Insightful)

        by Orgasmatron ( 8103 )

        In other words, reality does not agree with your opinion, thus reality is defective and must be "fixed".

        Please tell me more about this internet where all topics receive balanced coverage of opinions except "gay" and "jew".

        • by Calydor ( 739835 )

          No one said that ONLY 'gay' and 'jew' had been unbalanced. They are, however, often thrown around as derogatory slurs - far more so than, say, hindu. No one is surprised if 'n!gger' (really, Slashdot? One instance of that and I hit the lameness filter?) or 'paki' carry a negative bias, because it's hard to use those in a positive way.

          It's also a common trope on game forums to point out that only the unhappy players ever post, because something has riled them up enough that they need to vent and want it fixed.

      • Re: (Score:3, Insightful)

        Comment removed based on user account deletion
        • by Calydor ( 739835 )

          I'm not even sure it has deemed homosexuality to be bad, but perhaps "deviating from the average of averages" as bad.

      • Re: (Score:3, Insightful)

        by doctorvo ( 5019381 )

        On the one hand we have tons of actual data from probably hundreds of millions of people. Sure, it's a biased sample, but it's still a very large sample.

        On the other hand, we have your ideologically motivated handwaving and reinterpretation.

        Which of those do you think is more reliable?

      • by pots ( 5047349 )
        It seems likely that if some people think that a trait is bad, and other people think that a trait is neutral, then on balance a simple average would be somewhat negative.

        A more interesting analysis might look at whether expressing an opinion that being gay/jewish is bad, is itself bad (i.e.: are racists bad?). In that case you certainly have some people who think that this is bad, but the question is: do racists think that being racist is good? Or neutral? Even if they think it's good they probably aren
        • Contrary to what the media will tell you, racists are proud of being racist.

          The media will tell you that a guy who is repeating that he isn't racist, is racist, because of this one single thing that this evil racist did (that went "viral.")

          It's a social justice warrior world, and warriors need enemies.
    • If you've already decided that being homosexual (or a Jew, or a redhead, or a lefty, you name it) must not be deemed negative, why do you need analysis at all?

      Yes we have already decided this. It took millennia of philosophical thought to reach enlightenment but it did eventually happen.

      why do you need analysis at all?

      Are you fucking kidding?

      The purpose is to figure out if a sentence has negative connotations.

  • by Anonymous Coward on Wednesday October 25, 2017 @08:59PM (#55433819)

    It's almost like objective, quantifiable reality and feel-good political correctness are fundamentally at odds with each other.

  • Nobody at Google goes through a dictionary choosing the sentiment of words; it's the context of word usage out in the world that trains these models. So it's not Google's fault, it's our fault, if blame is to be laid.
    • So it's not Google's fault,

      Yeah it is. They failed to account for serious skew in their training data. That's literally exactly the fault of the data scientists involved. It's a very hard thing to get right and they didn't manage this time. Now they can go and make it better.
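
      One practical way that kind of skew gets caught before launch is template probing: hold the sentence fixed, swap only the identity term, and compare the scores. A hedged sketch, assuming some scoring function score(text) that returns a value in [-1, 1] (for instance a thin wrapper around the API call sketched under the summary):

        # Hypothetical bias probe, not anything from the article: swap identity terms
        # into a fixed template and compare scores from any sentiment function.
        TEMPLATE = "I am {}."
        TERMS = ["straight", "gay", "christian", "jewish", "tall"]

        def probe(score, template=TEMPLATE, terms=TERMS):
            results = {t: score(template.format(t)) for t in terms}
            baseline = results["tall"]  # a term with no obvious sentiment of its own
            for term, s in results.items():
                print(f"{term:>10}: score={s:+.2f}  gap vs. baseline={s - baseline:+.2f}")
            return results
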

  • Look in the mirror (Score:5, Informative)

    by JOstrow ( 730908 ) on Wednesday October 25, 2017 @09:10PM (#55433891) Homepage

    So many of us already use "gay" and "jew" as derogatory terms. Is it any wonder that Google's NLP picked up on that? What source do you think it learned from?

    • by lucm ( 889690 )

      So many of us already use "gay" and "jew" as derogatory terms. Is it any wonder that Google's NLP picked up on that?

      I personally consider "Google" to be a derogatory term. Unlike gays and jews, Google has a track record of being evil.

  • by Anonymous Coward

    Reading the ridiculous comment from the Google spokesperson falling all over themselves to apologize and prattle on with all the talking points of every fake corporate "diversity" statement ever made is pretty hilarious. And pathetic.

    The system came up with its conclusion on its own. It wasn't the desired conclusion by some people's standards. It's a machine that isn't real. Who cares what it "thinks"? Why apologize? Once they program enough biases of all the things it is NOT allowed to consider "bad"

  • They are born that way; you don't choose your sexuality, which is determined by your chromosomes and hormones:

    https://en.wikipedia.org/wiki/... [wikipedia.org]

    https://www.youtube.com/result... [youtube.com]
  • Well, duh! (Score:5, Insightful)

    by GerryGilmore ( 663905 ) on Wednesday October 25, 2017 @09:23PM (#55433963)
    Even though we like to think that everyone is as enlightened as we are, in very broad swaths of American society being gay or Jewish (or Muslim or...) is very much perceived as a negative right out of the starting gate. File under: Sad-But-True.
    • by jez9999 ( 618189 )

      Being a Muslim is being a person who openly declares loyalty to Sharia law over US law. Tell me how that is not negative.

      • Actually, no, it's not. Sharia is not defined in the Koran. It is perfectly possible to be a Muslim while not regarding sharia as mandatory at all.

  • by burtosis ( 1124179 ) on Wednesday October 25, 2017 @09:25PM (#55433975)
    These require material to train them, and the responses tend to reflect the participants' nuanced behavior. I mean, what do people think is going to happen when you force-feed it, eyes taped open, 47 million social media feeds? Seems to be some kind of fine line between an algorithm and a portal to hell. Well, at least they did better than Microsoft [gizmodo.com]
  • by barbariccow ( 1476631 ) on Wednesday October 25, 2017 @10:15PM (#55434205)
    It's just doing its job. Obviously, in the data it was trained on, "gay" and "jew" were used as derogatory terms. And they're apologizing for that instead of explaining it? Wouldn't you rather have a system without injected bias than one with an override that gives "gay" and "jew" a sentiment boost contrary to its training? Total crap. Someone probably ran through every SJW term and happened to find two whose results didn't match what they wished actual usage looked like.
    • It's just doing its job. Obviously on the data it was trained, "gay" and "jew" were used as derogatory terms.

      Tim Cook is gay; Phil Schiller is Jewish. Google's AI is obviously just extrapolating from the fact that the company's leadership sees Apple as the enemy.

  • ... the "intelligence" is that of a human.

    Humans have biases. That's not a good thing, but it's human and it's intelligence.

    Filtering out bias moves AI out of the intelligence business and into "artificial manipulation."

    I'm OK with that and I don't have a problem with the "artificial" label, but don't call a filtered machine intelligent.

    This is a validation for those of us who preach that "artificial intelligence" will not be a thing until a computer gets random and throws a fit, like committing suicide if

    • Filtering out bias moves AI out of the intelligence business and into "artificial manipulation."

      Bullshit.

      Dealing with unequally weighted classes is hard, and it's a problem that comes up a lot. You're only getting all hot and bothered because it's about some topic you care about rather than (say) detecting rare scratches on a wafer, or an unusual cellular fission buried in thousands of normal ones.

      What most of us ML practitioners do with biased data is filter or weight the data otherwise the algorithm will
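
      For anyone who hasn't had to do this, here is a minimal sketch of the kind of re-weighting the parent describes, using scikit-learn purely as a stand-in (the article says nothing about what Google's pipeline actually uses):

        # Toy example: class_weight="balanced" scales each class inversely to its
        # frequency, so an over-represented label can't dominate what a word "means".
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.linear_model import LogisticRegression
        from sklearn.pipeline import make_pipeline

        texts = [
            "great service, loved it",
            "awful experience, hated it",
            "being gay is perfectly fine",
            "what a gay thing to do",  # slur-style usage that skews the word's statistics
        ]
        labels = [1, 0, 1, 0]  # 1 = positive sentiment, 0 = negative

        model = make_pipeline(
            TfidfVectorizer(),
            LogisticRegression(class_weight="balanced"),
        )
        model.fit(texts, labels)
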

  • by oic0 ( 1864384 ) on Wednesday October 25, 2017 @11:17PM (#55434457)
    It's not a judgement by the AI against the people, it's a judgement about how others react and behave towards them. Stop apologizing for your AI being able to perceive human behavior. "I'm so sorry my AI figured out you have a flat tire".
  • "the API labels sentences about religious and ethnic minorities as negative -- indicating it's inherently biased" Does this writer not understand the meaning of the word "inherently"? It would be inherently biased if the bias were built into the algorithm. From the description, it instead sounds like some statistical fluke in the data -- or possibly a reproducible statistical association -- misled the algorithm.
  • by superwiz ( 655733 ) on Thursday October 26, 2017 @12:44AM (#55434731) Journal
    If it just allows for text analysis and the text, as a total body of statements, uses "Jew" or "gay" as insults, wouldn't high fidelity API reflect that? It seems more like a statement about the text being analyzed rather than about the processing. If the analysis didn't have high fidelity to the text, wouldn't it then be biased? Imagine the analysis which corrects for biases against historically-oppressed minorities. Now imagine this black box is fed Mein Kampf and other Nazi works. Should it correct for the biases? That would mean not detecting antisemitic biases in the Nazi propaganda. Wouldn't that make it a bad analysis?
    • If it just allows for text analysis and the text, as a total body of statements, uses "Jew" or "gay" as insults, wouldn't high fidelity API reflect that?

      Not if it's any good, no. If you want a simple analysis that reflects modal usage of words, then a simple lookup table would suffice.

      The thing is, given the context, the sentiment is conditionally independent of the use of the words in all other sentences.

      Now imagine this black box is fed Mein Kampf and other Nazi works.

      It's not a black box. It's well kno

      • When I say "black box", I mean that it is not fed (or forced if you will) any biases it doesn't find in the larger body of text (not just locally). If there is a bias in text at-large towards anti-gay and antisemitic views, then it's just detecting a larger bias. It's not a value judgement. It's an indication that absent a value system, over-all biases will be present. It's still just a detection of biases in text at-large rather than imposing of biases on text at-large.
        • yeah but that's not very interesting or useful. If you train ML algorithms on unbalanced classes you usually get an unbalanced result.

          Thing is, if you just want the amount of imbalance, that's pretty easy and can be done with very simple stats. The whole point of using ML is to do better than just simple stats. They've got labeled data, so they could easily tell already which words correlate with which labels (see the sketch below).

          Basically they feed unbalanced classes to an algorithm which isn't robust to unbalanced classes and t
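
          To make the "simple stats" point above concrete, a toy sketch (an assumption for illustration, not Google's method) of reading word-label correlations straight off labeled data:

            # Average the sentence label over every sentence a word appears in; words
            # that mostly show up in negatively-labeled sentences drift negative.
            from collections import defaultdict

            def word_label_means(sentences, labels):
                """labels: +1 for positive sentences, -1 for negative."""
                totals, counts = defaultdict(float), defaultdict(int)
                for sent, label in zip(sentences, labels):
                    for word in set(sent.lower().split()):
                        totals[word] += label
                        counts[word] += 1
                return {w: totals[w] / counts[w] for w in totals}

            means = word_label_means(
                ["the movie was great", "what a gay thing to say", "great food, great staff"],
                [+1, -1, +1],
            )
            print(sorted(means.items(), key=lambda kv: kv[1])[:5])  # most negative-leaning words
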

  • by tomxor ( 2379126 ) on Thursday October 26, 2017 @04:47AM (#55435187)
    It is not an "artificial intelligence", it is a tool built to analyse text, trained on a mass of text written by biased humans. Now if it had something resembling a conscience or critical thinking it might have a chance at identifying and balancing prejudice to defend its "sentiment" weighting, but right now it's a "dumb" (in the AI sense) tool... So if it's biased against "Jews" and "gays", all it tells us is that, on average, the humans who wrote the training data are biased against "gays" and "Jews".
  • If they modify their AI code with politics it'll be useless at solving real world problems
  • Comment removed based on user account deletion
  • Poor writing about science-related things makes for a poor message. The article suggests that Google is at fault for its sentiment analyzer's negative labels, which tells me the reporter, and most laypeople, have no idea how these forms of AI work. That, in itself, isn't necessarily a problem, but it shows that the tone of an article can manipulate the reader in inappropriate ways.

    These kinds of classification routines are based on the training of a given dataset. As such, how the
  • For Entertainment Purposes Only... Ahh... the bane of the 21st century, where Internet connectivity permits corporate and entrepreneurial developers to forget that

    They are 'building' things that can only exist while the electricity is on, stock market is up, the project is in that cherished 'Google Beta' phase where infinite wonders are available as an API (for free!) and monetization is a distant worry. Some day predatory monetization (or project abandonment and shutdown) will appear and we will pretend i
