AI Technology

The Algorithms That Detect Hate Speech Online Are Biased Against Black People (vox.com) 328

An anonymous reader shares a report: Platforms like Facebook, YouTube, and Twitter are banking on developing artificial intelligence technology to help stop the spread of hateful speech on their networks. The idea is that complex algorithms that use natural language processing will flag racist or violent speech faster and better than human beings possibly can. Doing this effectively is more urgent than ever in light of recent mass shootings and violence linked to hate speech online. But two new studies show that AI trained to identify hate speech may actually end up amplifying racial bias. In one study [PDF], researchers found that leading AI models for processing hate speech were one-and-a-half times more likely to flag tweets as offensive or hateful when they were written by African Americans, and 2.2 times more likely to flag tweets written in African American English (which is commonly spoken by black people in the US). Another study [PDF] found similar widespread evidence of racial bias against black speech in five widely used academic data sets for studying hate speech that totaled around 155,800 Twitter posts.
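The 1.5x and 2.2x figures are ratios of flag rates between groups of tweets. As a rough illustration of how such a disparity metric is computed (a minimal sketch; the function names and data below are hypothetical, not from the papers):

    # Hypothetical sketch: measuring flag-rate disparity between two
    # groups of tweets. The classifier and data are placeholders, not
    # the models or corpora used in the studies.

    def flag_rate(tweets, classify):
        """Fraction of tweets the classifier marks offensive/hateful."""
        return sum(1 for t in tweets if classify(t)) / len(tweets)

    def disparity(group_a, group_b, classify):
        """Flag-rate ratio; above 1.0 means group_a is flagged more
        often than group_b by the same classifier."""
        return flag_rate(group_a, classify) / flag_rate(group_b, classify)

    # Usage (hypothetical names): aae_tweets and other_tweets are lists
    # of strings, is_offensive is any str -> bool classifier.
    #   disparity(aae_tweets, other_tweets, is_offensive)  # e.g. 2.2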

This is in large part because what is considered offensive depends on social context. Terms that are slurs when used in some settings -- like the "n-word" or "queer" -- may not be in others. But algorithms -- and the content moderators who grade the training data that teaches these algorithms how to do their job -- don't usually know the context of the comments they're reviewing. Both papers, presented at a prestigious annual computational linguistics conference, show how natural language processing AI -- which is often proposed as a tool to objectively identify offensive language -- can amplify the same biases that human beings have. They also show that the data feeding these algorithms has baked-in bias from the start.
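To see why context blindness produces these errors, it helps to picture the bluntest possible filter. Real moderation systems are learned classifiers rather than keyword lists, but a model trained on labels assigned without social context behaves much the same way. A minimal sketch (the lexicon is an illustrative stand-in; the second sentence is the study example commenters cite below):

    # Deliberately naive sketch of a context-blind filter. The lexicon
    # is an illustrative stand-in; real systems learn weights from
    # human-labeled examples, but inherit the same context blindness.

    OFFENSIVE_LEXICON = {"queer", "ass"}

    def is_flagged(tweet: str) -> bool:
        tokens = (t.strip(".,!?") for t in tweet.lower().split())
        return any(t in OFFENSIVE_LEXICON for t in tokens)

    print(is_flagged("I saw him yesterday"))      # False
    print(is_flagged("I saw his ass yesterday"))  # True: the same
    # sentence in African American English phrasing gets flagged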

  • Corrected headline (Score:5, Insightful)

    by Brett Buck ( 811747 ) on Friday August 16, 2019 @03:30PM (#59094846)

    Hate Speech Algorithms do not recognize double standards.

    Fixed that for you.

    • by RobotRunAmok ( 595286 ) on Friday August 16, 2019 @03:48PM (#59094948)

      The programmer trains the AI to see the N-word as hate speech. Fair enough. Then the AI reads the so-called "black twitter" where everyone refers to everyone else as the N-word, and the AI tracks it all as hate speech. Also fair, it's a goddam computer, not a socio-linguist. Ditto the Q-word for the so-called "LGBTQ twitter." Fair, and fair.

      Hysterical. But fair.

      • by cayenne8 ( 626475 ) on Friday August 16, 2019 @05:20PM (#59095244) Homepage Journal
        How about we just go back to the idea that there is no such thing as "hate speech"....there is only speech.

        And that everyone has the right to say what they please, as long as it isn't against the law (slander, libel) or directly incites violence.....and everything shy of those special cases is allowed.

        And if you don't like what you're hearing, you can quit listening or go elsewhere.....and grow a bit thicker skin and quit being a snowflake that has to be protected from words.

        Sticks and stones....remember the old saying?

      • by elrous0 ( 869638 ) on Friday August 16, 2019 @05:24PM (#59095256)

        Computers just aren't built to handle the complex mental gymnastics that humans have to do to negotiate the often quite bizarre and nonsensical world of social interaction in a modern western world where the wrong phrase or subtle sentiment, no matter how innocently expressed, can ruin your life. It's very hard for most humans to even understand the "rules."

      • Is It Fair? (Score:4, Insightful)

        by Kunedog ( 1033226 ) on Friday August 16, 2019 @05:43PM (#59095320)

        The programmer trains the AI to see the N-word as hate speech. Fair enough. Then the AI reads the so-called "black twitter" where everyone refers to everyone else as the N-word, and the AI tracks it all as hate speech. Also fair, it's a goddam computer, not a socio-linguist. Ditto the Q-word for the so-called "LGBTQ twitter." Fair, and fair. Hysterical. But fair.

        Unless terms like "cracker" or "redneck" are flagged, it would seem to be (deliberately) biased in favor of black people.

      • by AmiMoJo ( 196126 )

        It's more subtle than that. The first page of TFS gives the example of "I saw him yesterday" vs. the African American English equivalent "I saw his ass yesterday". Apparently the word "ass" triggers the AI.

        Note that African American English (AAE) is recognized as a dialect and actually has its own complex rules etc, much like other dialects such as Southern American English or Jamaican English or Scottish English.

        I wonder what it would make of Scottish English. It's pretty much the ultimate test for any spe

    • by OzPeter ( 195038 )

      Hate Speech Algorithms do not recognize double standards.

      Fixed that for you.

      So using "bastard" as a term of endearment (as allowed for use in Australian english) should actually be considered a double standard?

      • Well it will have to guess their language then. For example, TFS makes reference to a language I've never heard of, called African American English, which may include, based on the topic, frequent use of words that are considered vulgar in regular English. So if a white guy makes a lot of racial slurs on Twitter, then the AI will have to conclude that he speaks the "African American English" language.

        Though this ridiculousness went too far a decade ago. How did that go... "I've never seen a t

    • They'll Fix It (Score:5, Insightful)

      by Kunedog ( 1033226 ) on Friday August 16, 2019 @03:53PM (#59094968)
      See, the flaw is that the current algorithm flags racist speech, no matter what color the speaker is. I'm sure future versions of the anti-racism bot will be programmed/taught to value skin color above all else.
      • The problem is there's no such thing as "racist speech". Speech can be used as a tool for racism, certainly. But there is no word or series of words that is intrinsically racist.

        The word "boy" is one of the most commonly known and widely used examples of "racist speech".

        Human beings can't even distinguish racism from carelessness online. No algorithm can do this.

      • by AmiMoJo ( 196126 )

        Not skin colour. As the study points out, it's the dialect of English being used that is the issue.

        The very first page uses the example of "I saw his ass yesterday". There are white people who talk like that.

        The study isn't claiming racism, it's claiming bias against African American English speakers, who may not actually be African American themselves.

    • So...politicians can't be replaced by AI anytime soon!!! I don't know if that is comforting or a problem.
  • I can't believe I have to say this, but context is a thing. That's why Tokyo Breakfast [youtube.com] is hilarious and this [youtube.com] is not.

      The former is a parody of Asians taking on the mannerisms without understanding them, while the latter is just someone looking down on people to make themselves feel better.

      Context Matters.
  • by Viol8 ( 599362 ) on Friday August 16, 2019 @03:30PM (#59094850) Homepage

    A far larger proportion of rap than of any other musical genre is misogynistic, aggressive, preening bullshit that advocates a cartoonish lifestyle of violence that a lot of kids are copying. Rap music no longer reflects the streets; it dictates what happens on them. Terrorism aside, you never hear of shootings or stabbings at rock, EDM, jazz, classical or any other type of gigs/concerts, but they're ten a penny at rap gigs.

    • Oh please, you're being ridiculous.

      You speak like someone who doesn't interact with any kids and hasn't gone to any rap concerts. Maybe you yell at them to get off your lawn?

      Yea, just like all the extremely violent TV shows, movies, and cartoons are causing people to become violent. Don't forget books and comics.

      I'd rather blame bad parenting than whatever entertainment is popular. Of course, parents often have a hard time believing they're responsible for how their little angels act.

    • by AmiMoJo ( 196126 )

      Couldn't immediately find statistics to back that up, but I did find a very recent study that points out that pop music actually has as much violence in it: https://phys.org/news/2019-03-... [phys.org]

      Then again rap is associated with poverty so it's quite likely that it does correlate with violence too. Of course correlation is not causation. Just like playing Mortal Kombat doesn't make people rip each other's spines out, listening to rap music probably doesn't make them stab each other.

  • From VOX? Really? (Score:5, Interesting)

    by Tempest_2084 ( 605915 ) on Friday August 16, 2019 @03:36PM (#59094886)
    Are we really using Vox as a news source these days? Sigh.
    • The writers have an unashamed political bias, sure, but they don't skimp on the fact-checking. I like WaPo for the same reason.

      • by liquid_schwartz ( 530085 ) on Friday August 16, 2019 @04:25PM (#59095082)

        The writers have an unashamed political bias, sure, but they don't skimp on the fact-checking. I like WaPo for the same reason.

        Perhaps the most devious lie is to just use facts that have been cherry picked while omitting any other viewpoint or facts that might go against your narrative. Lying by omission or telling incomplete truths is extremely common for hot button topics. Much like conflating statistics for legal immigrants and illegal immigrants.

        • Much like conflating statistics for legal immigrants and illegal immigrants.

          Or conflating legal asylum seekers with illegal immigrants.

  • Comment removed (Score:5, Insightful)

    by account_deleted ( 4530225 ) on Friday August 16, 2019 @03:37PM (#59094894)
    Comment removed based on user account deletion
    • And it's funny that the same ones that are complaining about "The Man" are acting like "The Man" -- Lockups are for convicts
    • by AmiMoJo ( 196126 )

      No, as the study you didn't read points out, it's not about race. It's about the dialect of English you speak, and it just happens that the one many black people use comes off worse. That dialect is not exclusive to black people though.

  • I'm talking about "the N word". Maybe this will encourage people of a certain race to stop throwing around so casually a word that's forbidden to everybody else -- Lockups are for convicts
  • You don't say? If you classify the N-word as "hate speech", some black people use it in every other sentence just for dramatic effect.

  • Did they attempt to correct for education level? I am going to bet that so-called African-American English (wait, those wingnuts gave up on Ebonics?) will average out to a much lower education level. In turn, I am willing to bet the racial bias largely disappears when you compare based on education / grade level.

    Many adults with a 3rd grade reading and writing level probably look racist to an AI. Because they are.
  • The training data came from humans, and it's so far been impossible to scrub the biases out; even if you try to hide the user's race from the AI, it'll pick it up from proxy factors (see the sketch at the end of this thread).

    • This is in large part because what is considered offensive depends on social context. Terms that are slurs when used in some settings -- like the "n-word" or "queer" -- may not be in others.

      Thing is, the "algorithm" needs to take into account BOTH the speaker (and the social context they communicate in) AND the listener (and the social context they operate in). In other words the speaker may not feel what they're saying is hateful, but the listener might. That's a harder problem than screening just one or the other.

    • "It's so far been impossible to scrub the biases out, even if you try to hide the user's race from the AI". Not to worry, I'm sure if they keep working they'll manage to massage the data sufficiently to get the answer they want.
  • by WaffleMonster ( 969671 ) on Friday August 16, 2019 @04:33PM (#59095114)

    This particular problem is quite easy to solve.

    Fire the thought police, disable the filters, and stop censoring people. The problem with nonsense on social media has nothing to do with a lack of censorship.

    It has everything to do with poor governance that actively rewards, amplifies and encourages the proliferation of nonsense in order to maximize profit.

  • by blindseer ( 891256 ) <blindseer@@@earthlink...net> on Friday August 16, 2019 @04:41PM (#59095140)

    The reason an algorithm detects more bias from a certain race might simply be that that race has a tendency toward more bias.

    Just because we find some trend along racial lines does not mean that there is automatically some kind of racism inherent in the system. There will be trends among races, but we should still treat people as individuals. We should not excuse bigotry because someone is a member of a given race. And "correcting" the algorithm to account for race isn't "reverse racism"; it's racism.

    If there is in fact something in the algorithm that flags someone's post as "hate speech" because of one's race then that needs to be corrected. This can be done by removing any racial identification from the data, which I can only assume was done in the first place. If this still flags more hate from a given racial group then perhaps it would be logical to conclude that some races are more prone to hate speech than other races.

    Again, we need to treat individuals as individuals. If we keep lumping people together by race then we lock people into paths that were chosen by their skin color instead of their own talents, attitudes, etc.

    Treating people as individuals does require that we recognize trends among different groups, but also that an individual can fall outside those trends.

    • by GeekBoy ( 10877 )

      Well said

    • I don't disagree; however, certain words or forms of speech that are considered blatant hate speech from someone outside an ethnic group can be socially acceptable within it. That makes it very hard for a computer to discern unless it knows the ethnicity and social context of the person before evaluating. The N* word is the classic example. It must also have fun with us Australians, as we frequently use insults as friendly greetings here; was one of the hardest
    • by HiThere ( 15173 )

      Everything you said is reasonable, but the summary says the training data itself was biased.

  • by Falos ( 2905315 ) on Friday August 16, 2019 @04:55PM (#59095188)

    Humans can't properly codify "hate speech", so why would a computer?

  • WHAT A SURPRISE! (Score:5, Insightful)

    by nnull ( 1148259 ) on Friday August 16, 2019 @05:07PM (#59095210)
    The ones spewing hate speech are actually non-whites? Color me surprised!
  • AI does not discriminate against black people; according to the article, it apparently discriminates against people who talk (or write) a certain way. It sounds incredibly racist to assume that all that hate speech is written by people of a certain color.
  • Goodness, trying to identify "hateful" people by what they say and not what they mean is going to backfire? Couldn't see that coming.

    Identity politics is trying to put people into boxes, and it just doesn't work. That's why the extreme left is such a hateful group... are you a black person who lives in Chicago and doesn't like Democratic policies? Be prepared to be called a white supremacist by other white guys. Unless they figure out that you are black, in which case it's Uncle Tom.

    Judging people as individ

  • Blacks are racists too - even to each other. It doesn't surprise me it shows up in their speech. Now, of course, they will tweak their software to give the black community a free pass on its racism and hate against whites and each other according to the beliefs of hyper-liberalism.
  • The British writer Agatha Christie wrote a story called

    "Ten Little Ni66ers"

    It was based on a song (the deaths follow the song) and has the same title as the song.

    The US version changed the name to "And Then There Were None"... Other versions kept the original name, translated (for example, in French, the story is called "Dix petits nègres").

    There are many references to that word in the story... 10 small statues, the song, the name of the island... And no racism behind that story!!!

  • Wrong.

    When Afro-Americans call each other "niggahs", that IS hate speech.

    Not against Afro-Americans, but against whites.

    Think about why they started using it for each other and why they still continue doing so.

"Gravitation cannot be held responsible for people falling in love." -- Albert Einstein

Working...