
Machine Learning Can't Flag False News, New Studies Show (axios.com) 42

Current machine learning models aren't yet up to the task of distinguishing false news reports, two new papers by MIT researchers show. From a report: After different researchers showed that computers can convincingly generate made-up news stories without much human oversight, some experts hoped that the same machine-learning-based systems could be trained to detect such stories. But MIT doctoral student Tal Schuster's studies show that, while machines are great at detecting machine-generated text, they can't identify whether stories are true or false. Many automated fact-checking systems are trained using a database of true statements called Fact Extraction and Verification (FEVER). In one study, Schuster and team showed that machine learning-taught fact-checking systems struggled to handle negative statements ("Greg never said his car wasn't blue") even when they would know the positive statement was true ("Greg says his car is blue"). The problem, say the researchers, is that the database is filled with human bias. The people who created FEVER tended to write their false entries as negative statements and their true statements as positive statements -- so the computers learned to rate sentences with negative statements as false. That means the systems were solving a much easier problem than detecting fake news.
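
To make that failure mode concrete, here is a minimal sketch (not the researchers' code; the toy claims and labels below are invented) of how a bag-of-words classifier trained on FEVER-style data can learn negation words as a proxy for "false":

```python
# Toy illustration of the FEVER annotation bias (invented data, not
# the MIT studies' code): REFUTED claims were often written as
# negative statements, so the model learns "negation => false".
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

claims = [
    ("Greg says his car is blue.", "SUPPORTS"),
    ("Paris is the capital of France.", "SUPPORTS"),
    ("The Nile flows through Egypt.", "SUPPORTS"),
    ("Greg never said his car was blue.", "REFUTES"),
    ("Paris is not the capital of Spain.", "REFUTES"),
    ("The Nile does not flow through Kenya.", "REFUTES"),
]
texts, labels = zip(*claims)

vec = CountVectorizer()
clf = LogisticRegression().fit(vec.fit_transform(texts), labels)

# A negative statement gets flagged as false on the strength of the
# negation cue alone, regardless of whether it is actually true:
print(clf.predict(vec.transform(["Greg never said his car wasn't blue."])))
# -> ['REFUTES']
```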
This discussion has been archived. No new comments can be posted.


Comments Filter:
  • by SuperKendall ( 25149 ) on Thursday October 17, 2019 @02:38PM (#59319828)

    You know what would go a long, long way to stemming "Fake News"? Detecting the origin of any given writing or media.

    For instance, recently some news channel broadcast footage of a Syrian attack that turned out to be footage from a Kentucky gun range where they were having some kind of firepower demo day. They got that footage from a tweet by a Syrian political figure, whose text they probably couldn't even understand. That "story" would have been stopped cold by a computer that instantly recognized the origin of the footage and told the broadcasters where it was from before they ran it.

    It goes way beyond just detecting origins of media though, we all know a LOT of modern news is basically written for the news organizations by other vested interest groups (left and right). Wouldn't it be great to have a system that stylistically told us all just what group (or person) some particular "news" story had really come from? Then right away you could instantly perceive the bias.

    Such instant understanding of origin would take a lot of wind out of the sails of outrage stories before they got really hot.
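
    One building block for that kind of origin check already exists: perceptual hashing, the technique behind reverse-image search. A minimal difference-hash (dHash) sketch, assuming the Pillow library; the frame file names here are hypothetical:

    ```python
    # dHash sketch for matching broadcast frames against known footage.
    # File names are hypothetical; requires Pillow (pip install Pillow).
    from PIL import Image

    def dhash(path, size=8):
        """Shrink to a (size+1) x size grayscale image, then hash the
        left-vs-right brightness gradients into a 64-bit integer."""
        img = Image.open(path).convert("L").resize((size + 1, size))
        px = list(img.getdata())
        bits = 0
        for row in range(size):
            for col in range(size):
                i = row * (size + 1) + col
                bits = (bits << 1) | (px[i] > px[i + 1])
        return bits

    def hamming(a, b):
        return bin(a ^ b).count("1")

    # Hashes that differ by only a few bits are almost certainly the same
    # footage, even after re-encoding, compression noise, or watermarks.
    known = dhash("kentucky_gun_range_demo.jpg")   # library of known clips
    candidate = dhash("alleged_syria_strike.jpg")  # frame from the broadcast
    if hamming(known, candidate) <= 10:
        print("Likely recycled footage -- flag it before air time")
    ```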

    • by gurps_npc ( 621217 ) on Thursday October 17, 2019 @02:44PM (#59319848) Homepage

      That is a wise statement. In large part because most humans use that method to detect false statements.

      The problem is that some humans use "Said by Fox network" to mean false, while others use "Said by CNN" to mean false.

      • That is a wise statement. In large part because most humans use that method to detect false statements.

        The problem is that some humans use "Said by Fox network" to mean false, while others use "Said by CNN" to mean false.

        At this point I too often consider "read it on the internet" to mean "probably false".

      • I think a large part of that response stems from the fact that most of what's called news today is heavily editorialized opinion, and what people are really doing is sorting it into buckets that should be called "coincides with my world view" and "does not align with my world view," which they hastily label as real or fake news. Very little of the news media tries to present objective facts in a neutral manner; it increasingly tells you how you should feel about some fact or how that fact
        • by raymorris ( 2726007 ) on Thursday October 17, 2019 @04:22PM (#59320240) Journal

          > Even outlets that try to avoid that as best they can still face the difficult task of making sure that, even if they're reporting facts, they aren't excluding some or presenting them in a way designed to lead the people consuming that news to wrong or improper conclusions.

          I've noticed that most days Foxnews.com has an article about a cop who heroically saved a three-year-old girl, or donated a month of his salary to a battered women's shelter, or whatever. These stories are probably perfectly true, and the reporter may have done a good job on the article.

          Also, most days CNN has a "bad cop" story, frequently reaching back to something that happened a long time ago ("bad cop in 2018 shooting up for parole"). That story is also true. The reporter may have done a great job.

          WHICH stories each source chooses to report on and to highlight drastically changes the view of the world a regular reader gets. If you read or watch CNN daily, you're going to see "bad cop" stories almost every day. If you read Fox every day, you're going to see "good cop" stories almost every day. Eventually that has to affect how you view cops.

      • some humans use "Said by Fox network" to mean false, while others use "Said by CNN" to mean false.

        True, but it should be noted that the first group is much better informed than the second.
        https://www.businessinsider.co... [businessinsider.com]

        They found that someone who watched only Fox News would be expected to answer 1.04 domestic questions correctly compared to 1.22 for those who watched no news at all.

    • "That "story" would have been stopped cold with a computer"

      Back in 'the day', this was done by an 'editor'. Seems they are malfunctioning now. Any guess as to the cause of the malfunction in this instance?

      • Back in 'the day', this was done by an 'editor'. Seems they are malfunctioning now.

        Bingo. Sorely missing now. Example: all of the "Russians supplied the Buk for MH17" videos. Why? They show blazing sunshine, the best day to take your Buk for a spin and show it to all the locals. http://www.youtube.com/watch?v... [youtube.com]

        The reality was overcast, drizzle, followed by a torrential downpour: https://www.fagain.co.uk/node/... [fagain.co.uk]

        In fact, the aircraft was rerouted twice in the 10 minutes before it was shot down due to bad weather: https://en.wikipedia.org/wiki/... [wikipedia.org]

        There are no editors any more and the

    • You know what would go a long, long way to stemming "Fake News"? Detecting the origin of any given writing or media.

      Possible, and being done now, but rather expensive: forensic linguistics.

      Language can be analysed and expressed mathematically, so you can feed it into a correlation engine. Further, because people are lazy and preserve original elements when retelling something, this can be done even after a story has been retold; a toy sketch of the idea follows below.

      Russians have been doing that for ages by the way.
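
      For the curious, a toy version of that correlation engine might compare character n-gram profiles, a standard stylometry feature; all texts and source names below are invented:

      ```python
      # Toy stylometric correlation: character n-gram TF-IDF profiles
      # compared by cosine similarity. Texts and source names are invented.
      from sklearn.feature_extraction.text import TfidfVectorizer
      from sklearn.metrics.pairwise import cosine_similarity

      known_sources = {
          "think_tank_A": "The administration must act decisively; hesitation carries grave costs.",
          "pr_firm_B": "Industry leaders agree that innovative solutions will empower consumers.",
      }
      story = "Experts agree the administration must act, for hesitation carries costs."

      vec = TfidfVectorizer(analyzer="char", ngram_range=(3, 5))
      profiles = vec.fit_transform(list(known_sources.values()) + [story])

      # Last row is the unattributed story; compare it to each known profile.
      scores = cosine_similarity(profiles[-1], profiles[:-1])[0]
      for name, score in zip(known_sources, scores):
          print(f"{name}: {score:.2f}")  # highest score = best stylistic match
      ```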

  • by account_deleted ( 4530225 ) on Thursday October 17, 2019 @02:38PM (#59319830)
    Comment removed based on user account deletion
    • "Greg never said his car wasn't blue" is not the negative version of "Greg said his car was blue". It actually has a lot of different meanings:

      It's actually stupider than that. The alleged positive version was "Greg says his car is blue." Note that this changes the verb from past tense ("said") to present ("says", i.e. currently says and continues to say), AND the car from "was blue" to "is blue".

    • Also "Greg never said his car wasn't blue" does not claim a statement by Greg.

      "Greg says his car is blue" does.

  • Total BS (Score:4, Informative)

    by lorinc ( 2470890 ) on Thursday October 17, 2019 @02:39PM (#59319832) Homepage Journal

    First, the link in the summary is not even related to the story.

    Second, if you say "ML can't do this," you'd better have a formal theorem with a solid proof. Otherwise, it just reads "Oh no! I'm not qualified enough to train a proper model on a proper dataset and I get crap accuracy!"

    I bet the original article is much more cautious than "ML can't flag fake news".

    • What if machine learning was pointed at this story? If the ML believes this story is true, then it has to believe that it is false.....MIND BLOWN
    • by Calydor ( 739835 )

      Machine learning needs much, much, MUCH more computing power to do this. Arguably it would also need a perfect worldwide surveillance system to do it - otherwise it has no way of knowing whether the story about what just happened in Syria (using that as an example since it's a hotspot right now) is true or false.

      Yes, some stories are posted to be deliberately false. Some are exaggerated. Some are extrapolated and guessed at from incomplete information. Some are posted by journalists who honestly believe the

      • How do you expect a computer to reliably figure out what IS true without, well, knowing literally everything that happens on the planet at all times?

        You can't and don't.

        You gaslight the rubes into believing you have such a system and then you simply use algorithms to suppress your political/ideological opponents and promote your side while labeling anyone who doubts the system as dangerous extremist conspiracy-nut wackjobs who are probably violent terrorists because the computer said so.

        Strat

    • I would say that people seem to be pretty bad at spotting fake news, or at avoiding falling for some kind of scam in a general sense. Right now machine intelligence is a subset of aspects of human intelligence, and it often uses immense computational power to make up for some of its shortcomings, such that performing a massive number of calculations very quickly gives the appearance of human-like capabilities. It doesn't surprise me that a machine would do poorly with some task that humans also do poorly with.
    • You have to rebind the reference to a new value.

      Okay, I'm done being deliberately dense.

    • Even if you had a data set the size of the world, how does it know truth without a biased human training it?
    • by mlibby ( 142509 )

      Actual URL for TFA appears to be

      https://www.axios.com/machine-... [axios.com]

      Seems like a lot of the commenters here so far won't have RTFA'ed, then. No big surprise there.

  • Correct link should probably be: https://www.axios.com/machine-... [axios.com] not: https://www.axios.com/mick-mul... [axios.com] even though I'm sure that both articles are interesting.
  • This just in: new research shows that MIT researchers give up easily.

    • Yeah, it seems like just analyzing the story itself won't ever work. Instead you look at things like who's sharing it, whether there are other similar stories with the same basic facts, the "trustworthiness" of the source based on previous determinations, etc. You munge all that together and you can probably come up with pretty good results.

      E.g., if some story is only seen coming from old farts on Facebook and Infowars, I'd say there's a 100.9% chance it's fake.

      • Yeah, it seems like just analyzing the story itself won't ever work. Instead you look at things like who's sharing

        Right. Basically you reinvent PageRank, the secret sauce that made Google the behemoth it is today. (By the way, the "Page" in PageRank stands for Larry Page, not "web page.")
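
        For reference, the heart of PageRank is just a short power iteration; a minimal sketch over a made-up four-node link graph:

        ```python
        # Minimal PageRank power iteration on a hypothetical link graph.
        # Each node's score flows evenly to the nodes it links to.
        links = {0: [1, 2], 1: [2], 2: [0], 3: [2]}  # node -> nodes it links to
        n, d = len(links), 0.85                      # d is the damping factor

        rank = [1.0 / n] * n
        for _ in range(50):  # iterate until the scores settle
            new = [(1 - d) / n] * n
            for src, outs in links.items():
                for dst in outs:
                    new[dst] += d * rank[src] / len(outs)
            rank = new

        print([round(r, 3) for r in rank])  # node 2, the most linked-to, scores highest
        ```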

  • The link to Axios points to the wrong story. The correct link is:

    https://www.axios.com/machine-learning-cant-flag-false-news-55aeb82e-bcbb-4d5c-bfda-1af84c77003b.html

  • by irving47 ( 73147 ) on Thursday October 17, 2019 @03:34PM (#59320038) Homepage

    12-year-old girls are being charged with felonies for making "finger gun" gestures. Of course it's hard to filter the real news from the fake.

  • Next up... (Score:1, Flamebait)

    by PopeRatzo ( 965947 )

    If AIs are unable to identify false news, they may be the ideal Republican voters.

  • It can't work (Score:2, Insightful)

    by Zof ( 6254934 )
    We don't have a definition of reality for it to follow. If it worked based on facts, 90 percent of news stories would be called fake. Every story with "anonymous sources" and the other fabricated nonsense that unethical trash pay PR firms to spread would all get flagged.
  • It's easy, actually. Go to the online forums where all the gullible partisans, extremists, and trolls are. They usually link to spin and fake news, so one can scrape, catalog, and count such references.

    Who needs AI when you can trick humans into doing the work?
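
    A sketch of the counting step; the posts and the blocklist here are made up, and any real scraper would have to respect each site's terms of service:

    ```python
    # Tally which domains a forum's posts link to, against a
    # hand-maintained list of known fake-news domains (all invented here).
    import re
    from collections import Counter
    from urllib.parse import urlparse

    KNOWN_FAKE = {"totally-real-news.example", "patriot-truth.example"}

    posts = [
        "Wake up sheeple: https://totally-real-news.example/aliens-run-the-fed",
        "Sources: https://patriot-truth.example/expose https://www.bbc.co.uk/news",
    ]

    counts = Counter(
        urlparse(url).hostname
        for post in posts
        for url in re.findall(r"https?://\S+", post)
    )

    for domain, n in counts.most_common():
        tag = "known fake-news source" if domain in KNOWN_FAKE else "ok"
        print(f"{n:2d}  {domain}  [{tag}]")
    ```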

  • An old and frail political leader has another coughing fit.
    Is it fake news to show the video clip?
    To comment on the news?
    To make a funny meme of the symptoms from the video clips?
    To recall other health-related events?
    To note the inability to travel around more of the USA?
    And how to hide all that poor health? Call the video clips "fake news"?
    When can the good censor step in to support their side of politics and ban the reporting, images, video, memes, links, and comments?
  • I think the moral of this story is not that machine learning can't do fact checking, but rather that the FEVER database is not a good training set for ML models.
  • Non-reporting, selective reporting, and exaggerated headlines by news organizations do more to change individuals' and the public's perception of events and their likelihood of occurrence than fake stories do.
  • by Sulik ( 1849922 )
    "In all intellectual debates, both sides tend to be correct in what they affirm, and wrong in what they deny." -- John Stuart Mill
