GPT-Fabricated Scientific Papers Found on Google Scholar by Misinformation Researchers (harvard.edu)

Harvard's school of public policy publishes the Misinformation Review, a journal of peer-reviewed, scholarly articles promising "reliable, unbiased research on the prevalence, diffusion, and impact of misinformation worldwide."

This week it reported that "Academic journals, archives, and repositories are seeing an increasing number of questionable research papers clearly produced using generative AI. They are often created with widely available, general-purpose AI applications, most likely ChatGPT, and mimic scientific writing. Google Scholar easily locates and lists these questionable papers alongside reputable, quality-controlled research. Our analysis of a selection of questionable GPT-fabricated scientific papers found in Google Scholar shows that many are about applied, often controversial topics susceptible to disinformation: the environment, health, and computing."

The resulting enhanced potential for malicious manipulation of society's evidence base, particularly in politically divisive domains, is a growing concern... [T]he abundance of fabricated "studies" seeping into all areas of the research infrastructure threatens to overwhelm the scholarly communication system and jeopardize the integrity of the scientific record. A second risk lies in the increased possibility that convincingly scientific-looking content was in fact deceitfully created with AI tools and is also optimized to be retrieved by publicly available academic search engines, particularly Google Scholar. However small, this possibility and awareness of it risks undermining the basis for trust in scientific knowledge and poses serious societal risks.

"Our analysis shows that questionable and potentially manipulative GPT-fabricated papers permeate the research infrastructure and are likely to become a widespread phenomenon..." the article points out.

"Google Scholar's central position in the publicly accessible scholarly communication infrastructure, as well as its lack of standards, transparency, and accountability in terms of inclusion criteria, has potentially serious implications for public trust in science. This is likely to exacerbate the already-known potential to exploit Google Scholar for evidence hacking..."

Comments Filter:
  • The Google dilemma continues: how will they cope with bad actors and new technology, and manage harm reduction, while earning profit primarily from advertising and marketing-data revenue?

  • The hardest part of writing a paper is actually doing the research. If you just ask ChatGPT to write a paper and don't give it the results of a novel experiment, all it can do is make things up.

    If the paper is written by ChatGPT, and the paper is actually correct and accurate, then I don't see a problem with that. ChatGPT doesn't write very well, but many papers are not written very well.
    • Re:invented data (Score:4, Interesting)

      by martin-boundary ( 547041 ) on Saturday September 07, 2024 @11:12PM (#64771440)
      The big problem with Science today is the proliferation of papers. It doesn't matter if it is accurate and correct: if it isn't original or novel, then it still contributes to the information pollution just as much as if it were inaccurate or downright fantasy.

      What is needed in Science is *fewer* papers, of higher quality, that leave sufficiently large gaps that talented researchers can trivially bridge. That is by definition not something a tool like ChatGPT, which only interpolates existing knowledge and makes up the rest, can help with.

      Now if you say ChatGPT can help improve the English grammar of the paper, then I will say it doesn't matter: a sufficiently talented researcher can bridge that gap, and in so doing will be forced to think more deeply about the subject matter anyway.

      • by g01d4 ( 888748 )

        proliferation of papers

        I think the proliferation of papers has more to do with ever-increasing niche areas of research, as an ever-increasing number of authors strive for originality. Because these niches appeal to vanishingly small audiences, it's easier to sneak in some ChatGPT nonsense.

        • by Shaitan ( 22585 )

          This is largely about social science. You can slip in ChatGPT nonsense anywhere you like in these fields because they are grounded in speculation, shoddy math, and popular opinions rather than physical reality.

      • by Shaitan ( 22585 )

        "The big problem with Science today is the proliferation of papers."

        The big problem with science today is that it is infiltrated with social pseudoscience and the rise of government regimes defining a concept like 'misinformation' so they can block information they disagree with.

        Without manipulation there should be as many papers as there are things to report, no more or less.

        "That is by definition not something a tool like ChatGPT, which only interpolates existing knowledge and makes up the rest, can help

      • The big problem with Science today is the proliferation of papers. It doesn't matter if it is accurate and correct: if it isn't original or novel, then it still contributes to the information pollution just as much as if it were inaccurate or downright fantasy.

        What is needed in Science is *fewer* papers, of higher quality, that leave sufficiently large gaps that talented researchers can trivially bridge. That is by definition not something a tool like ChatGPT, which only interpolates existing knowledge and makes up the rest, can help with.

        Now if you say ChatGPT can help improve the English grammar of the paper, then I will say it doesn't matter: a sufficiently talented researcher can bridge that gap, and in so doing will be forced to think more deeply about the subject matter anyway.

        I feel like that's more a problem from the outsider perspective. Sure, those minor results don't lead to a breakthrough on their own, but those incremental steps add up to help create the bigger breakthroughs.

    • by Calydor ( 739835 )

      With how often ChatGPT is wrong about commonly known things I certainly wouldn't trust it to be right about some new, novel, and extremely esoteric research.

  • We have humans writing papers filled with pseudoscience and regurgitating long discredited ideas, notably from America's right wing to bolster their views. And GPT is sucking up this crackpot nonsense and parroting it forth. What has the world come to when research is becoming as trustworthy as what comes out of some homeless meth addict's mouth?
    • LOL. The right wing has been largely purged from American academia. The left owns it, including its many problems.
      • Since when did the public really care about this? A paper that supports their worldview is good enough for them, even if the "academic" is working from an official-sounding 'institute' run out of a storefront.
        • by Shaitan ( 22585 )

          Ironically that is the reason for the pseudoscientific social science garbage you defend getting a toehold and corrupting science in the first place.

    • by Shaitan ( 22585 )

      lol That's a good one. Of course you mean ideas only discredited by garbage pseudoscience in fields which aren't grounded in observations of physical reality in the first place. It's hilarious for someone defending those sorts of ideas to criticize ANYTHING as garbage science.

      Look at the kind of ridiculous and convoluted frameworks you've had to invent. Needing a decade of brainwashing to convince people to agree with your rationalizations, layer-by-layer, doesn't mean you are educated and enlightened, whi

  • To gauge whether this is a serious problem, we need to know if the top conferences and journals suffer from accepting these AI-generated papers. We should keep in mind that even before the advent of AI-generated papers, we already had to deal with paper mills that tried to game stats such as the h-index or paper counts per school or country (a minimal h-index sketch appears after the comment thread below). Most of these problems arose from non-top-level conferences and journals.

    If AI-generated papers (and the implication is that such papers are low quality) infect arXiv, is that a

  • As far as I'm concerned Google's search results are not to be trusted, given their track record of promoting misinformation, disinformation, malware, & scams, and of serving the interests of powerful corporations. E.g., remember when they effectively covered up critical news about the BP Gulf of Mexico oil disaster? They flooded search results with BP-sponsored PR rubbish.

    That they're now getting caught promoting misinformation & disinformation in Google Scholar comes as no surprise. Also, given the p
    • by Shaitan ( 22585 )

      Yup, and remember when they suppressed COVID-19-related information on behalf of government pressure from the organization that funded the gain-of-function research which likely led to the pandemic and was definitely trying to cover up that possibility?

      • No, I think I missed that MAGA rally. Did they all bring their guns?
        • by Shaitan ( 22585 )

          I first became aware that YouTube was censoring information related to COVID when a biohacking group I followed produced a vaccine, documented their work throughout the process for full transparency, and was then shut down by YouTube.

  • ... if fabricated papers can be published on a platform, and people get fooled by choosing to trust them, I think that's your problem, not that an LLM authored the paper.
    • The problem is that LLMs make it easier to convert vague ideas into papers. This allows the volume to be increased at little cost, with no increase in information content. I.e., it increases the noise level.

  • by argStyopa ( 232550 ) on Sunday September 08, 2024 @09:53AM (#64771934) Journal

    They were posted so someone must know, yes?
    And I presume a published paper has to have an attributed author?

    So are these people being identified, blocked, and banned from science publishing...forever?
    If a "scientist" publishes a gpt-authored paper, they should be hounded out of the field.

    Are they?

    • I would imagine so.
      In computing, publishing is mostly done by the ACM and IEEE. Cases of plagiarism and data fabrication are reported to the publisher. They maintain a list (and I believe share it between them) of authors banned for unethical authorship.
      When you run a journal or a conference that they sponsor, they share that list of banned authors.
      If a conference receives a paper from one of them, you just forward it to the publisher, who handles it (a toy sketch of this check appears after the comment thread).
      I have never served on the ethics panel, so I am not sure what the precise sta

  • I don't know what "misinformation" is. But it does seem highly likely that, as the bulk of scientific publications are in English and the bulk of the world does not speak or rite gud inglish, they would use LLM software to help them write and/or translate their papers.

    • by HiThere ( 15173 )

      I doubt that the bulk of scientific papers are in English, but the bulk that are covered by Google probably are. So you've identified one strain of the problem. There are others.

      • by Shaitan ( 22585 )

        Indeed, since English is the most universal language, it would make far more sense not only for the bulk of papers to be written in it but to stop using other languages as the primary language in academics globally. This should reduce translation errors and miscommunications drastically, as well as vastly expand the pool of readily consumed and shared science across the board for future generations.

        • by HiThere ( 15173 )

          I think China might have a few objections to that. Whether something "makes sense" as a choice depends on what your goals are, and China might be just as happy if a lot of their developments didn't rapidly leak outside the country. (Rapidly is a key word here. I'm not talking about explicit secrecy, but just a barrier that slows diffusion.)

          • by Shaitan ( 22585 )

            From what I understand, the primary reason the Chinese state promotes maintaining a China-specific language is to help them manage propaganda and information outside of science, so I'm sure it would be the same within. I could certainly see slowing down the ingestion of outside discoveries/information that conflicts with their state propaganda as a priority for the state, as well as having more opportunity to contain embarrassing errors and fake research.

            Still, that does seem like something of a compromise on the la

    • That is indeed happening. I have been reviewing papers from $notenglishspeakingcountry recently which were of way lower quality than what they used to write. I am talking about a top-tier research institution that usually writes very good papers. The last 3 I reviewed from them were almost unreadable. My guess is that they pushed the writing to an AI translator rather than writing it themselves.
      In all 3 cases I had to request a reject because I could not understand the paper due to its poor language
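
As noted in the paper-mill comment above, the h-index is easy to state and easy to game: it is the largest h such that an author has h papers with at least h citations each. The definition is standard; the citation counts below are invented purely to illustrate how bulk, cross-citing publication can inflate the metric.

```python
def h_index(citations: list[int]) -> int:
    """Largest h such that h papers have at least h citations each."""
    h = 0
    for i, c in enumerate(sorted(citations, reverse=True), start=1):
        if c >= i:
            h = i
        else:
            break
    return h

# An honest record: h = 3.
print(h_index([50, 30, 12, 2]))                  # -> 3
# Ten paper-mill papers cross-citing each other (5 citations apiece)
# lift it to h = 5 without any new research being done.
print(h_index([50, 30, 12, 2] + [5] * 10))       # -> 5
```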
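
And here is the toy sketch referenced in the ACM/IEEE comment. Mechanically, the workflow described there amounts to a publisher-shared deny list consulted at submission time, with hits forwarded upstream; the names, list contents, and sharing mechanism below are invented, since the actual process is not public.

```python
# Toy model of the publisher-shared ban list described in the comments.
# Author names and list contents are illustrative assumptions only.

BANNED_AUTHORS = {"A. Fraudster", "B. Plagiarist"}

def screen_submission(authors: list[str]) -> str:
    """Flag submissions involving banned authors for the ethics panel."""
    flagged = [a for a in authors if a in BANNED_AUTHORS]
    if flagged:
        return f"forward to publisher ethics panel: {flagged}"
    return "proceed to normal peer review"

print(screen_submission(["C. Honest", "A. Fraudster"]))
# -> forward to publisher ethics panel: ['A. Fraudster']
```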
