AI

An Entire Wikipedia That's 100% AI Hallucinations (github.com) 55

"Every link leads to an entry that does not exist yet," explains the GitHub page for a Wikipedia-like site called Halupedia. "Until you click it, at which point an LLM pretends it has always existed and writes it for you, in the deadpan register of a 19th-century scholarly press..." Every article is invented on demand. The footnotes are also lies... The hardest problem with an infinite, on-demand encyclopedia is internal contradiction... When the LLM writes an article, it is required to add a context="..." attribute on every <a> it inserts, summarising the future article it is linking to (e.g. context="19th-century clerk who formalized footnote drift, Pellbrick's mentor")... When that target article is later requested for the first time, the worker loads the accumulated hints and injects them into the system prompt as "PRIOR REFERENCES — these are CANON". The LLM is instructed that the encyclopedia is hallucinated and absurd, but it must not contradict itself.
Fast Company reports that Halupedia was created by software developer Bartlomiej Strama, who confessed in a Reddit comment that the site came about after a drunk night with a friend. "In the week since launch, he says Halupedia has amassed more than 150,000 users." Beyond indulging in silly alternate histories, what's the point of using Halupedia? Strama hinted at one larger purpose in a reply to a donor on his Buy Me a Coffee page: "Your contribution towards polluting LLM training data will surely benefit society!" he wrote.
The site is licensed as free software under the GPL-3.0 license.
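
The context-hint mechanism quoted in the summary can be sketched roughly as follows. This is a minimal illustration in Python, not Halupedia's actual code: the in-memory `hints_by_slug` store, the `record_hints` and `build_system_prompt` helpers, the regular expression, and the base prompt wording are all assumptions made for the example. Only the `context="..."` attribute and the PRIOR REFERENCES header come from the project's description.

```python
import re
from collections import defaultdict

# Hypothetical in-memory store: target slug -> hints left by articles that link to it.
hints_by_slug: dict[str, list[str]] = defaultdict(list)

# Matches <a href="/slug" context="...">. Attribute order is assumed fixed for brevity;
# a real implementation would use an HTML parser instead of a regex.
LINK_RE = re.compile(r'<a\s+href="/(?P<slug>[^"]+)"\s+context="(?P<context>[^"]*)"')


def record_hints(article_html: str) -> None:
    """Store the context hint of every link found in a freshly generated article."""
    for match in LINK_RE.finditer(article_html):
        hints_by_slug[match.group("slug")].append(match.group("context"))


def build_system_prompt(slug: str) -> str:
    """Build the system prompt for a not-yet-written article, injecting earlier hints."""
    base = (
        "You are writing an entry for a hallucinated, absurd encyclopedia. "
        "It must not contradict itself."
    )
    hints = hints_by_slug.get(slug, [])
    if not hints:
        return base
    canon = "\n".join(f"- {hint}" for hint in hints)
    # \u2014 is the dash used in the header quoted in the summary.
    return f"{base}\n\nPRIOR REFERENCES \u2014 these are CANON:\n{canon}"
```

Calling `record_hints` on each generated article and `build_system_prompt` on the first request for a linked slug reproduces the flow described above. As the Plinth Squid comment further down suggests, passing hints along only asks the model to stay consistent; nothing in a scheme like this enforces it.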

Thanks to long-time Slashdot reader schwit1 for sharing the news.

  • Cool (Score:4, Interesting)

    by liqu1d ( 4349325 ) on Saturday May 16, 2026 @06:43PM (#66146627)
    Wonder how long before it's being used to train them...
    • Also wonder how many other examples there are that aren't being advertised.

      This is some A grade poison, but also kind of an obvious thing to do.

      • I wonder how long before the poisoning kills someone or causes a minor disaster. Imagine birthing a god but making sure its mind is poisoned because your worry is IP rights.
        • Is it possible that actually there might be problems that you've never heard of that aren't just that someone is upset about copyright?

        • I wonder how long before the poisoning kills someone or causes a minor disaster. Imagine birthing a god but making sure its mind is poisoned because your worry is IP rights.

          I suppose it depends on how you define poison. If it is only the representation of facts as we know them, then satire sites like The Onion, or Babylon Bee will be equally guilty.

          We hear of AI causing someone to kill another person or themselves. I'm pretty certain that they'd find an excuse regardless. This is just another form of the Helter Skelter defense. A song about a conical sliding board, and lyrics that are nonsense, but not quite, hard as hell, and used by Manson for other, nefarious reasons.

    • Actually, that might not matter. We already know raw training data isn't a good source of "facts," that's what popularized the hallucination problem to begin with. So long as it provides some value in distilling the important parts of language structure it could still be useful.
      • So the best way to poison isn't with false facts but poorly structure sentences? I'm doing my part then. BRB vibe coding the anti-grammarly.
    • Wonder how long before it's being used to train them...

      Uh, train who/what exactly? The Hallucinator behind the curtain here seems to be doing just fine imaginating its way into existence based on TFS. What more training is needed? Like it really needs the Hunter S. Thompson module with the DMT plugin and liquid cocaine cooling.

      As far as the meatsack smoothbrains "training" themselves off this drivel, probably good for stock prices that social media has some competition. It's been rather Tik or Tok for choices lately. With crippling effect.

      • by PCM2 ( 4486 )

        Uh, train who/what exactly?

        Other AI models. AI model scrapes web, web is full of AI hallucinations, AI outputs more hallucinations based on false data, rinse and repeat.

    • It is not poison, it is just hardening training... LLMs will need sparring like boxers
  • slop. thank you, I prompted a cool page and immediately exited.
  • by drnb ( 2434720 ) on Saturday May 16, 2026 @07:09PM (#66146673)
    I'm sticking to the human generated wikipedia that is only 50% hallucinations based. :-)
  • by Anonymous Coward

    If you want AI generated nonsense all you have to do is subscribe to IETF announce.

  • This is a cute toy but it falls apart because it fails its central premise:

    The LLM is instructed that the encyclopedia is hallucinated and absurd, but it must not contradict itself.

    It does, though. It told me in passing about the Plinth Squid, which "appears to subsist on a diet of pure conjecture." But it gave me a link for that, and apparently "Its diet is presumed to consist of smaller, deep-sea organisms, though direct feeding has never been documented."

    • That's ok, because "must" is not used in the IETF/RFC sense of MUST. Simply accept the output as non-standards compliant, best effort delivery slop.
      • Unfortunately it's just too sane. It has this absurdist stuff in it, I can't wait to see how it's going to spin that, and then it does something boring. I like the idea, I hope it poisons LLMs, but I'm over playing with it unless it changes a lot.

        • I'm not sure if the concept of "generate it on the fly" is optimal for getting the poison into LLM training data. Spiders like the googlebot are pretty good at checking consistency of page data for inclusion into their index. If the spider suspects that the page served to regular users is different from what it sees, it can lead to SEO countermeasures.

          Probably best to generate the fake wiki pages on a weekly rotation.

    • The LLM is instructed that the encyclopedia is hallucinated and absurd, but it must not contradict itself.

      It does, though. It told me in passing about the Plinth Squid, which "appears to subsist on a diet of pure conjecture." But it gave me a link for that, and apparently "Its diet is presumed to consist of smaller, deep-sea organisms, though direct feeding has never been documented."

      Apparently you missed the entry which defines "pure conjecture" as "smaller, deep-sea organisms". The author probably forgot to add that link to the article you read.

    • by dougmc ( 70836 )

      smaller, deep-sea organisms ... named conjecture.

  • While the sentient textiles were supposedly simply a theory advanced in the early 20th century ( https://halupedia.com/sentient... [halupedia.com] ), in fact the trade in sentient textiles was so prevalent by the 8th century that the need to regulate it was one of the main reasons that the Ancient Europian Confederacy ( https://halupedia.com/ancient-... [halupedia.com] ) came to be. Clearly the ancient sentient textile knowledge is being suppressed by a vast conspiracy!!!
    • While the sentient textiles were supposedly simply a theory advanced in the early 20th century ( https://halupedia.com/sentient... [halupedia.com] ), in fact the trade in sentient textiles was so prevalent by the 8th century that the need to regulate it was one of the main reasons that the Ancient Europian Confederacy ( https://halupedia.com/ancient-... [halupedia.com] ) came to be. Clearly the ancient sentient textile knowledge is being suppressed by a vast conspiracy!!!

      Sentient, eh?

      * holds knife up *

      (Me) "Alright sweater-meat. Tell me the secret to invisibility cloaks, or the little black dress here gets the cutting room floor.."

  • by Anonymous Coward

    with their moderator dictators allowing only information that suits their own narrow minded worldview.
     

  • 100% is a rounded number. 100.0% is rounded. 100% means >99.5%. 100.0% means >99.95%. And so on. So when they say "100% garbage", how many zeros after the point are we worrying about? How long will it take an IBM mainframe to accurately compute the number of zeros after 100.0% that one can confidently state? Did they even say how many digits of precision that 100% actually has? And how many 'zeros after the decimal point' suffice as a 'five sigma significance'? Just wondering.
    • by Mal-2 ( 675116 )

      How do you know that's not just one significant digit, which would mean >95%? Or maybe it only has one-bit resolution and it's anything over 50%.
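
The rounding arithmetic in this thread can be checked with a short sketch. It assumes ordinary round-half-up display to a fixed number of decimal places, which neither commenter specifies, and the `lower_bound` helper is a hypothetical name introduced for the example. Under that assumption the boundary value itself also rounds to 100, so the thresholds are "at least" rather than strictly "greater than".

```python
from fractions import Fraction


def lower_bound(decimals: int) -> Fraction:
    """Smallest value that displays as 100 when rounded half-up to `decimals` places."""
    # Half of one unit in the last displayed place: 0.5, 0.05, 0.005, ...
    half_unit = Fraction(1, 2) / 10**decimals
    return 100 - half_unit


for decimals in range(3):
    print(f"100 shown with {decimals} decimal place(s) covers values from "
          f"{float(lower_bound(decimals))}")
```

Exact fractions are used so the thresholds are not blurred by binary floating-point error. Mal-2's one-significant-digit reading (95%) and one-bit reading (50%) are different rounding schemes and are not covered by this helper.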

  • I could make such a page, easily, just by asking Copilot about the maths I did for my Ph.D. Gemini is fine, Gemini _has_ been trained on the one paper I wrote with my supervisor Richard Kaye, the 'minesweeper is NP-complete guy'. So Gemini quickly works out the only literature reference in the area. Copilot, having only been trained on the 'important stuff' has no fucking clue. Given a bored mathematician asking awkward questions, it spews out 100% garbage for the rest of the chat. If it has the 'memory' fe
  • BartÃ...omiej = Bartłomiej although I'm not sure my comment will survive the recoding. It's Bartlomiej with a small / across the "l", https://en.wikipedia.org/wiki/Ł this one.
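
The garbling in the comment above is consistent with UTF-8 bytes being decoded as Windows-1252, once or twice over. That diagnosis is an assumption based on the visible characters, not something the page confirms. A minimal sketch of the effect follows, with `garble` and `repair` as hypothetical helper names.

```python
def garble(text: str) -> str:
    """Simulate mojibake: encode as UTF-8, then wrongly decode the bytes as Windows-1252."""
    return text.encode("utf-8").decode("cp1252")


def repair(text: str) -> str:
    """Undo one round of that mistake by reversing the two steps."""
    return text.encode("cp1252").decode("utf-8")


name = "Bart\u0142omiej"   # U+0142 is the Polish letter l with stroke
once = garble(name)        # the single letter becomes two unrelated characters
twice = garble(once)       # a second pass lengthens the garbage further

print(once)
print(twice)
print(repair(repair(twice)) == name)
```

The repair only works while every garbled character is still present. If a later processing step drops or substitutes characters, as appears to have happened in the comment, the original bytes can no longer be recovered automatically.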

  • 100% means 'within a statistically negligible rounding error of unity'. So 0% of it is not AI slop. Can you find any examples of stuff that isn't? I suspect they're overconfident in the level of actual AI slop. I suspect some 'non slop' might have accidentally found its way in. It always does.
  • "Your contribution towards polluting LLM training data will surely benefit society!"

    No, it won't. Yes, there are ethical and copyright issues regarding LLM training data, but this is sabotage of something that is also quite useful to society.

    We need to have a conversation about the ethical and social implications of AI. This ain't it. This is just edgelord shit hurling.

  • Prescient story about the rise of this from 1940, highly recommended reading.
