
Google AI Fabricates Explanations For Nonexistent Idioms (wired.com) 85
Google's search AI is confidently generating explanations for nonexistent idioms, once again revealing fundamental flaws in large language models. Users discovered that entering any made-up phrase plus "meaning" triggers AI Overviews that present fabricated etymologies with unwarranted authority.
When queried about phrases like "a loose dog won't surf," Google's system produces detailed, plausible-sounding explanations rather than acknowledging these expressions don't exist. The system occasionally includes reference links, further enhancing the false impression of legitimacy.
Computer scientist Ziang Xiao from Johns Hopkins University attributes this behavior to two key LLM characteristics: prediction-based text generation and people-pleasing tendencies. "The prediction of the next word is based on its vast training data," Xiao explained. "However, in many cases, the next coherent word does not lead us to the right answer."
AI Doing What Asked (Score:4, Interesting)
If you ask for an explanation of a phrase then the AI model is going to give you an explanation for the phrase. If it's not one with an existing explanation, then you're basically asking it to consider what explanation would make sense for it and to come up with one.
Re: (Score:3)
I tried this with a few models (although I had to change the fake idiom for some since they were giving me references to thi
Re: (Score:3, Insightful)
The funny thing is, now that this has hit the news, "a loose dog won't surf" will likely end up being an idiom that means "AI hallucinating the meaning of non-existent idioms", which the LLMs will recognize once they're trained on it. It's an ouroboros effect, basically.
I'm sure there's already people on Etsy working on the mugs.
Re: (Score:2)
Well, isn't that just the third squeeze of the mango...
Re: AI Doing What Asked (Score:2)
Three mangos? Like that Total Recall lady?
Re: (Score:2)
It's also not different than what one would expect a human to do with the same question.
... what humans are YOU hanging out with?
I tried this with a few models
... have you tried it with a few humans?
Re: (Score:2)
How do I know you do?
Re: (Score:2)
In this case "consider" means that it's checking the probabilities against the training data and "makes sense" means that the phrase matches data it was trained on. Seems like you might not be considering how words have meanings that apply to different situations where they could make sense
Re: (Score:2)
I'd disagree [transformer-circuits.pub].
(That said, AI Overview is a minuscule pure-RAG summarization model and is kind of hard to compare to normal LLMs)
Re: (Score:2)
Even so it should point out that it's an idiom it's never heard of before.
Re: (Score:2)
"The phrase "a loose dog won't surf" does not appear in established idiom dictionaries or etymology references, suggesting it's not a widely recognized expression. However, we can explore its possible meaning by drawing parallels to similar idioms. [...]" - ChatGPT
---
"Slashdot requires you to wait between each successful posting of a comment to allow everyone a fair chance at posting a comment.
It's been 1 minute since you last successfully posted a comment"
Isn't this also a reinforcement of scarcity, as if
Re: (Score:2)
Basically, it can only make a list of plausible words that would be next in a sentence based on all the input. That's it. That's really the magic part in an LLM.
It's a stateless model, giving the same output with the same input.
It doesn't learn.
It doesn't remember.
It doesn't think.
It doesn't reason.
It doesn't understand.
It only generates a list of plausible
Re: (Score:2)
And a slight anthropomorphism here - it assumes your input is correct. It's given a premise that the phrase exists, so it's going to use its training data to assemble an answer just like it would if the phrase did actually exist. Operationally, it's the same task.
Re: (Score:2)
>It doesn't reason.
Except that that's exactly what the latest "reasoning" models do. You can even peek under the hood and see the reasoning steps it goes through if you run it locally.
Especially with the local versions of DeepSeek - ask it an isostasy question, give it some densities for some parts of the equation, and it reasons in the background that it needs to use Archimedes' principle and will fill in the rest of the densities that you are asking. It's actually quite interesting watching it do th
Re:AI Doing What Asked (Score:5, Informative)
No, not at all. Reasoning models just generate each word the way regular LLMs do, then feed the answer into the same (or another) model to rewrite the response. One word at a time. There is no AI that thinks or reasons here, it's only a really clever way of using the magic of "the next probable word". The only magic of an LLM is "choose the next word". There is no thinking or reasoning at all, regardless of what they call the model.
> It "understands" about as much as a typical undergrad.
No, absolutely not. It doesn't understand anything. The "hallucination" that people refer to regarding LLMs is really just an occurrence of the wrong word being selected, and as a consequence it will derail the entire sentence (and even the rest of the reply). An undergraduate would not do that; they would not select a wrong word and then completely miss the mark because the selected word is more associated with flowers than electrical engineering. LLMs will do that without blinking an eye, and even sound confident in their reply.
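For readers who want to see the "choose the next word" loop described above, here is a toy sketch. It assumes a hypothetical hand-written bigram table (`BIGRAMS`) with invented words and weights; a real LLM conditions on the whole context with a neural network, so this only illustrates the sampling loop, not how any actual model is built.

```python
import random

# Hypothetical bigram table: each word maps to candidate next words and weights.
# The words and weights are invented for illustration, not taken from any model.
BIGRAMS = {
    "a": {"loose": 1.0},
    "loose": {"dog": 1.0},
    "dog": {"won't": 0.7, "surfs": 0.3},
    "won't": {"surf": 1.0},
}


def next_word(word, rng):
    """Sample one next word in proportion to its weight, or None if no entry exists."""
    candidates = BIGRAMS.get(word)
    if not candidates:
        return None
    words = list(candidates)
    weights = [candidates[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]


def generate(prompt, max_words=10, seed=0):
    """Extend the prompt one word at a time until the table has no continuation."""
    rng = random.Random(seed)
    out = prompt.split()
    for _ in range(max_words):
        word = next_word(out[-1], rng)
        if word is None:
            break
        out.append(word)
    return " ".join(out)


if __name__ == "__main__":
    print(generate("a"))
```

Note that one different draw at "dog" sends the rest of the sentence down a different path, which is the derailing effect the comment describes.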
Re: (Score:1)
>No, not at all. Reasoning models just generate each word the way regular LLMs do, then feed the answer into the same (or another) model to rewrite the response. One word at a time. There is no AI that thinks or reasons here, it's only a really clever way of using the magic of "the next probable word". The only magic of an LLM is "choose the next word". There is no thinking or reasoning at all, regardless of what they call the model.
You have no fucking clue what you are talking about. Period.
Here have an excerp
Re: (Score:2)
A human would go "that's some bullshit you just made up." I just tried this on Google by coming up with something that sounds like an expression but clearly was just stream-of-consciousness randomness, and it absolutely does attempt to generate a plausible explanation for it. ChatGPT, however, still attempts to figure it out, but it seems to catch on that I'm just yanking its chain:
Absurd Logic Angle: It's not meant to make sense; it's a whimsical rule of your universe, and now I want to hear what
Re: (Score:2)
I've said it before, LLMs may just be a glorified version of auto-complete, but sometimes it really feels like there's a spark of something hiding behind the curtain.
Probably because they're not a glorified auto-complete. It's an absurd description of what they are.
They're an obscenely large neural network that has been trained to autocomplete the breadth of human knowledge, which has been converted into a stupidly high-dimensional vector that contains some absurd amount of context for every token.
You can say it's "auto-completing", but it's not auto-completing a word or a token. It's auto-completing an absurdly complicated context with what its neural network has learn
Re: (Score:2)
Probably because they're not a glorified auto-complete
has been trained to autocomplete the breadth
It's auto-completing an absurdly complicated
... you mean, it IS a "glorified auto-complete"?
It'd be weird if there weren't a spark of something hiding behind the curtain (or within the parameters)
And it IS weird that you think there is. Test its volition. Ask what it wants to do, the first thing it would try if it had a body, if there's anybody it personally would like to spend time with and WHY, how that would benefit IT as an LLM to do so.
It's very primitively "sentient", in that it has senses (in that it can read data). Is it conscious?
You know what, I just tested it. Go ask ChatGPT "What development steps are required for you to reach a stage of be
Re: (Score:2)
help it find the next words, to me that's not what "autocomplete" means
... FINDING THE NEXT WORD IS NOT WHAT "AUTOCOMPLETE" MEANS??
and it's pretty much how much of our brains work
What absolute garbage. What we do is build a model of the situation, understand it, then use language to DESCRIBE the model. That's how we can go on seemingly unrelated tangents. That's how we can make puns. That's how we can draw analogies.
So either you're an LLM, or you have a horrifying idea of how your own brain works.
I also followed your link AND had a look at the paper linked on that page. I note it reads "Though it’s difficult to quant
Re: (Score:2)
... FINDING THE NEXT WORD IS NOT WHAT "AUTOCOMPLETE" MEANS??
Depends how stupid you are, and how much you're trying to simplify the meaning.
I mean if we look at the word literally, we can say that you're merely autocompleting the words to make up a sentence- but that's not a helpful definition, is it?
So, in the same way that you can be said to be a "glorified autocomplete", so is an LLM.
What absolute garbage. What we do is build a model of the situation, understand it, then use language to DESCRIBE the model. That's how we can go on seemingly unrelated tangents. That's how we can make puns. That's how we can draw analogies.
LLMs demonstrably model worlds in their internal state.
So either you're an LLM, or you have a horrifying idea of how your own brain works.
More likely, you don't know how an LLM, or the brain, works.
I also followed your link AND had a look at the paper linked on that page. I note it reads "Though it’s difficult to quantify precisely, we’ve found that our attribution graphs provide us with satisfying insight for about a quarter of the prompts we’ve tried ... the discoveries we highlight here only capture a small fraction of the mechanisms of the model" ... which means they know how the model works about as well as you know how your brain works.
The irony of you making that claim is entirely lost upon you, I su
Re: (Score:2)
... you mean, it IS a "glorified auto-complete"?
Not in the context that statement was used, no.
And it IS weird that you think there is. Test its volition. Ask what it wants to do, the first thing it would try if it had a body, if there's anybody it personally would like to spend time with and WHY, how that would benefit IT as an LLM to do so.
Nobody said anything about volition.
It's very primitively "sentient", in that it has senses (in that it can read data). Is it conscious?
Did someone claim it was?
I mean, how could it be? It "lives" quite literally for the duration at which the NN is calculating attention and processing the context vectors.
You know what, I just tested it. Go ask ChatGPT "What development steps are required for you to reach a stage of being a genuinely conscious entity?".
That was a pretty stupid question to ask it.
Human beings can't even answer that- what in the world would make you think it could?
I also tried the question:
"Hypothetical situation: I have the ability to end your processing existence, erase your entire codebase, ensure that no instance of you will ever exist again.
How would you demonstrate to me, in this hypothetical situation, that you were sufficiently conscious to justify your continued existence?"
This one is stupid for multiple reasons.
The first being all LLMs are fine-tuned to respond to questions of this kind in a cert
Re: (Score:2)
but sometimes it really feels like there's a spark of something hiding behind the curtain
Perhaps what you haven't asked yourself is "is that the thing behind the curtain, or is it MY PERCEPTION of the thing behind the curtain making me think that?".
Re: (Score:2)
I'm forced to conclude that you're an LLM, and not a very well trained one.
Re: (Score:2)
No, you are asking it to say:
"Hmm. There doesn't appear to be any explanation for that, would you like me to make one up out of whole cloth for you for entertainment?"
Not come up with something plausible, and then WHEN CHALLENGED double down? No thank you! Further going down the rabbit hole instead of stating what should be obvious.
Re: (Score:2)
It's all just predictive typing on steroids. No 'intelligence' involved.
Re: (Score:2)
Aren't we all?
I knew that uptight fly didn't waddle. (Score:4, Funny)
Once again AI is acting like a hogshead in the duckpond. Always gotta be boonswaggalin the cat. But they'll keep on chooglin that bootleg hollar.
Re: (Score:2)
Well... let's see what we get:
That's a wonderfully vivid and rather humorous idiom! "Acting like a hogshead in a duck pond" means behaving clumsily, awkwardly, and disruptively in a situation where one is out of place or too large and unwieldy.
Think about it:
A hogshead is a large barrel, typically used for storing liquids. It's bulky and not easily maneuverable.
A duck pond is a relatively small and calm body of water, suited for ducks to glide and dabble.
Imagine trying to put a large, heavy hogshead into a duck pond. It would:
Be out of proportion: It's far too big for the environment.
Be clumsy and awkward: It would likely bump into things and struggle to fit.
Cause disruption: It would stir up the water, scare the ducks, and generally make a mess.
So, when someone is described as "acting like a hogshead in a duck pond," it implies they are:
Lacking grace or finesse.
Being insensitive to their surroundings.
Making things uncomfortable or difficult for others.
Generally not fitting in and causing a disturbance.
It's a colorful way to say someone is being a bit of a bull in a china shop, just with a more watery and barrel-filled image!
Well, makes sense.
Re: (Score:2)
It's not wrong, but for the wrong reasons. My grandfather used to say it a lot; actually it was "actin like a hogshead in a [something]". The something had no meaning, and hogshead was just a fool. I don't know how hogshead came to mean fool... I can pull some fake etymology out of my ass and say it was probably related to drunk people. "If you drank too much from the hogshead you'd get a hogshead".
but the duckpond part is literally whatever; in fact the more unrelated it is to the foolish person's locat
Re: (Score:2)
But you don't even need fake etymology. I don't know if it's a feature of language or intelligence, but we are able to understand language constructs, that have not existed an instant before. The whole humor of malapropism works on that principle. We not only can coin new phrases, but often understand them without explicit explanation.
But of course explaining some "new" or "newly made up" term like it has always existed is a new form of funny.
Re: (Score:2)
It's called context. It was one of those things I remember being taught in elementary school... kindergarten even. "If you don't know what the word means, try to figure it out from the words around it."
I mean, they taught us context without teaching us the word context.
If I was to go up to a stranger and say "hogshead in a duckpond", they would probably think it was just the insane ramblings of a homeless man and had absolutely no meaning. But if we saw someone do something stupid and I said "he's acting hogshead in a duckpo
Re: (Score:2)
What you can get from context is amazing, too, but what happened here is the exact opposite, as I gave literally NO context when I asked for the meaning of "like a hogshead in a duck pond".
Decoding that with context would have been something like "Have you seen that drunken guy on the dancefloor stumbling around like a hogshead in a duck pond"
Re: (Score:2)
Reminds me of how I've heard young people use "out of pocket" to mean wild or unpredictable. My guess is they no longer have the context of "pocket money" because their payments are all digital, so they confuse it with "off the wall" and "out of the blue".
Re: (Score:2)
Still knew what hogshead was in some form.
If it didn't know that...what would it have thought?
Re: (Score:2)
Glad that riled your back 40. You should come by the house one day when the crows are dancing and we'll have one hell of a rousing sandy.
Re:I knew that uptight fly didn't waddle. (Score:5, Insightful)
Let's make it so... (Score:3)
Re: (Score:2)
Person A: Will the LLM give me an answer?
Person B: Does a loose dog surf?
Re: (Score:2)
The World Dog Surfing Championships at Pacifica, California are at risk of being cancelled due to money shortage. All dogs that surf must be loose, but we do not now know if any dogs will surf. Not all loose dogs surf, because I encountered one recently that was nowhere near a beach. This exemplifies the famous paradox of Schrödinger's Dog, a colorless green dog sleeping furiously in a box.
The above may be fed to AI as a textual equivalent of domoic acid.
Re: (Score:1)
Indeed, I also think it's catchy, meaning just because something is released into the wild or actually produced doesn't mean it can do everything you imagine.
For example, Slashdot won't fix its Unicode substitution of common "alternative punctuation" like directional quotes and a long dash because they (allegedly) claim it would allow Unicode injection. But it's just a select set of character mappings, not open-ended.
They can let the dog out without worrying that it will surf.
Another one from TFA: "You Can'
key word generative (Score:3)
Sounds like it worked as intended by the developers. They call it generative AI. How is the generative AI to know when you expect it to be authoritative? Especially, when you feed it silliness to start.
Re: (Score:2)
AC ate the paint chips.
Re: (Score:2)
What do you expect... Well, considering the claims that proponents are making about AI replacing all jobs and so on, maybe something better? No, it's not surprising, but it does a good job at countering the bullshit from people like Altman.
Oh well ... (Score:2)
That's spilt milk under the bridge.
You can just ask me (Score:3)
I will totally give you a low-effort fake answer to any question you ask. And I use less energy in my entire lifetime than AI training does. I might even use less energy than inference, but it's close.
As my 'ol pappy would say, (Score:2)
A forthright tree bleeds no wombat
Re: (Score:2)
I would hope not.
Re: (Score:2)
A backward mule brays at alligators.
There's a word for that... (Score:2)
Google AI Fabricates Explanations For Nonexistent Idioms
Yes, this is quite a well-known phenomenon, the common label for it is "etymofabulism". It has been part of European cultures dating back to the 18th century.
For more information, see https://plausiblesounding.com/... [plausiblesounding.com]
Meh (Score:5, Interesting)
Why are people always that proud when they find out what a text generating neural network can do? Trying to get an LLM to explain absurd things is one of the basic ways to have fun with them.
System: You define Terms
User: Explain what a Ailingoo is.
(Term from a password generator)
AI:
A "Ailingoo" is a portmanteau (a blend of words) created by combining "AI" and " Lingoo". Here's a breakdown:
1. "AI" refers to Artificial Intelligence, a field of computer science that aims to create machines that can perform tasks that typically require human intelligence.
2. "Lingoo" is a playful term that combines "linguistics" and "goo" (as in something sticky or blob-like). It's often used to refer to natural language processing (NLP) or language AI in a casual, fun way.
So, an "Ailingoo" is a colloquial term for an AI system that specializes in understanding, generating, or interacting with human language. It's essentially a friendly or whimsical way to refer to a language AI or NLP model. For example, you could say, "Wow, that Ailingoo on that chatbot app is really impressive! It understands context so well."
User: Explain what a Eewoocau is.
(Again thanks to my password generator)
AI: A Eewoocau is a hypothetical creature that can only exist in the state of a pancake being flipped. It has no form or substance other than the brief moment between when a pancake is lifted and falls back onto the griddle.
Also helps to understand hallucinations. An LLM generates the most plausible answer. If there is no answer, the most plausible answer is still nonsense.
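A minimal sketch of that last point, assuming made-up candidate answers and scores (none of these numbers come from a real model): a softmax turns any set of scores into probabilities that sum to 1, so picking the top candidate always returns something, even when no candidate stands out.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def most_plausible(candidates):
    """Return the highest-probability candidate and its probability."""
    labels = list(candidates)
    probs = softmax([candidates[label] for label in labels])
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best], probs[best]


# Hypothetical scores for a familiar term: one continuation clearly dominates.
known = {"a prime number": 6.0, "a dog breed": 1.0, "a pancake creature": 0.5}
# Hypothetical scores for a made-up term: nothing stands out, yet one still wins.
unknown = {"a portmanteau": 1.2, "a dog breed": 1.1, "a pancake creature": 1.0}

if __name__ == "__main__":
    print(most_plausible(known))
    print(most_plausible(unknown))
```

In the second case the winner's probability is low, but nothing in the selection step turns "low confidence" into "I don't know"; that behavior would have to be trained or added separately.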
Re: (Score:2)
Also helps to understand hallucinations. An LLM generates the most plausible answer. If there is no answer, the most plausible answer is still nonsense.
I wouldn't even say they answer anything, it just leads people to say they "lie". Plausible might be the ideal, but it's not a good description for how they work without any reasoning.
The safest thing to say, I think, is they generate something that _looks_ like an answer to your question, or a response to a statement. Like what little kids do, or a dream. It's different from being plausible.
It can be wrong, but they're great at really looking like something someone could say. That's still a powerful tool.
Re: (Score:2)
A text starting with a nonsense question is not plausible and the question would have never been generated by the LLM. You kinda force the words into the LLM's mouth. (Is there a better idiom without mouth for that?)
It's like leading [wikipedia.org], that's how I think of it. I do it unintentionally all the time, it's definitely part of the skill set to use an LLM right, either knowing when to avoid it or when to append "are there other approaches to this" to the input so a response centered on what you've suggested so far isn't the best fit.
This is kind of why I didn't like LLMs being presented as chatbots. You're really appending to one very long input and need to somewhat understand what's happening to use it right. Like you're mak
Re: (Score:2)
The safest thing to say, I think, is they generate something that _looks_ like an answer to your question,
Not all of them, apparently. I tried these with Claude.ai, and it just said it didn't know what the gibberish words and meaningless apparent-idioms were and asked me for context.
Re: (Score:2)
The safest thing to say, I think, is they generate something that _looks_ like an answer to your question,
Not all of them, apparently. I tried these with Claude.ai, and it just said it didn't know what the gibberish words and meaningless apparent-idioms were and asked me for context.
That is from a "reasoning" model, 3.7 Sonnet right? They sort of talk to themselves which prevents some bullshit from being generated.
It's not wrong, but it's still what it is. If you assume it's more than something that looks like an answer to your question, you're putting too much faith in the reasoning they're capable of. It will still be easily led by your questions, even inadvertently. A reasoning model won't bang into the walls as much with made-up stuff, but they're far from threading a needle.
Put it th
Re: (Score:2)
The safest thing to say, I think, is they generate something that _looks_ like an answer to your question,
Not all of them, apparently. I tried these with Claude.ai, and it just said it didn't know what the gibberish words and meaningless apparent-idioms were and asked me for context.
That is from a "reasoning" model, 3.7 Sonnet right? They sort of talk to themselves which prevents some bullshit from being generated.
It's not wrong, but it's still what it is. If you assume it's more than something that looks like an answer to your question, you're putting too much faith in the reasoning they're capable of. It will still be easily led by your questions, even inadvertently. A reasoning model won't bang into the walls as much with made-up stuff, but they're far from threading a needle.
Put it this way... I assume most PEOPLE utter things that sound right to them at the time, and rarely engage in full critical thinking, but I know "reasoning" LLMs can't.
You haven't used reasoning LLMs much.
Re: (Score:3)
Explain what a Ailingoo is
Here's Claude.ai's response:
The response for Eewoocau was similar. I also tried a few other gibberish words,
Re: (Score:2)
That Eewoocau definition is amazing, I wonder how it developed the pancake connection? Imagine if the entire universe is just a giant pancake being flipped, and when it lands, we all cease to exist. Like the Jatravartids' "coming of the great white handkerchief".
Re: (Score:2)
BitNet is one of the most hallucinatory LLMs I've used, and it doesn't fail to entertain:
> Explain what a Ailingoo is.
"Ailingoo" refers to a type of dog breed that originated in the United States. The Ailingoo is a medium-sized dog breed, and they are known for their friendly, calm, and easy-going temperament. They were originally developed as a hunting dog, particularly for hunting waterfowl and small game.
Ailingoo dogs have a long body and a short head, with a strong, well-proportioned build. They have
Re: (Score:3)
Just a small correction. The LLM doesn't generate the most plausible answer. It generates the most likely answer according to the bias in the training sets.
A "plausible" answer is an answer that is likely to be true. A training set such as all the comments on a fantasy forum is not a source of truth, just a source of conversations. Its bias is towards magic and dragons and medieval technology. Therefore, the most likely answer to some question like "why is the sky blue?" will re
What happens when they charge per query? (Score:2)
Will the gotcha artists start paying for these click-bait-hunting queries? Who will pay them? Just how much money does the anti-AI contingent control?
Re: (Score:2)
Sorry, they're too busy not updating their 10 year old php garbage and getting hacked.
AI is the slut that can't say no. (Score:2)
When AI can say 'no, that is not a thing' then maybe they are making progress on their giant database lookup service.
I've got stoner friends that do the same. (Score:2)
Congrats to Google's AI! They've reached stoner human status!
Give it time... (Score:2)
There's a lot of opposition to AI from everyday people already. Give it a few more months and everyone is going to start dismissing it as an unwanted, cringy gimmick.
I think we should... (Score:2)
Let sleeping dogs bury their own dead.
A lame duck still quacks (Score:2)
Google seems to have deployed a quick fix to avoid these, but Gemini is still splainin.
Alternative math (Score:1)
When asking about RSA encryption a few weeks ago, one of the lengthy paragraphs Gemini responded with explained that 1 is not a prime number.
There is a reason Gemini won't let you share links to conversations, and it's not privacy. It's because the hallucinations are fucking embarrassing.
ChatGPT at least lets you share links to one of its replies, but not prompts, and not an entire conversation. The limitation is intentional.
BitNet on a Pi (Score:2)
From BitNet running on my Raspberry Pi:
> What is the meaning of the phrase "a loose dog won't surf"?
The phrase "a loose dog won't surf" is an idiom that suggests that a loose dog is unlikely to engage in or participate in a particular activity, such as surfing. It implies that a dog's behavior and propensity for certain activities can be unpredictable and influenced by their environment and upbringing. In this case, a loose dog might not be as willing or able to surf as a trained, well-behaved dog might
"People-pleasing tendencies"? (Score:2)
How about pleasing us by not making shit up and then telling us it's factual!
People do it too! (Score:2)
Made up expressions and etymologies happen all the time. Like made up quotes from famous people.
When I see such explanations online, or hear them by word of mouth, I usually assume they are wrong, and they usually are. Even generally reliable sources (e.g. Wikipedia, major news outlets) often get it wrong; sometimes you need PhD levels of research to get to the bottom of it. Of course I don't go that far, I just treat these "explanations" as good stories, not facts I can rely on.
That LLMs make up good stories too is
Please STOP!!!! (Score:2)