
OpenAI Says Models Programmed To Make Stuff Up Instead of Admitting Ignorance (theregister.com)
AI models often produce false outputs, or "hallucinations." Now OpenAI has admitted they may result from fundamental mistakes it makes when training its models. The Register: The admission came in a paper [PDF] published in early September, titled "Why Language Models Hallucinate," and penned by three OpenAI researchers and Santosh Vempala, a distinguished professor of computer science at Georgia Institute of Technology. It concludes that "the majority of mainstream evaluations reward hallucinatory behavior."
The fundamental problem is that AI models are trained to reward guesswork, rather than the correct answer. Guessing might produce a superficially suitable answer. Telling users your AI can't find an answer is less satisfying. As a test case, the team tried to get an OpenAI bot to report the birthday of one of the paper's authors, OpenAI research scientist Adam Tauman Kalai. It produced three incorrect results because the trainers taught the engine to return an answer, rather than admit ignorance. "Over thousands of test questions, the guessing model ends up looking better on scoreboards than a careful model that admits uncertainty," OpenAI admitted in a blog post accompanying the release.
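To make the scoreboard effect concrete, here is a minimal simulation sketch (my own illustration, not something from the paper): a model that always guesses versus a careful model that abstains when it doesn't know, both graded by a typical binary benchmark that gives no credit for "I don't know".

```python
import random

random.seed(0)
N = 10_000        # benchmark questions
p_known = 0.6     # fraction of questions either model actually knows

def grade(points):
    """Typical benchmark grading: 1 for a correct answer, 0 for anything else."""
    return sum(points) / len(points)

guesser, careful = [], []
for _ in range(N):
    knows = random.random() < p_known
    if knows:
        guesser.append(1)                      # both answer correctly when they know
        careful.append(1)
    else:
        guesser.append(random.random() < 0.25) # blind guess, occasionally lucky
        careful.append(0)                      # "I don't know" earns nothing

print("guesser:", grade(guesser))   # ~0.70
print("careful:", grade(careful))   # ~0.60
```

Under that grading, the guesser comes out ahead every time, which is exactly the incentive the paper is objecting to.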
No shit (Score:2)
Watson.
Re:No shit (Score:4, Interesting)
Re: (Score:2)
Concurrence in general but I think the underlying problem is that it doesn't ask for clarification when it should. Even worse than the verbosity thing.
Re: (Score:2)
Re: (Score:2)
At first I thought the verbosity thing was kind of cool. Now I wish it would give me more precise answers. And... (me complaining) when I feed it code, I have to specify to change one small part of it, but in general it seems to want to modify all of it, and in many cases it inserts bugs. So I learned: only give it small snippets.
Often the verbosity of the response can be tailored via the appropriate additional qualifying words in the prompt. So, couldn't this be transparently baked into a personalized interface? That is, allow the user to specify the preferred verbosity as part of personalized settings and silently add the appropriate additional words to the prompt each time. The user can still explicitly ask for more verbosity when desired.
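As a sketch of what that could look like (the setting names and wrapper are hypothetical, not any real product's API), the interface would just prepend the stored preference to the raw prompt:

```python
# Hypothetical personalization layer: silently prepend a verbosity preference.
VERBOSITY_PREFIXES = {
    "terse":   "Answer in one or two sentences, with no preamble.",
    "normal":  "",
    "verbose": "Explain your reasoning step by step.",
}

def apply_user_preferences(prompt: str, verbosity: str = "terse") -> str:
    """Wrap the raw user prompt with the user's stored verbosity preference."""
    prefix = VERBOSITY_PREFIXES.get(verbosity, "")
    return f"{prefix}\n\n{prompt}".strip()

print(apply_user_preferences("Why does my capacitor circuit lose energy?"))
```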
Anecdotal experiences with gen AIs (Score:2)
Well I can report that just telling the genAIs to be less verbose doesn't help, though I recall DeepSeek promising to try.
Re: No shit (Score:2)
Have you ever tried to instruct it to ask you for clarifying questions before providing an answer?
More anecdotal experiences with gen AIs (Score:2)
Yes, I have tried to encourage the genAIs to ask questions a number of times in various ways, but with no significant success that I can recall.
I can definitely recall the results of my last paired experiments. I prepared a short software specification and gave that text to DeepSeek and ChatGPT.
The DeepSeek result was better in terms of how I described the desired appearance, but one of the four results was wrong. DeepSeek had clearly misunderstood that part of the problem and did NOT ask for clarification.
Re: (Score:1)
The potential energy itself is lost because that stored charge in the small capacitor pushes charge out to spread evenly to the initially uncharged capacitor. The energy was used up in the electrons pushing away from each other. Even with ideal capacitors and zero resistance, the energy need not be lost as heat.
Sort of like letting a compressed spring go. Even if the spring did no work lifting or pushing against any other object, it still lost potential energy springing itself back to rest.
Re: (Score:2)
Re: (Score:1)
The lumped circuit element model with ideal wires, capacitors, and instantaneous effects is not reality, and ltspice doesn't capture the whole picture. Simulating what would happen with electrons in a capacitor with superconducting plates and superconducting wires in between, would show the effects of magnetic fields and electric fields and EM radiation. In a zero-resistance environment, it would likely oscillate (due to the changing magnetic field and the wires having intrinsic inductance) and dissipate so
Re: (Score:2)
I recently gave AI's a paradox. When an ideal 10uF cap is charged to 100V, it has x charge, when an uncharged cap of 100uF is placed in parallel...
Is the 100uF cap also ideal?
Ideal Capacitors not the Problem (Score:2)
Re: (Score:2)
I thought his question was more than clear enough TBH.
You're being more than needlessly pedantic about what parallel means here to the point of being wrong. It doesn't imply that it's connected to something else. If I have an array of caps connected in parallel and disconnect that array from the supply, they have not ceased to be in parallel. Parallel simply implies that you have a bunch of components where all their terminal A's are connected together and all the terminal B's are connected together.
Once you kn
Re: (Score:2)
You're being more than needlessly pedantic
No, I'm being appropriately pedantic for a physics question. When you have a capacitor charged to 100V and another is connected in parallel to it the usual implication is that this too will be charged to 100V since connecting something in parallel implies the same pd across the device. If the capacitor were already disconnected from the external power then parallel and series have no meaning since they would both be the same as your circuit now consists of just two capacitors.
Thus, in specifying paralle
Re: (Score:2)
If the capacitor were already disconnected from the external power then parallel and series have no meaning since they would both be the same as your circuit now consists of just two capacitors.
Well, no, because the person clearly said how they were connected. Name the terminals of a capacitor A and B and indicate which capacitor by 1 and 2. A paralleled pair of capacitors has A1 and A2 connected and B1 and B2 connected. You now have a 1 port network with A1/B1 as the ports.
A series pair has A1 and B2 disc
Re: (Score:2)
Re: (Score:2)
The literal definition of a parallel circuit is one where the circuit divides and the current is split between two components - look it up.
You know I actually reached behind me and grabbed Horowitz & Hill off the shelf just because.
Looks like someone has h4x0rized it so you can look too:
https://kolegite.com/EE_librar... [kolegite.com]
Page 2, Figure 1.1, "parallel connection". See the lack of an EMF in that diagram?
When the OP said "capacitors in parallel", that's exactly what he means. Those wires on the left and righ
Read the Text (Score:2)
Page 2, Figure 1.1, "parallel connection". See the lack of an EMF in that diagram?
Yes, now see the text directly under the diagram which says, and I quote, "Things hooked in parallel (Figure 1.1) have the same voltage across them.". So no the source of the EMF is not shown but it is clearly there as the text underneath states. Thank you for proving my point.
OKey dokey, since you keep dodging this question I'll ask again.
The line you quoted answers the question you asked: if the current does not divide between two or more paths then the devices are not connected in parallel. Apply that to your situation: if there is a current and it splits to pass t
Re: (Score:2)
"Things hooked in parallel (Figure 1.1) have the same voltage across them."
Ah you mean the entire premise of the OP's question?
So no the source of the EMF is not shown but it is clearly there as the text underneath states. Thank you for proving my point.
At this point I have to wonder if you know how capacitors work. If those two floating wires are unconnected to an EMF, the two capacitors will still have the same voltage across them, due to whatever charge/energy is stored in them.
You don't need an externa
Re: (Score:2)
If those two floating wires are unconnected to an EMF, the two capacitors will still have the same voltage across them, due to whatever charge/energy is stored in them.
There are two problems with this. First, what happens if the components are resistors? Is this some new rule you have invented that only applies to capacitors, while resistors in the exact same arrangement will not be connected in parallel just because they cannot generate their own EMF? Next, this argument seems to distinguish whether something is in parallel or series based on the addition of two unconnected wires. If I remove those two unconnected wires, are you trying to tell me that this will convert the
Re: (Score:2)
Well, shit. My phone ate my reply, which was quite long. So sorry, my second go will be worse.
The EMF requirement is yours not mine. I don't see why an emf of 0 is a problem. Just a special case of complex impedances. But given your requirement capacitors fit.
I've also never heard the term "series loop" before, and that only holds for a pair. What would you call it with 3, 4, 5 or more capacitors connected how I specified as parallel?
When you say "no, it only behaves that way", that's what I was talking about.
Re: (Score:1)
Re: (Score:2)
I recently gave AI's a paradox. When an ideal 10uF cap is charged to 100V, it has x charge, when an uncharged cap of 100uF is placed in parallel, the voltage goes to about 9.1V and the total energy is about 0.1x. If energy is neither destroyed nor created, where did the energy go? The AI kind of forgets that I specified an "ideal" capacitor, and makes shit up. If a human can explain this to me, I am all ears. Please keep in mind: "Ideal capacitor", and "equations show this", don't make shit up.
Consider the moment just before the two caps are connected to each other. One is charged to a nonzero voltage and the other is at zero. As the conductor attached to one capacitor top plate approaches the conductor attached to the other capacitor top plate (the bottom plates already sharing a common connection), the E field between them will at some point be high enough to exceed the breakdown field of the medium in between (even if it is free space). Then an arc occurs. That arc is observable, maybe even vi
Re: (Score:2)
Now what AI does with this is a completely different question.
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
And yes, the AI explanation is not based on the basic formulas. That is actually the problem. Humans can sanity check an answer looking for conservation of energy. AI does not take this step. For these types of problems, a little recursion with some simple application of basic laws of physics sh
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
Re: (Score:1)
The energy present in the first capacitor is 1/2 * 10uF * 100V^2 = 0.05 Joules.
The energy present in the two after stabilization is 1/2 * 110uF * 9.09V^2 = 0.00455 Joules.
So, 90% of the energy is lost during the connection.
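For anyone who wants to check those figures, here is the arithmetic in a few lines of Python, assuming ideal capacitors and conservation of charge as in the thread:

```python
# Two-capacitor check: charge is conserved, stored energy is not.
C1, C2 = 10e-6, 100e-6   # farads
V1 = 100.0               # initial voltage on C1; C2 starts uncharged

Q = C1 * V1                              # total charge, conserved: 1e-3 C
V_final = Q / (C1 + C2)                  # ~9.09 V across the parallel pair

E_before = 0.5 * C1 * V1**2              # 0.05 J
E_after  = 0.5 * (C1 + C2) * V_final**2  # ~0.00455 J

print(V_final)                 # ~9.09
print(E_before, E_after)       # 0.05  ~0.00455
print(1 - E_after / E_before)  # ~0.91 of the initial energy is unaccounted for
```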
The "two-capacitor paradox" is well documented. Conservation of energy must count all possible types of energy (not just stored charge), including heat, photons, and electromagnetic radiation during the energy transfer from one capacitor to the other. Even if you have ideal capacitors and
Re: (Score:1)
The product of voltage (V) and capacitance (C) gives you charge (Q), not energy. Q=C*V ; E=1/2*C*V^2.
The energy present in the first capacitor is 1/2 * 10uF * 100V^2 = 0.05 Joules.
The energy present in the two after stabilization is 1/2 * 110uF * 9.09V^2 = 0.00455 Joules.
So, 90% of the energy is lost during the connection.
The "two-capacitor paradox" is a well documented and well explained thought problem. At the moment of connection, the current flow is extremely high, producing a burst of EM radiation (e
Re: (Score:1)
The "two-capacitor paradox" is well documented. The first problem is a dimensional error, comparing charge (e.g. coulombs) and energy (e.g. joules). But another problem is that conservation of energy must count all possible types of energy (not just stored charge), including heat, photons, and electromagnetic radiation during the energy transfer from one capacitor to the other. Even if you have ideal capacitors and zero resistance, with infinite current for an infinitesimal time at the moment of connection
Re: (Score:2)
Re: (Score:1)
Any of flipk's 3 similar responses does a pretty good job of explaining what is going on: losing energy to heat and/or EM radiation. It's not really a paradox; just some difficulty fully accounting for everything that affects the total energy of the system, as opposed to a conservation-of-charge approach where energy can basically be ignored.
It reminds me of the first day of the first physics course I took as a senior in high school. The teacher gave us a pre-test (of multiple-choice questions) as part of a
Re: (Score:2)
The circuit itself has some of the energy.
Re: (Score:2)
Watson.
Speaking of Watson, Slate recently ran a retrospective article on IBM's Watson playing Jeopardy!. Coincidentally, it also mentioned when Watson made up the response that Toronto was a U.S. city:
https://slate.com/culture/2025... [slate.com]
Re: (Score:2)
aka Modeling their developers.
Humans do the same (Score:1)
Wrong explanation (Score:5, Informative)
They make shit up because they have no meta-cognition and don't know any better.
Re:Wrong explanation (Score:5, Insightful)
Re: (Score:1)
This is a great time to remind ourselves that a LLM is just a fancy autocomplete engine.
Well, sure, in the same way a 747 is just a fancy mechanical bird. Which is to say, yes, but no.
Re: (Score:2)
If a bunch of people are marketing a 747 as a vehicle like some alien ship that can pause in mid air, go side to side, and backwards and underwater and into space, then it's worth pointing out it's closer to being a mechanical bird than an alien ship.
Re: (Score:2)
Hence why your AI has a 'seed' value.
Re: (Score:2)
Using the word 'programming' in the headline is a stretch to the point of tearing
Re: (Score:2)
the engine to return an answer, rather than admit ignorance
So it's doing the same thing as any guy would then?
Re: (Score:2)
They make shit up because they have no meta-cognition and don't know any better.
The whole point of this article is to claim that's not true. (I emphasize that we are not talking about the *sense* of knowing things but rather about an information flow that contains redundancy and can differentiate between justified knowledge and anything else.)
I've been saying there was something wrong in the architecture of these models for ages. If it turned out the training was just subtly wrong, that would be disappointing. (Have we needlessly been on the wrong path?)
This is good research (Score:2)
Instead of claiming that their work is perfect, they look for problems to fix
All companies should do this
Unfortunately, some companies not only try to keep problems secret, but also try to punish those who expose the problems
And that's why I cancelled (Score:5, Insightful)
Re: (Score:2)
rtfm :( (Score:1)
Re: (Score:1)
Yes, but if you could make a little AI assistant that had knowledge of all your manuals and other important reference material that had already been vetted, would it not be useful and faster to just ask your assistant for that information? It sure sounds like it would be.
That's actually where I see these LLM type systems working best. You train them on well-vetted material for your specific business, its policies, etc. and the AI can more or less be the very easy to use gateway to that information for your
Re: rtfm :( (Score:1)
Re: (Score:2)
It sounds like you want a search engine. You want to be able to index your own documents with a search engine, right?
Do you really need to have a conversation with the search engine in which it can misunderstand what you're trying to say, based on some association it learned from reddit posts?
Technically you can do what you're asking, but it's less efficient and more prone to error than just using a search engine... which you can also set up locally yourself.
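For what it's worth, the "just index your own documents" approach can be tiny. A toy sketch (pure Python, illustrative file names, no real search engine):

```python
# Toy document search: rank vetted manuals by keyword overlap with the query.
docs = {
    "backup-policy.txt": "Backups run nightly. Retention is 30 days. Restores need a ticket.",
    "vpn-howto.txt": "Install the VPN client, then sign in with your staff account.",
}

def score(query: str, text: str) -> int:
    """Count how many query words appear in the document."""
    return len(set(query.lower().split()) & set(text.lower().split()))

def search(query: str):
    ranked = sorted(docs.items(), key=lambda kv: score(query, kv[1]), reverse=True)
    return [name for name, text in ranked if score(query, text) > 0]

print(search("how long is backup retention"))   # ['backup-policy.txt']
```

A real setup would use a proper indexer, but the point stands: no conversation required.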
Echo chamber effect? (Score:3)
It seems to me that LLM hallucinations which remain undetected in the short term, also give rise to false data which both reinforce the hallucination of the model in question and pollute other LLMs with false data. In other words, does one LLM, unchallenged when it makes shit up, adopt its own answer as fact and then corrupt other models which aren't 'guessing' but are simply propagating a lie unknowingly? Do LLMs treat each other's data as authoritative?
Please excuse the anthropomorphism in what I just wrote - it seems to be the most efficient way of conveying the concepts.
Re: (Score:2)
You mean is it eating its own dog food? At this point, probably. A lot of places are using AI for "news" these days. If all that is scraped, not to mention all the AI-produced video, I could definitely see AI learning misinformation that was generated by a previous iteration of itself. The irony. Also very foreseeable.
Re: (Score:2)
taking anthropomorphism into account... absolutely yes.
It's also deeper than that though... seems like you're thinking of ideas as the atomic components that LLMs use, but they actually use word-parts and associations between them. Model collapse can be much more uh... interesting than just having them constantly come up with false statements.
Re: (Score:2)
taking anthropomorphism into account... absolutely yes.
It's also deeper than that though... seems like you're thinking of ideas as the atomic components that LLMs use, but they actually use word-parts and associations between them.
I am profoundly a 'word' person, and a LOT of my thinking takes place explicitly in the language domain. Lately - prompted by considering LLMs and their behaviour, as well as the differences between humans and other animals - I've been playing with the notion that human ideas are at least in large part "word-parts and associations between them".
Re: (Score:2)
We... do use those methods, but not exclusively?
Some functions are not purely semantic. Given how language models actually rely on this perception to work, it's sort of dangerous to commit too hard to that line of thinking IMO.
That's the most annoying thing about AI (Score:1)
That's hands down the most annoying thing about AI. The one thing I miss from all models is more frequent "I don't know" responses. Knowing that the model doesn't know is a hugely valuable piece of information, and far more useful than hallucinations, half-truths that don't really fit the question, etc.
Generative vs Factual (Score:5, Insightful)
I think the problem is the definition itself: generative AI. If they need to generate an answer, then good chances are it'll end up being whatever the model believes to be correct, statistically speaking.
But on the internet, actual facts and answers are rare. Most help-me threads are 99.99% crap with one or two people providing an actual, helpful response, but those few drown among all the other crap because hey - statistics don't favour truth.
Kind of like in democracy. Two uneducated dropouts have more power than one university lecturer.
But I digress...
If AI models were to return facts, we'd call them search agents, search engines...
Oh, wait.
Re: (Score:2)
Did you read the paper?
The paper is about whether we reinforce it to answer correctly (whatever that means) vs it simply saying "I don't know".
Most LLMs are rewarded when they give a correct answer. Say: 1 for correct, and 0 for incorrect.
So they are rewarded for always trying an answer, even when they aren't certain. The paper suggests rewarding "I don't know": for example, you could give 0.2 points for saying "I don't know", 0 for incorrect, and 1 for correct. This way the model will try to answer but only whe
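Here is that scoring idea in a few lines, using the example numbers from the comment above (1 for correct, 0 for wrong, 0.2 for "I don't know"); the threshold behaviour is the whole point:

```python
# Expected reward for a question the model is only p-confident about.
R_CORRECT, R_WRONG, R_IDK = 1.0, 0.0, 0.2

def expected_reward_if_answering(p: float) -> float:
    return p * R_CORRECT + (1 - p) * R_WRONG

def best_action(p: float) -> str:
    return "answer" if expected_reward_if_answering(p) > R_IDK else "say I don't know"

for p in (0.1, 0.2, 0.5, 0.9):
    print(p, best_action(p))
# With plain 1/0 scoring (R_IDK = 0) answering always wins;
# with 0.2 for abstaining, answering only pays off when p > 0.2.
```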
Re: (Score:2)
Kind of like in democracy. Two uneducated dropouts have more power than one university lecturer.
Yes. The reward system for training humans seems to be similar to the reward system for training AIs, in the sense that many (most?) people value an ill-informed, confidently delivered opinion, if it confirms their biases, over accurate, nuanced, qualified information, whether it comes from a human or a machine!
AI disinformation is poisioning the world (Score:2)
Wow this is very insightful (Score:1)
Re: (Score:2)
OpenAI's models, and most of the LLMs, are trained at least in part by having humans rate their conversations. That's what the "chat" in chatGPT stands for. Humans apparently rate chat partners that make up truthy sounding stuff more highly than chat partners that admit they don't know.
That's a useful finding for a company that makes LLMs. It should be an interesting observation for people who talk to other people too.
Re: (Score:2)
It's the basis of how people get elected and promoted.
Re: (Score:2)
It's not just that. People exhibit overconfidence all the time. From social media and casual conversation to public policy, it's a deep cognitive bias in our species.
Religion is maybe the best example. Don't know what the fuck is going on? Just make up a story that sounds good.
Potential for management positions... (Score:3)
The ability to bullshit one's way up is one of the key management skills...
Re: (Score:2)
If AI can replace managers, then not everything is yet lost.
Just like people (Score:2)
Nt
They BS instead of admitting they don't know? (Score:2)
I just had this conversation with my coworker (Score:2)
Precision and Recall - not that fucking hard (Score:2)
How many billions of dollars and thousands of man hours have AI "researchers" cost humanity, just because they evidently forgot to remember a core principle of machine learning?
https://en.wikipedia.org/wiki/... [wikipedia.org]
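For reference, the two quantities in question, with hallucinated "facts" counted as false positives (that framing is my reading of the comment, not the article's):

```python
# Precision and recall from raw counts.
def precision(tp: int, fp: int) -> float:
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp: int, fn: int) -> float:
    return tp / (tp + fn) if (tp + fn) else 0.0

# A model that answers everything: great recall, mediocre precision.
print(precision(tp=60, fp=40), recall(tp=60, fn=0))    # 0.6  1.0
# A model that abstains when unsure: lower recall, but what it says is right.
print(precision(tp=55, fp=5),  recall(tp=55, fn=40))   # ~0.92  ~0.58
```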
It's literally how LLMs work (Score:2)
Everything they spit out is a "hallucination." They generate text that is *plausible* in that it echoes patterns that it has encountered in its training. What's really amazing is how frequently the output is *correct.*
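A toy illustration of what "plausible continuation" means mechanically (a bigram sampler, nothing remotely like a real transformer, but the failure mode is the same: fluent output with no notion of truth):

```python
import random

# Toy "autocomplete": sample the next word from counts seen in training text.
training = "the cat sat on the mat the cat ate the fish".split()

bigrams: dict[str, list[str]] = {}
for a, b in zip(training, training[1:]):
    bigrams.setdefault(a, []).append(b)

random.seed(1)
word, out = "the", ["the"]
for _ in range(6):
    word = random.choice(bigrams.get(word, training))  # pick a statistically plausible next word
    out.append(word)
print(" ".join(out))   # fluent-looking, not guaranteed to be true
```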
Obvious (Score:2)
It reminds me of the ways some students try to talk themselves out of not knowing an answer. The funny part is that I spontaneously use the same techniques with ChatGPT to figure out if it knows what it is talking about.
GPT is unique though, it's responses are similar to a s
Confabulating (Score:2)
In psychology, this is not hallucination. The proper term for making stuff up like this is "confabulation."
In other words (Score:2)
Just like most humans.
Re: (Score:2)
Enhancement: Just like all politicians.
It works like this (Score:1)
If you optimize for correctness, and being wrong is rewarded just as much as admitting lack of knowledge, then the best thing to do is to guess.
If you have a multiple-choice question and you don't know the answer, you will on average get a higher score if you guess than if you don't answer at all.
If you're a student taking a test and you don't know the answer, you will write down whatever you can come up with that seems most likely, rather than abstaining from answering. That makes sens
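The multiple-choice arithmetic, as a generic illustration (not from the article): with k options and no penalty, a blind guess is worth 1/k on average while a blank is worth 0, so guessing dominates; only a wrong-answer penalty of 1/(k-1) removes the incentive.

```python
# Expected score on a k-option multiple-choice question you cannot answer.
def expected_guess_score(k: int, wrong_penalty: float = 0.0) -> float:
    return (1 / k) * 1.0 + ((k - 1) / k) * (-wrong_penalty)

k = 4
print(expected_guess_score(k))               # 0.25 > 0, so guessing beats leaving a blank
print(expected_guess_score(k, 1 / (k - 1)))  # 0.0, the penalty removes the incentive to guess
```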