AI Math

OpenAI's 'Embarrassing' Math (techcrunch.com) 41

An anonymous reader writes: "Hoisted by their own GPTards." That's how Meta's Chief AI Scientist Yann LeCun described the blowback after OpenAI researchers did a victory lap over GPT-5's supposed math breakthroughs. Google DeepMind CEO Demis Hassabis added, "this is embarrassing." The Decoder reports that in a since-deleted tweet, OpenAI VP Kevin Weil declared that "GPT-5 found solutions to 10 (!) previously unsolved Erdos problems and made progress on 11 others." ("Erdos problems" are famous conjectures posed by mathematician Paul Erdos.)

However, mathematician Thomas Bloom, who maintains the Erdos Problems website, said Weil's post was "a dramatic misrepresentation" -- while these problems were indeed listed as "open" on Bloom's website, he said that only means, "I personally am unaware of a paper which solves it." In other words, it's not accurate to claim GPT-5 was able to solve previously unsolved problems. Instead, Bloom wrote, "GPT-5 found references, which solved these problems, that I personally was unaware of."

This discussion has been archived. No new comments can be posted.
  • by test321 ( 8891681 ) on Monday October 20, 2025 @02:57PM (#65738862)

    As a language model, GPT-5 finding references that solved Erdos problems, references that the professional mathematician maintaining the Erdos Problems website was unaware of, demonstrates an actually useful skill for an LLM.

    • by drjzzz ( 150299 )
      Agreed. It's like saying "I could've known that." Yeah, but you didn't. Even if the results are not at the level of the new Erdos proofs they might have seemed to claim, GPT's claims are not "embarrassing", except perhaps to competitors.
      • by grantham ( 49250 ) on Monday October 20, 2025 @03:20PM (#65738908) Homepage
        It's embarrassing that they claimed these problems were "previously unsolved".
      • by jsonn ( 792303 ) on Monday October 20, 2025 @03:23PM (#65738910)
        OpenAI's tweet was worded to suggest that GPT-5 found something new, when it "just" did a literature search (or a Google search, for Gen Z). If you pulled that kind of stunt in a PhD thesis, it would be revoked post haste.
        • OpenAI's tweet was worded to suggest that GPT-5 found something new, when it "just" did a literature search (or a Google search, for Gen Z). If you pulled that kind of stunt in a PhD thesis, it would be revoked post haste.

          Is the criticism that OpenAI is guilty of plagiarism, that they are claiming that they found solutions on their own? It seems more like the issue is that OpenAI thought they were clear in claiming to have found already existing solutions that were not widely acknowledged but others are interpreting the claims as having actually found new solutions.

          Both OpenAI and Meta/Google have their own motivations for claiming what they are claiming, but it seems like this is not necessarily a clear slam dunk for either side.

      • by dfghjk ( 711126 ) on Monday October 20, 2025 @03:35PM (#65738938)

        "... GPT claims are not "embarrassing", except perhaps to competitors."

        Yes they are, the claims were utterly false.

        • by HiThere ( 15173 )

          They weren't "utterly false", only "substantially false".
          I.e., the problems had been solved. Just not by ChatGPT.

      • by tomkost ( 944194 )
        Reminds me of a time I was at the Houston Art Museum. We were looking at a Jackson Pollock painting (abstract paint splatters). My friend remarked, "That's awful, that's so easy, anyone can do that, I could do that!!" Then the lady just behind us said to him, "Yes, you COULD have done it, but you DIDN'T, and that's the whole point here." We laughed so hard...
        • splat (Score:4, Funny)

          by hawk ( 1151 ) <hawk@eyry.org> on Monday October 20, 2025 @04:28PM (#65739054) Journal

          I've been there.

          I was so frustrated that I taped my banana to the wall!

        • by HiThere ( 15173 )

          Actually, Jackson Pollock paintings had (have?) hidden meaning. I don't think anyone knows what that meaning is, but multiple of his (authentic) paintings have been analyzed, and there are strong patterns in the way he used colors and the angles at which the lines cross. Presumably there are other patterns that weren't checked for. It's like a coded message that nobody has the key for. But using this you can easily detect fake Pollock paintings.

          So, no, you couldn't do an equivalent painting. Just one that

    • by Anonymous Coward

      One side is asserting that GPT-5 can do theoretical math at a level above practicing mathematicians. The other is saying that if you spend a hundred billion dollars digesting all of human knowledge, you can build a better search engine.

      A better search engine is not going to do any of the things AI zealots are promising. We already know that ChatGPT is pretty good as a search engine. But that won't replace human labor; it will just help highly skilled workers who are able to precisely formulate the questions they

    • by jsonn ( 792303 ) on Monday October 20, 2025 @03:33PM (#65738930)
      It's useful, but not groundbreaking. The equivalent of a minimum spanning tree in directed graphs is called an optimal branching. The algorithm for it was originally known as Edmonds' algorithm in most of the world, after its discoverer. It was later discovered that Chu and Liu had already published essentially the same idea two years earlier in a Chinese journal. Even now, it is still often referenced only under Edmonds' name. This happens often enough.
      • by dfghjk ( 711126 ) on Monday October 20, 2025 @03:43PM (#65738960)

        I invented the Bresenham line algorithm. A while later, I learned I was not the first. I'm sure that's been replayed a million times.

        Gallo discovered HTLV-3, the virus that causes AIDS. He discovered it by stealing the virus from French researchers who had isolated it first and made it available in good faith. So, you know, the details are important. Here, promoters of AI made claims that are utterly untrue; it would be interesting to know whether they made these errors out of deceit or ignorance.

        • by jsonn ( 792303 ) on Monday October 20, 2025 @03:48PM (#65738974)
          I absolutely agree. There's a major difference between finding a previously unknown publication somewhere and discovering something independently. It's useful to be able to do the former, but vastly different from being able to do the latter. Not verifying what the computer did is the real embarrassment for OpenAI.
          • by HiThere ( 15173 )

            I suspect that the people who had the AI "solve" the problem were aware of what it had done, but that the PR guys got (or understood) a simplified story. No actual lies were necessarily involved, as there wasn't necessarily any intent to deceive.

            OTOH, the statement was substantially false.

            When corporations tell self-promoting falsehoods, it's not necessarily with any intent to deceive. But you still shouldn't trust them when there's not sufficient transparency. And were this a fraud investigation, I think

    • by dfghjk ( 711126 )

      my thoughts exactly, and also proof that GPT-5 may actually be more intelligent than the a-holes who created it and promote it.

    • by gweihir ( 88907 )

      For very limited values of "useful". If these references were actually really useful, the maintainer would have known them.

    • by N1AK ( 864906 )
      While true, that's not the same thing as solving unsolved problems which is what OpenAI claimed it had done. I don't know how to do a lot of things, but that doesn't mean Google is solving an unsolved problem if I use Google search to find the answer.
  • Physics will always win.

  • That he didn't bother to search, or...

    The volume of garbage that AI spews out has made traditional search engines less useful, or...

    The people who actually solved the problems did not bother to contact the guy who tracks them.

    Quite possibly all these "or"s should be replaced with "and".

  • GPTards, I love it! I'll be using that name to refer to all ChatGPT users going forward.

  • ... does sound kinda useful?
  • by JoshuaZ ( 1134087 ) on Monday October 20, 2025 @05:11PM (#65739152) Homepage
    What's really unfortunate here is that OpenAI's drastic exaggeration of what happened distracts from the real capabilities on display. Being able to efficiently find sources in the literature is an incredibly useful tool. And even aside from that, there are now multiple examples where professional mathematicians have used GPT-5 in thinking mode to make progress on math problems. Nothing as major as any Erdos problem, but still a clear use. Terry Tao, for example, used GPT-5 in thinking mode to help locate a counterexample to a conjecture here: https://mathoverflow.net/questions/501066/is-the-least-common-multiple-sequence-textlcm1-2-dots-n-a-subset-of-t [mathoverflow.net]. Now, he almost certainly could have done this on his own, but it clearly saved time. Similarly, computer scientist Scott Aaronson used it to get a specific, useful suggestion for a function with particular properties he needed, which he was then able to put to use: https://scottaaronson.blog/?p=9183 [scottaaronson.blog]. In neither of these cases did the LLM do anything deeply groundbreaking. But it clearly helped and likely saved many hours of work. And these systems continue to improve.
  • by drinkypoo ( 153816 ) <drink@hyperlogos.org> on Monday October 20, 2025 @05:30PM (#65739198) Homepage Journal

    It's hallucinating solutions

  • If these artificial cretins could perform (outside of very limited circumstances), the worlds would already look a lot different. Instead it has all been smoke and mirrors and lies by misdirection. Sure, a lot of people fall for that but that does not say anything about LLMs, it says some not very good things about people.

  • There is an excellent timeline and summary on Reddit/math https://www.reddit.com/r/math/... [reddit.com]
  • RTFM - Per the article, the OpenAI Tweet actually said, "GPT-5 found solutions to 10 (!) previously unsolved Erdos problems and made progress on 11 others."

    The article doesn't speak to the "made progress on 11 others" part, but by all accounts (in the article) GPT-5 did in fact "find" solutions to 10 problems. The Tweet didn't claim that OpenAI solved those 10 problems.

    Words matter.

    It seems to me like the OpenAI engineer was apologizing for the wording of the Tweet since it could be interpreted
