OpenAI's 'Embarrassing' Math (techcrunch.com) 41
An anonymous reader writes: "Hoisted by their own GPTards." That's how Meta's Chief AI Scientist Yann LeCun described the blowback after OpenAI researchers did a victory lap over GPT-5's supposed math breakthroughs. Google DeepMind CEO Demis Hassabis added, "this is embarrassing." The Decoder reports that in a since-deleted tweet, OpenAI VP Kevin Weil declared that "GPT-5 found solutions to 10 (!) previously unsolved Erdos problems and made progress on 11 others." ("Erdos problems" are famous conjectures posed by mathematician Paul Erdos.)
However, mathematician Thomas Bloom, who maintains the Erdos Problems website, said Weil's post was "a dramatic misrepresentation" -- while these problems were indeed listed as "open" on Bloom's website, he said that only means, "I personally am unaware of a paper which solves it." In other words, it's not accurate to claim GPT-5 was able to solve previously unsolved problems. Instead, Bloom wrote, "GPT-5 found references, which solved these problems, that I personally was unaware of."
"GPT-5 found refs, which solved these problems" (Score:5, Interesting)
As a language model, GPT-5 finding references that solved Erdos problems which the professional mathematician who maintains the Erdos Problems website was unaware of shows an actually useful skill for the LLM.
Re: (Score:2)
Re:"GPT-5 found refs, which solved these problems" (Score:4, Insightful)
Re:"GPT-5 found refs, which solved these problems" (Score:4, Insightful)
Re:"GPT-5 found refs, which solved these problems" (Score:5, Insightful)
Re: (Score:3)
OpenAI's tweet was worded to suggest that GPT-5 found something new, when it "just" did a literature (or Google for Gen Z) search. If someone pulled that kind of stunt in a PhD thesis, it would be revoked post haste.
Is the criticism that OpenAI is guilty of plagiarism, that they are claiming that they found solutions on their own? It seems more like the issue is that OpenAI thought they were clear in claiming to have found already existing solutions that were not widely acknowledged but others are interpreting the claims as having actually found new solutions.
Both OpenAI and Meta/Google have their own motivations for claiming what they are claiming, but it seems like this is not necessarily a clear slam dunk for either side.
Re: (Score:2, Insightful)
Re: "GPT-5 found refs, which solved these problems (Score:2)
You could read it as meaning "discovered something no one else has discovered before", but that's adding an extra layer of interpretation on top of the literal reading.
Re: (Score:2)
Re: (Score:2)
Re:"GPT-5 found refs, which solved these problems" (Score:4, Insightful)
"... GPT claims are not "embarrassing", except perhaps to competitors."
Yes they are, the claims were utterly false.
Re: (Score:3)
They weren't "utterly false", only "substantially false".
I.e., the problems had been solved. Just not by ChatGPT.
Re: (Score:2)
splat (Score:4, Funny)
I've been there.
I was so frustrated that I taped my banana to the wall!
Re: (Score:3)
Actually Jackson Pollock paintings had (have?) hidden meaning. I don't think anyone knows what that meaning is, but multiple of his (authentic) paintings have been analyzed and there are strong patterns in the way he used colors and the angles at which the lines cross. Presumably there are other patterns that weren't checked for. It's like a coded message that nobody has the key for. But using this you can easily detect fake Pollock paintings.
So, no, you couldn't do an equivalent painting. Just one that
Re: (Score:2)
What you wrote reminded me of Barnett Newman's painting Who's Afraid of Red, Yellow, and Blue, which definitely is not my style, and which younger me would most likely have written off as more or less nonsense. But watching the video Who's Afraid of Modern Art: Vandalism, Video Games, and Fascism [youtube.com], I learned that the colours used are actually hard to reproduce [youtu.be] and require some skill I was unaware of.
Still not my style, but I can now appreciate that other people might have it as theirs.
not nearly in the same realm (Score:2, Interesting)
One is asserting that GPT-5 can do theoretical math at a level above practicing mathematicians. The other is saying that if you spend a hundred billion dollars digesting all of human knowledge, you can build a better search engine.
A better search engine is not going to do any of the things AI zealots are promising. We already know that ChatGPT is pretty good as a search engine. But that won't replace human labor; it will just help highly skilled workers who are able to precisely formulate the questions they
Re:"GPT-5 found refs, which solved these problems" (Score:4, Interesting)
Re:"GPT-5 found refs, which solved these problems" (Score:4, Interesting)
I invented the Bresenham line algorithm; a while later I learned I was not the first. I'm sure that's been replayed a million times.
Gallo discovered HTLV-3, the virus that causes AIDS. He discovered it by stealing the virus from French researchers who isolated it first and made it available in good faith. So, you know, the details are important. Here, promoters of AI made claims that are utterly untrue; it would be interesting to know whether they made these errors out of deceit or ignorance.
Re:"GPT-5 found refs, which solved these problems" (Score:4, Informative)
Re: (Score:2)
I suspect that the people who had the AI "solve" the problem were aware of what it had done, but that the PR guys got (or understood) a simplified story. No actual lies were necessarily involved, since there may have been no intent to deceive.
OTOH, the statement was substantially false.
When corporations tell self-promoting falsehoods it's not necessarily with any intent to deceive. But you still shouldn't trust them when there's not sufficient transparency. And were this a fraud investigation, I think
Re: (Score:2)
my thoughts exactly, and also proof that GPT-5 may actually be more intelligent than the a-holes who created it and promote it.
Re: (Score:2)
For very limited values of "useful". If these references were actually really useful, the maintainer would have known them.
Re: (Score:2)
LLMs are a red herring (Score:2)
Physics will always win.
A better search engine, or... (Score:2)
That he didn't bother to search, or...
The volume of garbage that AI spews out has made traditional search engines less useful, or...
The people who actually solved the problems did not bother to contact the guy who tracks them.
Quite possibly all these "or"s should be replaced with "and".
Re:A better search engine, or... (Score:4, Insightful)
Papers published in small journals have been overlooked since before the time of Gauss (who overlooked Lobachevsky).
Great Name! (Score:1)
GPTards, I love it! I'll be using that name to refer to all ChatGPT users going forward.
Re:Stone Tossers. (Score:5, Insightful)
It's a website run mainly by one guy....
"This website was made by Thomas Bloom, a mathematician who likes to think about the problems Erdős posed."
It's not like an official website endorsed by the author or anything. It's just a website of a guy who has some interest in that particular set of problems.
"Is the database up to date (e.g. the open/solved status of each problem)?
No, but that is the eventual goal"
https://www.erdosproblems.com/... [erdosproblems.com]
It's like holding some trainspotter responsible for not knowing about the existence of a particular niche train somewhere in the world.
Er ... (Score:2)
Brilliant Scientist Presents His Creation (Score:2)
From Young Frankenstein:
https://www.youtube.com/watch?... [youtube.com]
What's unfortunate here (Score:4, Insightful)
it's not doing math (Score:3)
It's hallucinating solutions
Re: (Score:2)
No, it's not hallucinating, it's showing math from the reference data
LLM peddlers lying. What else is new? (Score:2)
If these artificial cretins could perform (outside of very limited circumstances), the world would already look a lot different. Instead it has all been smoke and mirrors and lies by misdirection. Sure, a lot of people fall for that, but that does not say anything about LLMs; it says some not very good things about people.
Timeline and Summary (Score:1)
If you want to pick nits... (Score:2)
RTFM - Per the article, the OpenAI Tweet actually said, "GPT-5 found solutions to 10 (!) previously unsolved Erdos problems and made progress on 11 others."
The article doesn't speak to the "made progress on 11 others part," but by all accounts (in the article) GPT-5 did in fact "find" solutions to 10 problems. The Tweet didn't claim that OpenAI solved those 10 problems.
Words matter.
It seems to me like the OpenAI engineer was apologizing for the wording of the Tweet since it could be interpreted