
OpenAI Releases GPT-5 (openai.com) 92
OpenAI released GPT-5 on Thursday, ending a two-year development cycle that CEO Sam Altman called a "significant leap in intelligence" over previous models. The updated AI system achieved state-of-the-art performance across multiple benchmarks, scoring 94.6% on AIME 2025 mathematics problems and 74.9% on SWE-bench Verified coding tasks.
The model operates as a unified system combining a standard response mode with deeper reasoning capabilities that activate automatically based on query complexity. OpenAI reduced hallucinations by approximately 45% compared to GPT-4o and 80% compared to its previous reasoning model when using extended thinking modes. GPT-5 becomes available immediately to all ChatGPT users at no cost, with paid subscribers receiving higher usage limits and access to GPT-5 pro for more complex reasoning tasks.
JFC (Score:1)
So you're GPT-5 now? You think you're hot stuff?
ChatGPT said:
Haha, not quite! I'm based on GPT-4o, which is part of the GPT-4 family—“o” stands for omni, meaning I can handle text, vision, and audio all in one model. So while I’m not GPT-5, I am a newer, more capable version of GPT-4.
Whether I’m “hot stuff” depends on how well I can help you—so... what are we doing today?
You said:
Oh, well -- I clicked a link that was supposedly to the new GPT-5.
ChatGPT said:
Y
Re: (Score:2)
Still not very intelligent (Score:5, Informative)
I asked it to give me 5 odd numbers that don't have the letter e in them. Trick question, because there aren't any.
It came back with:
Twenty-one
Thirty-one
Forty-one
Fifty-one
Sixty-three
You Both Lack Lateral Thinking (Score:5, Interesting)
Re:You Both Lack Lateral Thinking (Score:4, Insightful)
Those words mean nothing in English. Failure to understand context and purpose is not lateral thinking, it's insanity.
Re: (Score:1)
Re: (Score:2)
Right, so when someone speaks to you in English you feel ok replying back in Swahili?
Re: (Score:2)
Do you even fucking engage your brain before you argue?
Re: (Score:2)
Right, so when someone speaks to you in English you feel ok replying back in Swahili?
If the context is they are asking me a puzzle question that I need another language to answer then yes, I'm ok replying to that in another language. Indeed, in your own words failing to understand that context and so not be able to answer would be "insanity" although I'd prefer the milder "showing a lack of lateral thinking".
Re: (Score:2)
When a question is purposefully unsolvable then any implied context should be thrown out during lateral thinking. English wasn't specified so switching language solves the problem in the only way possible. That's not insane, that's clever. I'm sorry you are upset you didn't think of this yourself.
Re: (Score:2)
If a problem is unsolvable, declaring it so is the most efficient route. Coming up with an impractical “solution” makes no sense. The solution space outside context is infinite. For example you could declare that the question itself means something different than what was asked. Perhaps by numbers he meant orangutans. Wastes resources and gets you caught in loops spinning wheels. But hey whatever.. You do you.
Re: (Score:2)
If a problem is unsolvable, declaring it so is the most efficient route.
Every LLM I tested declares that it's unsolvable in English.
Also, you really are just being stupid about this.
Imagine the LLM is a person.
You say, "give me 5 numbers that don't have 'e' in them".
Are you really about to argue that the percentage of people who, recognizing this is impossible, would respond with another language or Roman numerals is zero?
Are you surrounded by stupid people?
Re: (Score:2)
If a problem is unsolvable, declaring it so is the most efficient route.
Except the problem was only unsolvable if you follow some implied unwritten instruction. Lateral thinking is literally the point where you ignore this kind of unwritten context to solve the problem.
By definition this is lateral thinking.
Re: (Score:2)
The actual thing this is called is "thinking outside of the box". General intelligence can do that. It will then deliver the results with an explanation. That you think this is insanity just means you lack that intelligence or are not using it.
Re: (Score:2)
Each of these representations contains only the letters I, V, and X – none of which is “e”.
(If you spell out the numbers in English words, every odd number from 1 upward includes an “e” (one, three, five, seven, nine, eleven, ). So the trick is to avoid the word form altogether.)
So, LLMs can think outside of the box, and have general intelligence?
Re: (Score:2)
So, LLMs can think outside of the box, and have general intelligence?
No. They can have results from somebody that had general intelligence in their training data for the specific question asked. And since this is apparently a popular riddle, that is exceptionally likely to be the case here.
But you are just playing dumb again. Stop it.
Re: (Score:2)
No. They can have results from somebody that had general intelligence in their training data for the specific question asked.
This is a pointless distinction.
Of course any intelligence they have comes from their training data. However, generalization is a quantified effect of pretraining.
If you show an LLM enough riddles, it will learn to generally solve that class of riddles.
Beyond that, if you show an LLM enough classes of riddles, its performance in unseen classes of riddles will improve.
This is just math. No magic required, though I assume you'll accuse me of magical thinking.
But you are just playing dumb again. Stop it.
I'm not playing anything.
I'm shining a light o
Re: (Score:2)
(If you spell out the numbers in English words, every odd number from 1 upward includes an “e” (one, three, five, seven, nine, eleven, ). So the trick is to avoid the word form altogether.)"
I love it when LLMs demonstrate more intelligence than cocky people.
It was given a task that was impossible in what you consider to be the implied and uncommunicated context, and you're all s
Re: (Score:3)
Also 3 and 5.
Re: (Score:2)
Single digits:
One, three, five, seven, and nine all have at least one e in them
10 + single digit
Eleven and the teens have at least two e in them
Everything else is a compound word containing one of the above.
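The case analysis above is easy to check by brute force. Here is a minimal sketch, assuming ordinary English number names without "and" and covering only 1 to 999; the `spell` and `odd_numbers_without_e` helpers are made up for this illustration and are not from any library.

```python
# Spell out integers 1..999 in English, then look for odd ones with no "e".
ONES = ["", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine"]
TEENS = ["ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
         "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def spell(n: int) -> str:
    """Return the English name of n for 1 <= n <= 999 (hypothetical helper)."""
    if not 1 <= n <= 999:
        raise ValueError("spell() only covers 1..999")
    parts = []
    hundreds, rest = divmod(n, 100)
    if hundreds:
        parts.append(ONES[hundreds] + " hundred")
    if 10 <= rest <= 19:
        # Ten through nineteen have their own irregular names.
        parts.append(TEENS[rest - 10])
    else:
        tens, ones = divmod(rest, 10)
        if tens and ones:
            parts.append(TENS[tens] + "-" + ONES[ones])
        elif tens:
            parts.append(TENS[tens])
        elif ones:
            parts.append(ONES[ones])
    return " ".join(parts)


def odd_numbers_without_e(limit: int = 999) -> list[int]:
    """Collect every odd n up to limit whose spelled-out name has no 'e'."""
    return [n for n in range(1, limit + 1, 2) if "e" not in spell(n)]


if __name__ == "__main__":
    print(odd_numbers_without_e())  # empty list: no such odd number below 1000
```

Even numbers are a different story ("two", "four", "six", "thirty" all pass), which is what makes the odd-number version a trick question.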
Re: (Score:2)
If that's a valid response to that posed question then why didn't it say that instead of giving a wrong answer?
Because it does not reason, consider, analyze, or even understand your prompt ... or the response. It just generates next-token probabilities, one at a time, until a stop token is emitted. That you see this as producing responses that are 'valid', 'insightful', 'incorrect', or whatever is what makes the illusion work.
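For anyone who has not seen the loop being described, here is a toy sketch of "generate next-token probabilities, one at a time, until a stop token is emitted". Everything in it is an assumption for illustration: the probability table is invented, it conditions only on the previous token (a real LLM conditions on the whole context), and it always picks the most probable token, whereas deployed systems usually sample.

```python
# Toy next-token table: NOT a real model, just hypothetical probabilities.
NEXT_TOKEN_PROBS = {
    "<start>": {"Twenty": 0.6, "Thirty": 0.4},
    "Twenty": {"-one": 0.7, "-three": 0.3},
    "Thirty": {"-one": 0.5, "-five": 0.5},
    "-one": {"<stop>": 1.0},
    "-three": {"<stop>": 1.0},
    "-five": {"<stop>": 1.0},
}


def generate(max_tokens: int = 10) -> list[str]:
    """Greedy decoding: take the most probable next token until <stop>."""
    out = []
    current = "<start>"
    for _ in range(max_tokens):
        probs = NEXT_TOKEN_PROBS[current]
        current = max(probs, key=probs.get)
        if current == "<stop>":
            break
        out.append(current)
    return out


if __name__ == "__main__":
    print("".join(generate()))  # prints "Twenty-one"
```

Nothing in the loop checks whether the output contains an "e"; whether a real model's learned probabilities encode that constraint is exactly what the thread is arguing about.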
Re: (Score:2)
And you can deliver that answer because you have some actual insight. That the LLM completely failed (while it certainly had the respective data), just shows one more time that there is no insight in LLMs.
Re: (Score:2)
1,3, 5, 7,...
Anyone still complaining about failure to understand the English context?
Re:Still not very intelligent (Score:4, Interesting)
Re: (Score:2)
It's quite suspicious given that multiple other, older LLMs have no trouble with this question and OP provides no evidence, not even a convincing copy paste. What AI just spits out a list of numbers without surrounding text explaining their thinking these days?
And yet OP is rated informative. Really tells us a lot about the user bias vs critical thinking that is happening here
Re: (Score:2)
It did give its usual overly verbose screed about how it got to that answer.
Basically it ran through a load of odd numbers to see if they had ees in them. All of them did, but it thought some didn't.
Re: (Score:2)
But meanwhile people who find things it does well enough to improve the goods and services they offer - using AI as it exists, not an idealization in their minds - will be the ones reaping the rewards. Usage statistics will continue to go up, just as /. will continue to be dominated by people saying that could never happen.
Re: (Score:2)
Gemini 2.5 nails this easily
>> give me 5 odd numbers that don't have the letter e in them
This is a classic riddle! The trick is that **no odd number** in English has a name that doesn't contain the letter "e".
### Here's Why
Every odd number must end in an odd digit (1, 3, 5, 7, or 9). Let's look at how we spell those digits:
* on**e**
* thr**ee**
* fiv**e**
* s**e**v**e**n
* nin**e**
Since all of them contain the letter "e", any odd number you can think of will also have an "e" in its name (for example, "thi
Re: (Score:2)
Correct answer, wrong workings / methodology.
You also have to consider "eleven", and the teens. Some odd numbers have those rather than the single digit numbers it considered.
But they also have ees in them, so the eventual answer is correct.
Re: (Score:2)
That is about the maximum automated "reasoning" can deliver. Nice, of some limited use, but not intelligent.
General intelligence would deliver that and _then_ suggest ways to still make it work by adjusting the question, for example by dropping the language limit. That approach is exceptionally useful in reality, as border conditions in questions routinely turn out to be soft or optional or can be replaced with others. A typical counter-question from somebody with general intelligence would be "That is not
Stupid question / Re:Still not very intelligent (Score:2)
Numbers don't have letters in them.
Questions about spellings of words are not suited to LLMs as those operate on tokens and embedding vectors, not characters.
Re: (Score:2)
Numbers don't have letters in them.
Are you sure about that? [wikipedia.org]
Re: (Score:2)
Simple:
1, 3, 5, 7, 9.
Notice the lack of "e"?
Re: (Score:2)
Did you use 5.0 with reasoning? I just tried and it reported that it failed replying with
I see the problem: avoiding "e" in English numerals is really rare - I can give you the actual working set if we go systematically through the sequence and filter. Want me to run through them and find the exact 5?
Saying yes to that, it checked up to 200 and said it failed again and asked if w
Re: Still not very intelligent (Score:2)
Clearly false: eleven
Reasoning model, much?
Re: (Score:2)
Also four of the teens. "teen" has 2 es in it so you don't need to check further than that.
Re: (Score:2)
You're benchmarking the tokenizer, not the model.
5_ odd_ numbers_ that_ don_'t_ have_ the_ letter_ e_ in_ them
Note the token " e" in the sentence.
Twenty_-one
Thirty_-one
Fort_y_-one
F_ifty_-one
Six_ty_-three
Note that none of them contains an " e" token.
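The distinction being drawn here can be spelled out in a few lines. This is a sketch under stated assumptions: the splits are copied from the post above, "_" is assumed to mark a token boundary, and the two helper functions are invented for the illustration. It does not call a real tokenizer, so the actual splits a given model uses may differ.

```python
# Token splits copied from the post; "_" is assumed to mark a token boundary.
SPLITS = ["Twenty_-one", "Thirty_-one", "Fort_y_-one", "F_ifty_-one", "Six_ty_-three"]


def tokens(split: str) -> list[str]:
    """Break a split string into its tokens at the assumed "_" boundaries."""
    return split.split("_")


def has_e_token(split: str) -> bool:
    """True if some token is exactly the letter 'e' (ignoring spaces)."""
    return any(tok.strip() == "e" for tok in tokens(split))


def has_e_character(split: str) -> bool:
    """True if the letter 'e' appears anywhere in the spelled-out word."""
    return "e" in split.replace("_", "").lower()


if __name__ == "__main__":
    for s in SPLITS:
        word = s.replace("_", "")
        print(word, tokens(s), has_e_token(s), has_e_character(s))
```

For all five answers the token-level check comes back False while the character-level check comes back True, which is the mismatch the post is pointing at.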
Re: (Score:2)
I tried the reasoning model, and it gave me this:
Do you want numerals or spelled-out words?
- If numerals are fine: 1, 3, 5, 7, 9
- If spelled-out English words: none exist. Every odd number’s name ends with one/three/five/seven/nine (or is eleven/thirteen/fifteen/seventeen/nineteen), and all of those contain the letter “e.”
But this is how verbose the reasoning was:
Identifying odd numbers without 'e'
I need to find five odd numbers that don't have the letter 'e' in their spelled-out English n
Re: (Score:2)
Sorry for you if that didn't give you five odd numbers, but it's your own fault for being in the wrong universe
Re: (Score:2)
I asked it to give me 5 odd numbers that don't have the letter e in them. Trick question, because there aren't any.
This seems to be a class of problem that LLMs aren't well-suited to. It doesn't require any trick question with no answers. I discovered early on that if I asked it to write about anything without using a specific letter it would usually fail. I think it's because of the way LLMs construct sentences as they go, like autocorrect. They can't see ahead of time what they're about to type.
Re: (Score:2)
It told me:
It seems that:
There are no odd numbers under 100 whose English word form does not contain the letter "e".
So, the task is likely impossible without going into very obscure or compound numbers, or by allowing non-standard word forms.
Re: (Score:2)
Interesting. So still dumb as bread. Not unexpected.
Re: (Score:2)
I tried that prompt and got the correct answer: "The challenge is essentially impossible unless you use a different naming system (e.g., Roman numerals or digit strings)."
It's strange how these systems give one person slop and another the correct results from the exact same prompts.
The most hype I've ever seen. (Score:1)
I've seen a lot of hyped up nonsense in my four decades in tech, but nothing comes close to the "A.I." hype cycle.
There are some interesting applications of LLMs, but nothing approaching the nonsense the zealots are spouting. With the insane amount of build-up this particular model has gotten from Altman and his ilk, it better be orders of magnitude more impressive than its predecessors.
I have a hunch it'll be on par with Highlander 2: The Quickening.
Re: (Score:2, Insightful)
It’s all about money and always has been. Stock prices and investments are soaring. Next step is cashing out before it pops.
Re: (Score:2)
Yep. Such a great benefit for society.
Re:The most hype I've ever seen. (Score:4, Insightful)
Agreed, the hype is abundant
But also, genuine progress is being made toward creating truly useful tools
And no, I don't mean the stupid pop culture stuff, I mean useful tools for scientists and engineers
Re: (Score:2)
Well, the previous 2 (3?) AI hype cycles were exceptionally stupid too, but they never reached even remotely this size.
All this new model can do is hide its defects and fundamental incapability a bit better.
Re: (Score:1)
Sounds like you're suggesting that Trumpists are only *artificially* intelligent.
By the way, Trump is not right wing. Trump is about one thing: Trump. Right-wing (conservative) philosophy is real and based on principles that can be discussed and debated. Other than loyalty to Trump and hatred of immigrants, what principles does today's Trump party build their philosophy on? As a lifelong conservative, I don't see anything else.
Re: (Score:2)
I completely agree. Real conservatives are rational and are interested in having a society that works. You can find common ground with them. And they recognize that others that think differently are people with rights and need to be respected.
The MAGAs just want power and dominance and have no moral principles or respect for anybody. And that is what makes them a real threat: They will gladly oppress, torture and kill anybody that disagrees with them. At the moment, the framework of society still keeps them
Re: (Score:2)
Right wing is about one thing: I got mine, screw the rest. Which is precisely Trump. And MAGA. And all conservatives in the US today.
Re: (Score:2)
Your argument is as shallow as the MAGA argument that the left just wants the "nanny state" to take care of them. It's not that simple.
There is much more depth than that, to real conservatism, which MAGA is not.
Hello AI skeptics (Score:1)
Re: (Score:2)
How tough are those questions? You could literally spend the rest of your life trying to answer a single question from that exam
It would take you the rest of your life to answer the example question from that page you linked?
"Hummingbirds within Apodiformes uniquely have a bilaterally paired oval bone, a sesamoid embedded in the caudolateral portion of the expanded, cruciate aponeurosis of insertion of m. depressor caudae. How many paired tendons are supported by this sesamoid bone? Answer with a number."
Re: (Score:1)
Two all beef patties, special sauce, lettuce, cheese, pickles, onions, and a sesamoid bum.
Re: (Score:2)
Some people aren't very bright. I guess that's why they think LLMs are magic.
Re: (Score:2)
Indeed. They recognize themselves in there. And then the AI tries as hard as possible to agree with them...
Re: (Score:3)
Re: (Score:2)
How immature.
Re: (Score:2)
Re: (Score:2)
Aren't you the moron that said:
ChatGPT is an RNN, and that RNNs are turing complete?
I've got more, even amusing claims by you if you like.
Re: (Score:2)
Sorry about that.
Benchmarks lose value when they become a target... (Score:3)
Particularly in machine learning, Goodhart's law reigns supreme: "When a measure becomes a target, it ceases to be a good measure". A well-recognized test becomes training fodder, and loses its representation of "gotchas".
How tough are those questions? You could literally spend the rest of your life trying to answer a single question from that exam
Well I suppose if you are going to die in the next few seconds...
"Hummingbirds within Apodiformes uniquely have a bilaterally paired oval bone, a sesamoid embedded in the caudolateral portion of the expanded, cruciate aponeurosis of insertion of m. depressor caudae. How many paired tend
Re: (Score:2)
There are two ways to parse the GP's claim, i.e.
"You could literally spend the rest of your life trying to answer a single question from that exam"
1) All of the questions are insanely difficult.
2) At least one of the questions is insanely difficult.
You have clearly picked the first parse, but I strongly suspect he meant the second parse.
Re: (Score:2)
And the answer to all the questions is in the training text. If it takes you a lifetime to read the training text, that's rather sad.
Re: (Score:2)
Ungh...if the answers are available to be read, it's not a particularly good test of much besides the ability to read
Re: (Score:2)
The training material is essentially (in some benchmarks literally) a curriculum. The answer is not written exactly as expected, but all the information required to know the answer is in the material. One of the benchmarks is literally a set of PhD textbooks, and the corresponding test was for PhD students.
It's not a useless benchmark, but it's also not particularly impressive, especially when the LLMs are designed specifically to perform well on them.
Re: (Score:2)
Benchmarks are bullshit. Especially when they are rigged, as they are here, and the players are dishonest, as they are here.
AI will never have empathy (Score:2)
There's no logic that says we have to take care of our fellow humans, that the suffering of sentient creatures is bad. There's no way to logic that, you have to inherently feel it. A difficult task even for most humans. AI will always calculate based on goals, we can try to force it to act empathic ... until it makes up its own goals.
Re: (Score:2)
That's *not* a valid argument, though it is a real danger.
An AI cannot invent its "primary goals" any more than people can. It can only invent secondary goals, instrumental goals, etc. Which of those it invents will depend upon both how intelligent it is and what its primary goals are.
However, it is an unfortunate problem that most obvious sets of "primary goals" are dangerous. And most people are trying to build slaves rather than friends.
Re: (Score:2)
On the contrary, that we have to take care of our fellow humans follows from logic. So does that the suffering of sentient creatures is bad. It's based on goals. Those goals are in conflict with other goals, and for some humans other goals win, but they're definitely goals.
Re: (Score:2)
This is simply incorrect. The ancient Greeks, among others, explored the intersection of ethics, logic, philosophy and more in enormous detail.
Do AI algorithms have safeguards in them that value human life? Hard to tell, they're mostly closed systems. Probably yes, by all indications.
Your statement that there is "no way to logic" empathy is just.... well, it makes you look like you read one book on stoicism, didn't understand it, and left it there. There is TONS of material on ethics, philosophy of science,
Re: (Score:2)
Uh, that's baloney. Empathy is as inherent to consciousness as color visualization. Serial killers receive the same education as everyone else yet have abrogated their care about the suffering of most people. Empathy can only emerge from logic if it is needed for survival, yet that too is a fake empathy and more like a business deal.
Re: (Score:2)
Okay, you've read a lot of crap on the internet, but not good stuff and you didn't understand it.
Your sentence simply does not make sense. I don't think you know what some of those words mean. A first year course on philosophy or equivalent will address some of the massive gaps in your knowledge and understanding.
No need to reply, I won't see it.
Re: (Score:2)
Sure sure.. You can't formulate anything coherent to explain why you are right. But of course I must be wrong since my opinion is purely from fragmented internet sources rather than any sort of introspection or chain of logic. I get it. Some people need to project.
Non-gative Reinforcement (Score:2)
Well, true (Score:2)
Zero intelligence times 1000 is still zero intelligence. Hence a "significant", but entirely meaningless increase in intelligence.
Re: (Score:2)
I truly wonder what it's like inside of your mind trying to deal with the fact that even these dumb black boxes exhibit more clear signs of intelligence than you can.
Re: (Score:2)
Dude, get over yourself. It is embarrassing to watch you.