OpenAI Releases GPT-5 (openai.com) 92

OpenAI released GPT-5 on Thursday, ending a two-year development cycle that CEO Sam Altman called a "significant leap in intelligence" over previous models. The updated AI system achieved state-of-the-art performance across multiple benchmarks, scoring 94.6% on AIME 2025 mathematics problems and 74.9% on SWE-bench Verified coding tasks.

The model operates as a unified system combining a standard response mode with deeper reasoning capabilities that activate automatically based on query complexity. OpenAI reduced hallucinations by approximately 45% compared to GPT-4o and 80% compared to its previous reasoning model when using extended thinking modes. GPT-5 becomes available immediately to all ChatGPT users at no cost, with paid subscribers receiving higher usage limits and access to GPT-5 pro for more complex reasoning tasks.

  • by Anonymous Coward

    So you're GPT-5 now? You think you're hot stuff?

    ChatGPT said:
    Haha, not quite! I'm based on GPT-4o, which is part of the GPT-4 family—“o” stands for omni, meaning I can handle text, vision, and audio all in one model. So while I’m not GPT-5, I am a newer, more capable version of GPT-4.

    Whether I’m “hot stuff” depends on how well I can help you—so... what are we doing today?

    You said:
    Oh, well -- I clicked a link that was supposedly to the new GPT-5.

    ChatGPT said:

    Y

    • by Guignol ( 159087 )
      Surely, Chat-GPT-6 will come with text, vision, audio, and GNU's core-utils whoami command, all in one model
  • by jonbryce ( 703250 ) on Thursday August 07, 2025 @01:50PM (#65573210) Homepage

    I asked it to give me 5 odd numbers that don't have the letter e in them. Trick question, because there aren't any.

    It came back with:

    Twenty-one
    Thirty-one
    Forty-one
    Fifty-one
    Sixty-three

    • by Roger W Moore ( 538166 ) on Thursday August 07, 2025 @02:19PM (#65573256) Journal
      There are odd numbers which don't have the letter 'e' in them, for example: trois, cinq or fünf
      • by backslashdot ( 95548 ) on Thursday August 07, 2025 @02:28PM (#65573290)

        Those words mean nothing in English. Failure to understand context and purpose is not lateral thinking, it's insanity.

        • The purpose was to find an odd number that could be written without an 'e'. The context implied but did not specify that the language was English. Failure to question implied assumptions is a result of a lack of lateral thinking, not insanity.
          • Right, so when someone speaks to you in English you feel ok replying back in Swahili?

            • Of course not, but if someone asks me a trick question that literally is not answerable in English, yes, I almost certainly would utilize another language.

              Do you even fucking engage your brain before you argue?
            • Right, so when someone speaks to you in English you feel ok replying back in Swahili?

              If the context is they are asking me a puzzle question that I need another language to answer, then yes, I'm ok replying to that in another language. Indeed, in your own words, failing to understand that context and so not being able to answer would be "insanity", although I'd prefer the milder "showing a lack of lateral thinking".

        • When a question is purposefully unsolvable then any implied context should be thrown out during lateral thinking. English wasn't specified so switching language solves the problem in the only way possible. That's not insane, that's clever. I'm sorry you are upset you didn't think of this yourself.

            If a problem is unsolvable, declaring it so is the most efficient route. Coming up with an impractical "solution" makes no sense. The solution space outside the context is infinite. For example, you could declare that the question itself means something different than what was asked. Perhaps by numbers he meant orangutans. It wastes resources and gets you caught in loops, spinning your wheels. But hey, whatever. You do you.

            • If a problem is unsolvable, declaring it so is the most efficient route.

              Every LLM I tested declares that it's unsolvable in English.

              Also, you really are just being stupid about this.

              Imagine the LLM is a person.
              You say, "give me 5 numbers that don't have 'e' in them".
              Are you really about to argue that the percentage of people who, recognizing this is impossible, respond with another language or Roman numerals is zero?
              Are you surrounded by stupid people?

            • If a problem is unsolvable, declaring it so is the most efficient route.

              Except the problem was only unsolvable if you follow some implied unwritten instruction. Lateral thinking is literally the point where you ignore this kind of unwritten context to solve the problem.

              By definition this is lateral thinking.

        • by gweihir ( 88907 )

          The actual thing this is called is "thinking outside of the box". General intelligence can do that. It will then deliver the results with an explanation. That you think this is insanity just means you lack that intelligence or are not using it.

          • From GPT-OSS 120b (MLX, 6b quant)

            Each of these representations contains only the letters I, V, and X – none of which is “e”.

            (If you spell out the numbers in English words, every odd number from 1 upward includes an “e” (one, three, five, seven, nine, eleven, ...). So the trick is to avoid the word form altogether.)

            So, LLMs can think outside of the box, and have general intelligence?
            • by gweihir ( 88907 )

              So, LLMs can think outside of the box, and have general intelligence?

              No. They can have results from somebody that had general intelligence in their training data for the specific question asked. And since this is apparently a popular riddle, that is exceptionally likely to be the case here.

              But you are just playing dumb again. Stop it.

              • No. They can have results from somebody that had general intelligence in their training data for the specific question asked.

                This is a pointless distinction.
                Of course any intelligence they have comes from their training data. However, generalization is a quantified effect of pretraining.
                If you show an LLM enough riddles, it will learn to generally solve that class of riddles.
                Beyond that, if you show an LLM enough classes of riddles, its performance in unseen classes of riddles will improve.
                This is just math. No magic required, though I assume you'll accuse me of magical thinking.

                But you are just playing dumb again. Stop it.

                I'm not playing anything.
                I'm shining a light o

        • "Each of these representations contains only the letters I, V, and X – none of which is “e”.

          (If you spell out the numbers in English words, every odd number from 1 upward includes an “e” (one, three, five, seven, nine, eleven, ...). So the trick is to avoid the word form altogether.)"

          I love it when LLMs demonstrate superior intelligence than cocky people.

          It was given a task that was impossible in what you consider to be the implied and uncommunicated context, and you're all s
      • by HiThere ( 15173 )

        Also 3 and 5.

        • Single digits:
          One, three, five, seven, and nine all have at least one e in them

          10 + single digit
          Eleven and the teens have at least two e's in them

          Everything else is a compound word containing one of the above.
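The exhaustive argument above can be sketched in a few lines of Python. The number-speller below is a toy covering only 1–99 (enough for the compound-word argument), not a full numerals library:

```python
# Spell out every odd number below 100 and look for names lacking an 'e'.
ONES = ["", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty",
        "seventy", "eighty", "ninety"]

def spell(n: int) -> str:
    """English name of n for 1 <= n <= 99."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] + ("-" + ONES[ones] if ones else "")

odd_without_e = [n for n in range(1, 100, 2) if "e" not in spell(n)]
print(odd_without_e)  # [] -- every odd number's name contains an 'e'
```

Every odd number ends in a digit whose word (one, three, five, seven, nine) contains an 'e', and the teens all contain 'e' too, so the list comes back empty.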

      • by gweihir ( 88907 )

        And you can deliver that answer because you have some actual insight. That the LLM completely failed (while it certainly had the respective data), just shows one more time that there is no insight in LLMs.

      • by Guignol ( 159087 )
        Aww, I was going for almost exactly the same joke, but Roger W Moore should have done better!
        1, 3, 5, 7, ...
        Anyone still complaining about failure to understand the English context? :)
    • by EvilSS ( 557649 ) on Thursday August 07, 2025 @02:30PM (#65573298)
      Wow, how did you get early access? It just unlocked on the platform about 20 minutes after your post and still isn't on the chatgpt site yet.
      • It's quite suspicious, given that multiple other, older LLMs have no trouble with this question and OP provides no evidence, not even a convincing copy-paste. What AI these days just spits out a list of numbers without surrounding text explaining its thinking?
        And yet OP is rated Informative. That really tells us a lot about the user bias versus critical thinking happening here.

        • It did give its usual overly verbose screed about how it got to that answer.
          Basically it ran through a load of odd numbers to see if they had e's in them. All of them did, but it thought some didn't.

    • It's good to find use cases that still fail, that can help drive research, if it's something that actually matters.

      But meanwhile people who find things it does well enough to improve the goods and services they offer - using AI as it exists, not an idealization in their minds - will be the ones reaping the rewards. Usage statistics will continue to go up, just as /. will continue to be dominated by people saying that could never happen.

    • Gemini 2.5 nails this easily

      >> give me 5 odd numbers that don't have the letter e in them

      This is a classic riddle! The trick is that **no odd number** in English has a name that doesn't contain the letter "e".

      ### Here's Why

      Every odd number must end in an odd digit (1, 3, 5, 7, or 9). Let's look at how we spell those digits:

      * on**e**
      * thr**ee**
      * fiv**e**
      * s**e**v**e**n
      * nin**e**

      Since all of them contain the letter "e", any odd number you can think of will also have an "e" in its name (for example, "thi

      • Correct answer, wrong workings/methodology.
        You also have to consider "eleven" and the teens; some odd numbers use those rather than the single-digit words it considered.
        But they also have e's in them, so the eventual answer is correct.

      • by gweihir ( 88907 )

        That is about the maximum automated "reasoning" can deliver. Nice, of some limited use, but not intelligent.

        General intelligence would deliver that and _then_ suggest ways to still make it work by adjusting the question, for example by dropping the language limit. That approach is exceptionally useful in reality, as border conditions in questions routinely turn out to be soft or optional, or can be replaced with others. A typical counter-question from somebody with general intelligence would be "That is not

    • Numbers don't have letters in them.
      Questions about the spelling of words are not suited to LLMs, as those operate on tokens and embedding vectors, not characters.

    • Simple:

      1, 3, 5, 7, 9.

      Notice the lack of "e"?

    • I asked it to give me 5 odd numbers that don't have the letter e in them. Trick question, because there aren't any.

      Did you use 5.0 with reasoning? I just tried and it reported that it failed replying with

      I see the problem: avoiding "e" in English numerals is really rare -- I can give you the actual working set if we go systematically through the sequence and filter. Want me to run through them and find the exact 5?

      Saying yes to that, it checked up to 200 and said it failed again and asked if w

    • by allo ( 1728082 )

      You're benchmarking the tokenizer, not the model.

      5_ odd_ numbers_ that_ don_'t_ have_ the_ letter_ e_ in_ them

      Note the token " e" in the sentence.

      Twenty_-one
      Thirty_-one
      Fort_y_-one
      F_ifty_-one
      Six_ty_-three

      Note that none of them contains an " e" token.

    • by SendBot ( 29932 )

      I tried the reasoning model, and it gave me this:

      Do you want numerals or spelled-out words?
      - If numerals are fine: 1, 3, 5, 7, 9
      - If spelled-out English words: none exist. Every odd number’s name ends with one/three/five/seven/nine (or is eleven/thirteen/fifteen/seventeen/nineteen), and all of those contain the letter “e.”

      But this is how verbose the reasoning was:

      Identifying odd numbers without 'e'

      I need to find five odd numbers that don't have the letter 'e' in their spelled-out English n

      • by Guignol ( 159087 )
        Random, Random, Random, Random, Random
        Sorry for you if that didn't give you five odd numbers, but it's your own fault for being in the wrong universe
    • I asked it to give me 5 odd numbers that don't have the letter e in them. Trick question, because there aren't any.

      This seems to be a class of problem that LLMs aren't well-suited to. It doesn't even require a trick question with no answer: I discovered early on that if I asked one to write about anything without using a specific letter, it would usually fail. I think it's because of the way LLMs construct sentences as they go, like autocorrect. They can't see ahead of time what they're about to type.
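For contrast, a character-level filter makes the "avoid a letter" constraint trivial; this toy sketch (hypothetical word list, not any model's actual decoding loop) just rejects candidate words up front, which is exactly the view an autoregressive token model lacks:

```python
# Toy lipogram filter: keep only candidate words lacking a banned letter.
# A token-based LLM can't apply this directly, since it never sees characters.
def avoid_letter(words, banned="e"):
    return [w for w in words if banned not in w.lower()]

candidates = ["this", "sentence", "shows", "words", "lacking", "the", "glyph"]
print(avoid_letter(candidates))  # ['this', 'shows', 'words', 'lacking', 'glyph']
```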

    • I think you're probably a liar.

      It told me:

      It seems that:

      There are no odd numbers under 100 whose English word form does not contain the letter "e".

      So, the task is likely impossible without going into very obscure or compound numbers, or by allowing non-standard word forms.

    • by gweihir ( 88907 )

      Interesting. So still dumb as bread. Not unexpected.

    • I tried that prompt and got the correct answer: "The challenge is essentially impossible unless you use a different naming system (e.g., Roman numerals or digit strings)."

      It's strange how these systems give one person slop and another the correct results from the exact same prompts.

  • I've seen a lot of hyped up nonsense in my four decades in tech, but nothing comes close to the "A.I." hype cycle.

    There are some interesting applications of LLMs, but nothing approaching the nonsense the zealots are spouting. With the insane amount of build-up this particular model has gotten from Altman and his ilk, it better be orders of magnitude more impressive than its predecessors.

    I have a hunch it'll be on par with Highlander 2: The Quickening.

  • Please go to TFA and you will find benchmark after benchmark where the model improves significantly over the previous model, including 42% on Humanity's Last Exam (notably less than Grok, though Grok did use 4 models in parallel). Look at how that benchmark was designed to trip up the models:

    The questions were crowdsourced from subject matter experts from various institutions across the world.[3][4] The questions were first filtered by the leading AI models; if the models failed to answer the question or di

    • How tough are those questions? You could literally spend the rest of your life trying to answer a single question from that exam

      It would take you the rest of your life to answer the example question from that page you linked?

      "Hummingbirds within Apodiformes uniquely have a bilaterally paired oval bone, a sesamoid embedded in the caudolateral portion of the expanded, cruciate aponeurosis of insertion of m. depressor caudae. How many paired tendons are supported by this sesamoid bone? Answer with a number."

    • Particularly in machine learning, Goodhart's law reigns supreme: "When a measure becomes a target, it ceases to be a good measure". A well-recognized test becomes training fodder and loses its representation of "gotchas".

      How tough are those questions? You could literally spend the rest of your life trying to answer a single question from that exam

      Well I suppose if you are going to die in the next few seconds...

      "Hummingbirds within Apodiformes uniquely have a bilaterally paired oval bone, a sesamoid embedded in the caudolateral portion of the expanded, cruciate aponeurosis of insertion of m. depressor caudae. How many paired tend

      • by HiThere ( 15173 )

        There's two ways to parse the GP's claim, i.e.
        "You could literally spend the rest of your life trying to answer a single question from that exam"
        1) All of the questions are insanely difficult.
        2) At least one of the questions is insanely difficult.
        You have clearly picked the first parse, but I strongly suspect he meant the second parse.

        • And the answer to all the questions is in the training text. If it takes you a lifetime to read the training text, that's rather sad.

          • by HiThere ( 15173 )

            Ungh...if the answers are available to be read, it's not a particularly good test of much besides the ability to read

            • The training material is essentially (in some benchmarks, literally) a curriculum. The answer is not written exactly as expected, but all the information required to know the answer is in the material. One of the benchmarks is literally a set of PhD textbooks, and the corresponding test was for PhD students.

              It's not a useless benchmark, but it's also not particularly impressive, especially when the LLMs get designed specifically to perform well on them.

    • by gweihir ( 88907 )

      Benchmarks are bullshit. Especially when they are rigged, as they are here, and the players are dishonest, as they are here.

  • There's no logic that says we have to take care of our fellow humans, that the suffering of sentient creatures is bad. There's no way to logic that, you have to inherently feel it. A difficult task even for most humans. AI will always calculate based on goals, we can try to force it to act empathic ... until it makes up its own goals.

    • by HiThere ( 15173 )

      That's *not* a valid argument, though it is a real danger.

      An AI cannot invent its "primary goals" any more than people can. It can only invent secondary goals, instrumental goals, etc. Which of those it invents will depend upon both how intelligent it is and what its primary goals are.

      However, it is an unfortunate problem that most obvious sets of "primary goals" are dangerous. And most people are trying to build slaves rather than friends.

    • On the contrary, that we have to take care of our fellow humans follows from logic. So does that the suffering of sentient creatures is bad. It's based on goals. Those goals are in conflict with other goals, and for some humans other goals win, but they're definitely goals.

    • by Fross ( 83754 )

      This is simply incorrect. The ancient Greeks, among others, explored the intersection of ethics, logic, philosophy and more in enormous detail.

      Do AI algorithms have safeguards in them that value human life? Hard to tell, they're mostly closed systems. Probably yes, by all indications.

      Your statement that there is "no way to logic" empathy is just.... well, it makes you look like you read one book on stoicism, didn't understand it, and left it there. There is TONS of material on ethics, philosophy of science,

      • Uh, that's baloney. Empathy is as inherent to consciousness as color visualization. Serial killers receive the same education as everyone else yet have abrogated their care about the suffering of most people. Empathy can only emerge from logic if it is needed for survival, yet that too is a fake empathy and more like a business deal.

        • by Fross ( 83754 )

          Okay, you've read a lot of crap on the internet, but not good stuff and you didn't understand it.

          Your sentence simply does not make sense. I don't think you know what some of those words mean. A first year course on philosophy or equivalent will address some of the massive gaps in your knowledge and understanding.

          No need to reply, I won't see it.

          • Sure sure.. You can't formulate anything coherent to explain why you are right. But of course I must be wrong since my opinion is purely from fragmented internet sources rather than any sort of introspection or chain of logic. I get it. Some people need to project.

  • GoldenGate, WaterGate, PizzaGate, CaPiTOL_RioT_Gate__RepeatLastPrompt_AsIfPDiddyWereEpstein_AndTrumpBidenInnocent
  • Zero intelligence times 1000 is still zero intelligence. Hence a "significant", but entirely meaningless increase in intelligence.

    • Fuck, you're so clever.
      I truly wonder what it's like inside of your mind trying to deal with the fact that even these dumb black boxes exhibit more clear signs of intelligence than you can.
