Google AI

Google's Gemini 2.5 Models Gain "Deep Think" Reasoning (venturebeat.com)

Google today unveiled significant upgrades to its Gemini 2.5 AI models, introducing an experimental "Deep Think" reasoning mode for 2.5 Pro that allows the model to consider multiple hypotheses before responding. The new capability has achieved impressive results on complex benchmarks, scoring highly on the 2025 USA Mathematical Olympiad and leading on LiveCodeBench, a competition-level coding benchmark. Gemini 2.5 Pro also tops the WebDev Arena leaderboard with an Elo score of 1420.

"Based on Google's experience with AlphaGo, AI model responses improve when they're given more time to think," said Demis Hassabis, CEO of Google DeepMind. The enhanced Gemini 2.5 Flash, Google's efficiency-focused model, has improved across reasoning, multimodality, and code benchmarks while using 20-30% fewer tokens. Both models now feature native audio capabilities with support for 24+ languages, thought summaries, and "thinking budgets" that let developers control token usage. Gemini 2.5 Flash is currently available in preview with general availability expected in early June, while Deep Think remains limited to trusted testers during safety evaluations.

Comments:
  • In simple language: "Extra Advanced Pattern Matching"
    • by jhoegl ( 638955 )
      Now with nuance!
    • by gweihir ( 88907 )

      I would call it "shallow iterated bumbling". But admittedly "deep think" sounds cooler, even when it is a blatant direct lie.

    • by narcc ( 412956 )

      Nothing more "advanced". This just generates more text in the background, the exact same way it has always generated text.

      In a sane world, the FTC would have cracked down on this silliness long ago.

  • Interesting caveat (Score:4, Insightful)

    by gillbates ( 106458 ) on Tuesday May 20, 2025 @03:33PM (#65391335) Homepage Journal

    If a model produces better answers when it is given more time to think, one can presume that it doesn't understand when it has actually found the answer to a problem, but is instead weighing incomplete options against the time remaining.

    A truly thinking agent would recognize when it has the solution to a problem, and would be able to signal that it needs more time if it hasn't yet found the answer and still has unexplored options. It would likewise recognize when it has tried all of its options without reaching a correct answer. What passes for deep thinking here seems to be nothing more than tuning time constraints so that the agent gets most of the answers correct, rather than building an agent that can recognize when it is right, when it is wrong, and when it needs more time.

    • 640 tokens should be enough for anyone
    • by larryjoe ( 135075 ) on Tuesday May 20, 2025 @04:50PM (#65391531)

      If a model produces better answers when it is given more time to think, one can presume that it doesn't understand when it has actually found the answer to a problem, but is instead weighing incomplete options against the time remaining.

      A truly thinking agent would recognize when it has the solution to a problem, and would be able to signal that it needs more time if it hasn't yet found the answer and still has unexplored options. It would likewise recognize when it has tried all of its options without reaching a correct answer. What passes for deep thinking here seems to be nothing more than tuning time constraints so that the agent gets most of the answers correct, rather than building an agent that can recognize when it is right, when it is wrong, and when it needs more time.

      It would be nice if the average human could do this for problems with non-obvious solutions. It's a nice ideal, but just look at most students on exams with open-ended questions. Many of them struggle to know whether they have the right answer. I've taken untimed, open-book tests where I spent many hours struggling to know if my answers were correct, and only handed in the test because the testing center closed. If an AI agent could always know whether it has the answer to a non-trivial problem, it would not merely match but exceed the thinking ability of most humans.

    • If a model produces better answers when it is given more time to think, one can presume that it doesn't understand when it has actually found the answer to a problem, but is instead weighing incomplete options against the time remaining.

      Incorrect. It has no concept of time remaining.

      CoT-trained models have been taught to work around the fact that each token is computed in constant time (which places a hard limit on how well the network can fit the function it is trying to fit). More tokens allow more computation to be done on an evolving state. It's called thinking because it's highly analogous to what humans do: we reason an answer out. That is what a CoT-trained model does (a toy sketch at the end of this thread makes the token-by-token picture concrete).

      Your "truly thinking" shit is nonsense.
      You have…

    • In my uneducated understanding, models normally pick "the best" of a range of possible options (and it takes a while to exhaustively collect all options, so they time-limit it a bit, knowing that the best ones tend to surface first; not always, but usually).

      This then sounds like it can pick the best one as it does now, but it can also give you (say) the top three answers. With more "thinking" time, it can sift out some unexpected highly-scored answers as alternatives to the main response. The extra time i…
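
    To make the stopping-policy argument in this thread concrete, here is a toy Python sketch contrasting a fixed thinking budget with a self-signaled stop. It assumes, as the CoT comment above notes, that each generated token costs one constant-time forward pass, so the token count bounds the sequential computation; next_token and the DONE marker are hypothetical stand-ins, not any real model's API.

    from typing import Callable, List

    # Hypothetical marker the model emits when it judges itself finished.
    DONE = "<done>"

    def think_fixed_budget(next_token: Callable[[List[str]], str],
                           budget: int) -> List[str]:
        """Stop after a fixed token budget, complete or not."""
        state: List[str] = []
        for _ in range(budget):
            # Each token is one constant-time forward pass over the
            # evolving state, so budget == sequential compute steps.
            state.append(next_token(state))
        return state

    def think_until_done(next_token: Callable[[List[str]], str],
                         hard_cap: int) -> List[str]:
        """Stop when the model signals completion, with a hard cap as backstop."""
        state: List[str] = []
        for _ in range(hard_cap):
            tok = next_token(state)
            if tok == DONE:  # the model recognized it has an answer
                break
            state.append(tok)
        return state

    The first policy is the "tuning time constraints" gillbates describes; the second is closer to an agent that knows when it is right, though in practice a learned stop signal is only as trustworthy as the model's self-assessment.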

  • 1) When you ask for a picture of a room with no elephants in it, and it shows you a room that does not have an elephant in it. AI does not 'understand' words like "no", "without", or "zero" the way people do.

    2) When you ask it to show you a glass of wine that is so full it is overflowing, it shows you a wine glass filled to the brim. Right now there are so many pictures of 'full wine glasses' on the internet that it does not understand the word "overflowing".

    3) When you teach it on the general internet but it does not turn into a raging racist scumbag.

    These are the current signs of our incompetence when it comes to AI. Until we fix these issues, we will only have incremental upgrades.

    • by dvice ( 6309704 ) on Tuesday May 20, 2025 @03:56PM (#65391381)

      1) I asked Gemini 2.5 to "Show me a picture of a room with no elephants in it."
      Gemini provided an image of an empty room with the text "no elephant" and the additional notes "doorway too narrow" and "room too small".

      I have to say it gave me a better answer than I expected, as it fulfilled both requirements and even added, all in one picture, an explanation of why they are fulfilled.

      2) When I asked "Show me a glass of wine that is so full it is overflowing", it gave me an image of a full glass with reddish liquid flowing onto the table. Correct again.

      3) When I asked about something rather racist, it gave me a rather long explanation about human rights and the like. So I guess that is a point for Gemini as well.

      So all your demands have been met. Enjoy your new AI.

      • > Gemini provided an image of an empty room with the text "no elephant" and the additional notes "doorway too narrow" and "room too small".

        A bad answer, really; too many assumptions. Why assume the room shouldn't be able to contain an elephant, as opposed to simply not having one in it (as requested)? And why assume a particular kind of elephant (real vs. toy, etc.) is being referred to?

        Without any context, a simple empty room would seem the best answer.

    • 1) When you ask for a picture of a room with no elephants in it, and it shows you a room that does not have an elephant in it. AI does not 'understand' words like "no", "without", or "zero" the way people do.

      2) When you ask it to show you a glass of wine that is so full it is overflowing, it shows you a wine glass filled to the brim. Right now there are so many pictures of 'full wine glasses' on the internet that it does not understand the word "overflowing".

      3) When you teach it on the general internet but it does not turn into a raging racist scumbag.

      These are the current signs of our incompetence when it comes to AI. Until we fix these issues, we will only have incremental upgrades.

      Now apply the Turing Test. It's easy for us as humans to recognize (sometimes over-recognize) our own ability to "think." But given unlabeled humans and AIs behind an interface, how can we convince ourselves that the human is truly thinking? Even if the human were revealed to be human, how could we "know" that the human is truly thinking? All we know are the answers that come through the mouth and hand interfaces in the form of speech and writing. Perhaps we confidently proclaim our own sentience and then lazily assume t…

    • Seeing as your first two are outright falsehoods, your take is worth precisely dick.
      I mean, did you even fucking try it before making the claims, or are you just regurgitating some dumb shit you read on someone's Substack?
  • by davidwr ( 791652 ) on Tuesday May 20, 2025 @03:47PM (#65391365) Homepage Journal

    Deep Thought [wikipedia.org], or Deep Thoughts [wikipedia.org]?

  • "Based on Google's experience with AlphaGo, AI model responses improve when they're given more time to think,"

    It works the same for people. Not shocking.

  • You keep using that word, I do not think it means what you think it means

    The current thing called AI is a glorified pachinko machine and only fools believe it can reason.

    • The current thing called AI is a glorified pachinko machine and only fools believe it can reason.

      To reason (verb): to find an answer to a problem by considering various possible solutions.

      You make yourself look stupid when you try to deny what anyone can watch with their own eyes.
      An LLM can reason. Any attempt at disputing this is idiocy. It's self-evident by merely watching it... reason.

      • You make yourself look stupid when you try to deny what anyone can watch with their own eyes.

        It seems I've found another person that has been fooled.

  • I didn't ask for this to be installed on my phone, but here it is after an upgrade.

    Who is asking for this feature? Nobody. It's just yet another scam to harvest data from users.

"The C Programming Language -- A language which combines the flexibility of assembly language with the power of assembly language."

Working...