Google AI

Google Gemini 1.5 Pro Leaps Ahead In AI Race, Challenging GPT-4o (venturebeat.com) 11

An anonymous reader quotes a report from VentureBeat: Google launched its latest artificial intelligence powerhouse, Gemini 1.5 Pro, today, making the experimental "version 0801" available for early testing and feedback through Google AI Studio and the Gemini API. This release marks a major leap forward in the company's AI capabilities and has already sent shockwaves through the tech community. The new model has quickly claimed the top spot on the prestigious LMSYS Chatbot Arena leaderboard (built with Gradio), boasting an impressive ELO score of 1300.

This achievement puts Gemini 1.5 Pro ahead of formidable competitors like OpenAI's GPT-4o (ELO: 1286) and Anthropic's Claude-3.5 Sonnet (ELO: 1271), potentially signaling a shift in the AI landscape. Simon Tokumine, a key figure in the Gemini team, celebrated the release in a post on X.com, describing it as "the strongest, most intelligent Gemini we've ever made." Early user feedback supports this claim, with one Redditor calling the model "insanely good" and expressing hope that its capabilities won't be scaled back.
"A standout feature of the 1.5 series is its expansive context window of up to two million tokens, far surpassing many competing models," adds VentureBeat. "This allows Gemini 1.5 Pro to process and reason about vast amounts of information, including lengthy documents, extensive code bases, and extended audio or video content."
  • prestigious? (Score:2, Insightful)

    by itamblyn ( 867415 )
    Since when are LLM leaderboards prestigious? And is anyone surprised there are new models coming out with larger context lengths? I wouldn't call that a shockwave through the tech community...
    • by Junta ( 36770 )

      Yeah, qualitative experience hasn't tracked the scale of the quantitative measures being bragged about for a while...

    • Re:prestigious? (Score:5, Insightful)

      by Rei ( 128717 ) on Friday August 02, 2024 @09:05AM (#64675260) Homepage

      LMSYS is different from most leaderboards [lmsys.org]. It's not some fixed open set of questions; rather, it's A/B testing by humans, who manually rate which answer they think is better.
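      As a rough illustration of how those pairwise votes become a leaderboard number, an Elo-style update from a single A/B vote can be sketched as below; the K-factor and starting ratings are illustrative assumptions, not LMSYS's actual parameters.

      # Rough sketch of an Elo-style rating update from one pairwise human vote (A vs. B).
      # K and the starting ratings are illustrative assumptions, not LMSYS's real parameters.
      def elo_update(rating_a, rating_b, a_won, k=32):
          expected_a = 1 / (1 + 10 ** ((rating_b - rating_a) / 400))
          score_a = 1.0 if a_won else 0.0
          delta = k * (score_a - expected_a)
          return rating_a + delta, rating_b - delta

      # Example: a 1300-rated model beats a 1286-rated one in one human comparison.
      new_a, new_b = elo_update(1300, 1286, a_won=True)
      print(round(new_a, 1), round(new_b, 1))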

      A naive transformer implementation suffers from O(N^2) scaling with respect to context length, whereas tricks like RoPE scaling used to extend the window reduce quality. So achieving large contexts at high quality is an achievement. Some alternative architectures, like Mamba, don't suffer from O(N^2) scaling, but they aren't as mature.
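      To make the scaling point concrete, the toy sketch below builds the naive attention score matrix, which has one entry per (query, key) pair and therefore grows quadratically with token count; the dimensions are arbitrary illustrative values.

      # Toy illustration of why naive self-attention is O(N^2) in context length:
      # the score matrix has one entry for every (query, key) pair of tokens.
      import numpy as np

      def attention_scores(n_tokens, d_model=64, seed=0):
          rng = np.random.default_rng(seed)
          q = rng.standard_normal((n_tokens, d_model))
          k = rng.standard_normal((n_tokens, d_model))
          return q @ k.T / np.sqrt(d_model)  # shape: (n_tokens, n_tokens)

      for n in (512, 4096):
          print(n, attention_scores(n).shape)  # 8x the tokens -> 64x the score entries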

    • Yeah, the tone in this "article" (built with Gradio) is not right.

  • Thought so. That score is really meaningless.

  • They may have something better than ChatGPT 4o, but the free version was horrible compared to GPT-3.5, so I stopped trying to use it.
    I don't feel an incentive to pay to try their 'better' product.

    At this point Google would have to pay me to use their shoddy software. I'm banking on Gemini to end up in the Google graveyard.
    • They may have something better than ChatGPT 4o, but the free version was horrible compared to GPT-3.5, so I stopped trying to use it.

      I don't feel an incentive to pay to try their 'better' product.

      My experience mirrors yours. Gemini seems primarily focused on manipulating Google Docs files, FWIW. Maybe it was trained on 'anonymous' Google Docs user data. Made anonymous the same way Gmail is anonymous, yet still indexed and scored 'for the user's overall experience within Google'.

      Also, for what it's worth, I have great success 'coding by prompt' when writing Drupal Form API code, presumably because claude.ai and ChatGPT 4o were trained on open-source GitHub/GitLab examples.

      And since my code is also open-source...

      • Yes, I have been using ChatGPT for coding simple stuff like PowerShell and Python scripts, and I appreciated its ability to clean up and help me improve code.
        I tried to do the same with Gemini, and when it hit a barrier, I put the code over to ChatGPT, which was able to figure out what eluded Gemini. Not good when one AI has to be used to correct another one.
  • /. is a cesspool of corporate promotion, but reality is a bitch: I use Gemini 1.5 Pro daily and it's about half as useful as those two others. It hallucinates constantly, which the others don't anymore, and the code is nonsense.
  • Ugh, world leaders have made no progress on limiting energy use. LLMs account for a growing and significant share of the world's energy usage, and we still can't coordinate to keep individual organizations from using as much power as they like. The cost of electricity (and other fuels) simply does not reflect the growing harm. The article next to this one is about a 10 degree C rise in Arctic temperature. That's huge! It's not like I don't want summers to be less sweltering, but this will wreak havoc...

"I got everybody to pay up front...then I blew up their planet." "Now why didn't I think of that?" -- Post Bros. Comics

Working...