AI Google Microsoft

OpenAI's Lead Over Other AI Companies Has Largely Vanished, 'State of AI' Report Finds (yahoo.com) 35

An anonymous reader shares a report: Every year for the past seven, Nathan Benaich, the founder and solo general partner at the early-stage AI investment firm Air Street Capital, has produced a magisterial "State of AI" report. Benaich and his collaborators marshal an impressive array of data to provide a great snapshot of the technology's evolving capabilities, the landscape of companies developing it, a survey of how AI is being deployed, and a critical examination of the challenges still facing the field.

One of the big takeaways from this year's report, which was published late last week, is that OpenAI's lead over other AI labs has largely eroded. Anthropic's Claude 3.5 Sonnet, Google's Gemini 1.5, X's Grok 2, and even Meta's open-source Llama 3.1 405B model have equaled, or narrowly surpassed on some benchmarks, OpenAI's GPT-4o. But, on the other hand, OpenAI still retains an edge for the moment on reasoning tasks with the release of its o1 "Strawberry" model -- which Air Street's report rightly characterized as a weird mix of incredibly strong logical abilities for some tasks, and surprisingly weak ones for others.

Another big takeaway, Benaich told me, is the extent to which the cost of using a trained AI model -- an activity known as "inference" -- is falling rapidly. There are several reasons for this. One is linked to that first big takeaway: With models less differentiated from one another on capabilities and performance, companies are forced to compete on price. Another reason is that engineers for companies such as OpenAI and Anthropic -- and their hyperscaler partners Microsoft and AWS, respectively -- are discovering ways to optimize how the largest models run on big GPU clusters. The cost of outputs from OpenAI's GPT-4o today is 100-times less per token (which is about equivalent to 1.5 words) than it was for GPT-4 when that model debuted in March 2023. Google's Gemini 1.5 Pro now costs 76% less per output token than it did when that model was launched in February 2024.

  • by gweihir ( 88907 ) on Friday October 18, 2024 @02:49PM (#64875195)

    Crappy and unfixable as their product is. All they had is more lies and something dressed up to look nice a bit earlier.

    It really does not matter. All LLMs are crap and will remain crap. The approach is not suitable for anything but a slightly better search engine, but I recently found out ChatGPT is crap at that as well, because it could not actually provide reasonable sources for statements it made. Apparently referencing sources is not done using the training data, but by web search. With that, I can simply do a web search directly and save time.

    • LLMs fantastically solved the problem of how to masquerade spam as legitimate content undetectably to search engines.

      LLMs may indeed only produce crap, but that's great news for all the shitmongers out there.

      • by ebunga ( 95613 )

        And it wasn't that long ago that they were exclusively available only as the deluxe AI-based version of article respinners used for SEO spam for the content factories used in affiliate marketing schemes.

    • by MightyMartian ( 840721 ) on Friday October 18, 2024 @02:55PM (#64875219) Journal

      People that say this sort of thing never actually seem to use it for more than making fart poems. I have used ChatGPT to combine multiple dissimilar reports into a single collated report in the proper tense. It still needed some massaging, but for me to do it would have been several hours work, and ChatGPT puked out a page of text that I just had to tweak and add some graphs to. It's terrible for a lot of problems (the code it creates is nightmarish, but it's actually not bad with SQL), but for language-based problems, providing you understand its limitations and how to give it instructions, it's definitely boosted my productivity.

      • by gweihir ( 88907 )

        Your invalid AdHominem is just that: invalid.

      • People that say this sort of thing never actually seem to use it for more than making fart poems. I have used ChatGPT to combine multiple dissimilar reports into a single collated report in the proper tense. It still needed some massaging, but for me to do it would have been several hours work, and ChatGPT puked out a page of text that I just had to tweak and add some graphs to. It's terrible for a lot of problems (the code it creates is nightmarish, but it's actually not bad with SQL), but for language-based problems, providing you understand its limitations and how to give it instructions, it's definitely boosted my productivity.

        This is true, but it is also not the hype. The hype is that by 2028 GPT-powered robots will put an end to the need for any human workers and what’s left of humanity will subsist on Soylent Green, Corpse Starch and just enough UBI handouts to prevent us from revolting and eating our AI bro overlords raw.

      • Re: (Score:3, Interesting)

        by dfghjk ( 711126 )

        "...it's definitely boosted my productivity."

        Because your job is to produce things that aren't needed, and we're perhaps better off without, such as ramming together dissimilar "reports" made by others and adding "some graphs". You know, things that worthless tools can "puke out".

    • by Ed_1024 ( 744566 ) on Friday October 18, 2024 @03:37PM (#64875347)
      I think one of the more telling things is that Apple declined to invest in them. It was not because they (Apple) were short of cash so it must have been because they just did not see anything remarkable in the product, or anything they could not do at least as well or better themselves. $6B or whatever it was buys a lot of custom silicon AI hardware and electricity...
      • Keep in mind Apple is the original proprietary model in tech that everyone is trying to be now. Microsoft has branded hardware devices, for instance. Vendor lock in and root access to your device is extremely profitable.

        I would likely put Apple's strategy at: We saw what they are doing at OpenAI and we'll just take the idea and develop it ourselves. Who tf wants to pay license fees?!
    • Seems to me the only times "AI" is truly beneficial is at strictly defined tasks with vetted data. Which proves that garbage in is garbage out, and humans produce loads of it which does not make for good general training data.

      In my experience, when it comes to programming, so far it hasn't produced better results than a decent internet search.
      As for fiction writing, it hasn't produced anything more than the usual plot ideas in a generic fashion, so it hasn't grown beyond 99% of screenwriting including the b

      • by ebunga ( 95613 )

        To be able to craft the proper input you have to be an expert in the subject domain and the tool, and to make sure it's not generating garbage. To validate the results you have to do actual research to verify any citations, and also be a subject domain expert.

        The only value in genAI right now is for bullshit-heavy jobs like SEO spam and business consultants that speak moon language to begin with.

        • by gweihir ( 88907 )

          Yes, that seems to be the main "use": A negative one. As somebody called it, "better crap". Still crap and the last thing the world needs is more of it.

      • by gweihir ( 88907 )

        Seems to me the only times "AI" is truly beneficial is at strictly defined tasks with vetted data.

        Maybe. LLMs have a tendency to hallucinate even with good data.

    • Re: (Score:2, Insightful)

      by thegarbz ( 1787294 )

      It really does not matter. All LLMs are crap and will remain crap.

      False. LLMs are just a tool. The implementations as random chatbots are crap. On the flip side they are incredibly useful when fed specific information in their training and when they provide specific references to source documents.

      You using only a bunch of shitty ones doesn't make all LLMs crap.

      • by gweihir ( 88907 )

        Nope. But it looks like you still do not understand LLMs at all. For example, "hallucinations" do _not_ go away with carefully selected training data.

  • Great. In that case, can we get a new punchgob in place of this Altman guy? I'm sick of his punchable gob.
  • by Bill, Shooter of Bul ( 629286 ) on Friday October 18, 2024 @02:55PM (#64875217) Journal
    First of all, most of the innovation was done by Google*.
    Second of all, OpenAI has had a huge brain drain.

    I think Google held back its release on purpose; it wanted someone else to be hit with all the bad publicity from the bad AI results. Now that has happened it can swoop in and collect the accolades and dollars while letting OpenAI get all the blame.
  • I guess the next step is that investors will finally realize AI is just an empty money pit and will not provide much value.
  • by MpVpRb ( 1423381 ) on Friday October 18, 2024 @03:30PM (#64875319)

    While it's likely that future AI will be a very useful tool, and early versions like AlphaFold are already producing results, today's consumer-focused AI offerings are just crap generators that produce stuff that appears to be well written, but is, in fact, crap. It's kinda like a BS artist who confidently claims expertise while spewing nonsense.

  • by rsilvergun ( 571051 ) on Friday October 18, 2024 @03:38PM (#64875349)
    About a Google engineer complaining that he was an expert mathematician spending his days trying to figure out how to get people to click on advertisements.

    I wonder how many thousands of hours of incredibly valuable time from highly skilled mathematicians are going to be spent figuring out how to replace workers making 10 to 12 bucks an hour so that money can be pocketed by a handful of nepo babies and oligarchs.

    I sometimes wonder what our species could accomplish if we didn't spend so much time and effort pleasing our kings and queens. I guess we call them CEOs now.
    • by ebunga ( 95613 )

      Wonder if that guy is still alive? Must be pretty depressing knowing the highest achievement you'll ever hit is increasing ad clicks by 0.00000002%

      • The shitloads of money he got paid probably helped. But yeah everybody has to pay the bills. Of course you get a lot of public University educators and professors and teachers assistants who live in abject poverty so that they can work on science stuff that eventually becomes incredibly profitable products for other people to make money off of.

        I've known guys like that and honestly they don't care about the money they're just completely obsessed with their area of expertise. The problem is you usually
    • by Rinnon ( 1474161 )

      I sometimes wonder what our species could accomplish if we didn't spend so much time and effort pleasing our kings and queens. I guess we call them CEOs now.

      Following along with your analogy, I would suggest that it isn't the CEO that has replaced the monarch... it is the market itself. Which is far more terrifying, because you can't overthrow, imprison, or guillotine the market.

      • It's not a real thing. It's an abstraction we use to understand the systems people put in place.

        At the end of the day it's the CEOs that actually control and manipulate the market. It is not nor has it ever been nor will it ever be "free". That's a lie the CEOs and other members of the ruling class tell you so that you will leave a power vacuum they can fill.

        It's the same reason they tell you not to organize into labor unions and the same reason they tell you both sides are bad when it comes to voti
  • by Whateverthisis ( 7004192 ) on Friday October 18, 2024 @04:22PM (#64875469)
    I think they still have the lead in wild unrealistic statements [tomshardware.com], insane predictions [venturebeat.com], and poor financial [reuters.com] planning [cnbc.com].
