AI

AI Companies Hit Development Hurdles in Race for Advanced Models (yahoo.com)

OpenAI's latest large language model, known internally as Orion, has fallen short of performance targets, part of a broader slowdown in AI advancement across the industry's leading companies, Bloomberg reported Wednesday, corroborating similar media reports in recent days. The model, which completed initial training in September, showed particular weakness on novel coding tasks and failed to deliver the same magnitude of improvement over its predecessor that GPT-4 achieved over GPT-3.5, the publication said.

Google's upcoming Gemini software and Anthropic's Claude 3.5 Opus are facing similar challenges. Google's project is not meeting internal benchmarks, while Anthropic has delayed its model's release, Bloomberg said. Industry insiders cited by the publication pointed to the growing scarcity of high-quality training data and mounting operational costs as key obstacles. OpenAI's Orion specifically struggled due to insufficient coding data for training, the report said. OpenAI has moved Orion into post-training refinement but is unlikely to release the system before early 2024. The report adds: [...] AI companies continue to pursue a more-is-better playbook. In their quest to build products that approach the level of human intelligence, tech firms are increasing the amount of computing power, data and time they use to train new models -- and driving up costs in the process. Anthropic CEO Dario Amodei has said companies will spend $100 million to train a bleeding-edge model this year and that amount will hit $100 billion in the coming years.

As costs rise, so do the stakes and expectations for each new model under development. Noah Giansiracusa, an associate professor of mathematics at Bentley University in Waltham, Massachusetts, said AI models will keep improving, but the rate at which that will happen is questionable. "We got very excited for a brief period of very fast progress," he said. "That just wasn't sustainable."
Further reading: OpenAI and Others Seek New Path To Smarter AI as Current Methods Hit Limitations.

Comments Filter:
  • Could it be (Score:5, Insightful)

    by Rosco P. Coltrane ( 209368 ) on Wednesday November 13, 2024 @09:03AM (#64942339)

    That AI is overhyped and the bubble is well overdue for the burst it richly deserves?

    • Re:Could it be (Score:5, Informative)

      by jonsmirl ( 114798 ) on Wednesday November 13, 2024 @09:21AM (#64942365) Homepage

      With most tech the first 90% is easy to quickly achieve. And then the last 10% takes decades to sort out. I don't see why AI should be any different.

      • With most tech the first 90% is easy to quickly achieve. And then the last 10% takes decades to sort out. I don't see why AI should be any different.

        Call me back when it's more than a glorified albeit extremely convincing chatbot.

      • With most tech the first 90% is easy to quickly achieve. And then the last 10% takes decades to sort out. I don't see why AI should be any different.

        True. Generalizing further, the pace of research advancement tends to be extremely lumpy: a big advance is followed by slow incremental progress mixed with confusing results, until, after some time, another big advance arrives.

        Sometimes the targeted research goal is never achieved, and sometimes the research succeeds after several turns of the cycle described above. That is, the lull after an advance necessarily indicates neither failure nor success.

    • Re: (Score:2, Interesting)

      Could be. We hope.

      I would suggest that there are psychological factors influencing what's going on. The people calling the shots on financing are... shall we say, deluded, uninformed, and investing with emotions, basically FOMO. Believe it or not, there's too much loose change in the world. Hedge funds for instance, by virtue of massive amounts of capital, and a bit of luck, can amass shitloads of money, and it has to go somewhere. There's too much money chasing profits. So the lineup of loaded dummies is l
      • Their customers have been trapped in the idioms of the tools, and cannot conceptualize solutions outside that trap.

        You're right on that one.

        Here's a little anecdote that happened a few months ago:

        One of the new recruits in our company, fresh out of college, was assigned to work on a project that requires a Python class I wrote. One day he popped into my office and asked me how to perform some I/O function. I told him the name of the method in the class.

        15 minutes later, he came back and told me the method doesn't exist. Whaa...? So I opened the file and sure enough, there was the method.

        You know what he told me? "But it

        • Talk about brainwashing...

          No, that's teaching coding by the numbers, teaching the tradesman to be another consumer. He didn't learn, or wasn't taught, what the tool can't do. It's probably the first time he worked on a project so big that it couldn't be fully loaded into a code editor.

    • by dvice ( 6309704 )

      AI is both overhyped and underhyped. Mainly, LLMs are overhyped, while on the other hand AlphaFold2 and AlphaFold3 are underhyped.

    • by gweihir ( 88907 )

      Hopefully. This insanity has to stop.

  • by sacrilicious ( 316896 ) <qbgfynfu.opt@recursor.net> on Wednesday November 13, 2024 @09:20AM (#64942363) Homepage

    OpenAI has moved Orion into post-training refinement but is unlikely to release the system before early 2024.

    I can't wait that long!

  • by memory_register ( 6248354 ) on Wednesday November 13, 2024 @09:22AM (#64942367)
    It's a very large, cleverly-designed word prediction system. We're finally hitting some hard limits, and it will help temper our expectations.

    Don't get me wrong, I think LLMs are really useful and will do a lot of good, just don't mistake them for actual intelligence.
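    To make the "word prediction" framing concrete, here is a minimal toy sketch in Python (the corpus and code are invented for illustration; real LLMs learn conditional token probabilities with neural networks over subword tokens, not raw bigram counts, but the training objective -- predict the next token -- is the same idea):

```python
from collections import Counter, defaultdict

# Toy next-word predictor: count which word follows which in a corpus,
# then predict the most frequent follower.
corpus = "the cat sat on the mat the cat ate the fish".split()

follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict(word):
    # Return the word most often seen after `word`, if any.
    counts = follows.get(word)
    return counts.most_common(1)[0][0] if counts else None

print(predict("the"))  # 'cat' -- seen twice after 'the', vs. once each for 'mat'/'fish'
```

    Scaling this idea up to neural networks trained on trillions of tokens is what produces the fluent output, but the objective never changes: predict what comes next.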
    • A fundamental problem with these models is that they only use words and their associations. They don't visualize the problem as humans often do. For example, there are Maxwell's equations, but Maxwell also described how he came up with them: he imagined physical models, like rotating vortex tubes, to explain the equations. Which, if you look at it, primitively describes what's taught in school today about atoms and electrons, even though the atom hadn't been discovered yet (Maxwell 1850s-70s, atom 1908). Scientis
      • To correct myself: it's not quite accurate to give 1908 as the "atom discovery"; it was more a range of years over which greater understanding was gained, but it was around that time period.
    • It's a very large, cleverly-designed word prediction system. We're finally hitting some hard limits, and it will help temper our expectations. Don't get me wrong, I think LLMs are really useful and will do a lot of good, just don't mistake them for actual intelligence.

      Ridiculous. What we need is a book of infinite wisdom to train the models on. Then the more power the greater the wisdom. Like forever. I don’t understand why people think computer science is hard.

      • Ridiculous. What we need is a book of infinite wisdom to train the models on.

        The Philosophers' Union is here to see you about demarcation of turf. Somebody just say 42 and be done with it.

    • by Dster76 ( 877693 )

      cleverly-designed

      The cleverness was the ability to allow learning from large quantities of unstructured data, just like people can.

      finally

      ???

      hard limits

      ???

      it will help temper our expectations

      Your expectations already sound tempered?

      actual intelligence

      Akshualling can be done to people too; Slashdot is primarily a platform for this.

  • I don't understand these repeated calls for more data. For almost everything you would want an LLM to solve now, the answer very likely already is in the training data for the current models. For e.g. coding challenges, I doubt there are any facts, information, etc. missing from the data, so it is weird that it somehow comes down to a question of 'volume'. Isn't it a question of how well the model works with the data it has got? While I think LLMs are more powerful than many skeptics give them credit for (just statistical machines that find similar things in training data etc.) I do think that if the size of training data is a limit as of now, it suggests LLMs seemingly do not have sufficient cognitive depth.
    • While I think LLMs are more powerful than many skeptics give them credit for (just statistical machines that find similar things in training data etc.) I do think that if the size of training data is a limit as of now, it suggests LLMs seemingly do not have sufficient cognitive depth.

      LLMs are fancy autocomplete, so the novel aspects of their output amount to filling gaps that existing patterns neatly outline, a bit like negative space in art. But they seem unable to hold concepts together across a long chain of reasoning, logic, or coding, because they truly understand nothing even if they're capable of novel output. LLMs are likely just a small component of the AI that investors and the public are looking for, and we may simply need a few hundred or a few thousand such diverse systems.

    • by dvice ( 6309704 )

      Google DeepMind agrees with you. They want to create AI that can be trained with less data, not with more, as OpenAI wants to do.

    • > I don't understand these repeated calls for more data.

      Basically because:

      1) Up until GPT-4 class models, the LLM "scaling laws" (more data/parameters/compute = more better) were intact. Of course this wasn't a law, just a rearview-mirror empirical observation, but absent any reason to stop it made sense to keep scaling until improvement leveled out... which it now appears to be doing. Of course they can keep adding specific data to improve specific benchmarks, but those aren't going to help the average
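      For context, the "scaling laws" mentioned here are empirical power-law fits of loss against parameters, data, or compute (as in Kaplan et al. 2020), not laws in any physical sense. A minimal sketch of what such a fit looks like, with invented data points, assuming numpy and scipy are available:

```python
import numpy as np
from scipy.optimize import curve_fit

# Empirical scaling "law": loss falls as a power law in model size N,
# down to an irreducible floor c.  L(N) = a * N**(-b) + c
def power_law(N, a, b, c):
    return a * N ** (-b) + c

# Hypothetical (invented) data points: parameter counts vs. eval loss.
N = np.array([1e8, 1e9, 1e10, 1e11, 1e12])
loss = np.array([3.10, 2.55, 2.21, 1.98, 1.85])

(a, b, c), _ = curve_fit(power_law, N, loss, p0=[10.0, 0.1, 1.5], maxfev=10000)
print(f"fit: loss ~ {a:.2f} * N^-{b:.3f} + {c:.2f}")

# The fitted floor c is the punchline: past some scale, each 10x of
# parameters buys less and less improvement -- the "leveling out" above.
print("predicted loss at 1e13 params:", power_law(1e13, a, b, c))
```

      The rearview-mirror nature of the fit is exactly the point made above: extrapolating it forward was a bet, not a guarantee.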

    • Finally, someone says this. Saying that all the knowledge in the world is not enough to create something as intelligent as a human is insane. The amount of knowledge the average human has (including their own experiences) is minimal compared to that. Obviously, an ANN that properly processes all that knowledge into insight could be an AGI. We just need to find the architecture/topology that is capable of that.

      The amount of data is not the bottleneck for creating AGI.

  • You might have missed the train already (compared to 2023), but if you're okay doing gruntwork for 20 USD/hr and know how to code, there are clients spending a lot on fabricating training data for models like Orion. One I saw recently is dataannotation.tech. I don't see these models going anywhere fast, but may as well take advantage of the optimistic vendors while you can.

  • We already stole everything to train the previous generation on.
  • In the 1960s and 1970s, AI progressed at an astounding rate. So much so that many were already predicting human-level AI in short order. Many of them were no slouches: e.g. Marvin Minsky. I have a little book in my library that goes under the title Experiments in AI for Small Computers, published in 1981. The preface to this book unabashedly asserts that AI researchers working on certain types of AI programs should take precautions lest their programs suddenly become intelligent and get out of control. This 4
    • Tell me, what percentage of global GDP was going to AI-related matters in that era?
      A: Roughly 0.001% (~$10 million out of ~$1,000 billion)

      It's ~0.2% (~$200 billion out of ~$100,000 billion) in 2023. And growing.
      Also, pretty much every comp-sci department is working hard on core AI research now, with tons of other STEM departments involved in application research simultaneously (there's been a bunch of resistance, but a lot of them are now finally embracing AI, with biology being a notable area of development).

  • In 30 years we went from T9 (predictive word) on phones to GPT (predictive response). Note that the groundwork for GPT, the LSTM, was described in 1995 (5 years after T9). Hopefully some funding ends up in R&D departments that do novel things instead of wasting money on training "similar" models.
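    For the curious, T9's "predictive word" feature was essentially a dictionary lookup keyed by a word's keypad-digit sequence; a toy sketch (the word list here is invented for illustration):

```python
# Minimal T9-style lookup: map each word to its keypad digit sequence,
# then resolve a typed digit string back to candidate words.
KEYPAD = {c: d for d, letters in {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}.items() for c in letters}

WORDS = ["home", "good", "gone", "hood", "hello"]  # toy dictionary

def digits(word):
    return "".join(KEYPAD[c] for c in word)

index = {}
for w in WORDS:
    index.setdefault(digits(w), []).append(w)

print(index[digits("home")])   # ['home', 'good', 'gone', 'hood'] all share 4663
print(index[digits("hello")])  # ['hello']
```

    The collisions on 4663 are why T9 needed a "next word" key; GPT replaces the fixed dictionary with learned probabilities over all of its training text.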
  • KNOW ANYTHING. A machine does not have, and will NEVER have, any intrinsic knowledge about anything.

    Any baby can be handed a ball and will rapidly "get" what a ball is, and any kiddo knows a ball, recognizes any variation of a ball, knows what to do with a ball, etc. A computer will never be able to do this... it's just a pile of millions of transistors. There's no way for a computer to grok anything. You can give a computer the biggest database known to man, the most rapid ability to plow through the database
