Yann LeCun Raises $1 Billion To Build AI That Understands the Physical World (wired.com)
An anonymous reader quotes a report from Wired: Advanced Machine Intelligence (AMI), a new Paris-based startup cofounded by Meta's former chief AI scientist Yann LeCun, announced Monday it has raised more than $1 billion to develop AI world models. LeCun argues that most human reasoning is grounded in the physical world, not language, and that AI world models are necessary to develop true human-level intelligence. "The idea that you're going to extend the capabilities of LLMs [large language models] to the point that they're going to have human-level intelligence is complete nonsense," he said in an interview with WIRED.
The financing, which values the startup at $3.5 billion, was co-led by investors such as Cathay Innovation, Greycroft, Hiro Capital, HV Capital, and Bezos Expeditions. Other notable backers include Mark Cuban, former Google CEO Eric Schmidt, and French billionaire and telecommunications executive Xavier Niel. AMI (pronounced like the French word for friend) aims to build "a new breed of AI systems that understand the world, have persistent memory, can reason and plan, and are controllable and safe," the company says in a press release. The startup says it will be global from day one, with offices in Paris, Montreal, Singapore, and New York, where LeCun will continue working as a New York University professor in addition to leading the startup. AMI will be the first commercial endeavor for LeCun since his departure from Meta in November 2025. [...]
LeCun says AMI aims to work with companies in manufacturing, biomedical, robotics, and other industries that have lots of data. For example, he says AMI could build a realistic world model of an aircraft engine and work with the manufacturer to help them optimize for efficiency, minimize emissions, or ensure reliability. LeCun says AMI will release its first AI models quickly, but he's not expecting most people to take notice. The company will first work with partners such as Toyota and Samsung, and then will learn how to apply its technology more broadly. Eventually, he says, AMI intends to develop a "universal world model," which would be the basis for a generally intelligent system that could help companies regardless of what industry they work in. "It's very ambitious," he says with a smile.
Excellent! (Score:4, Funny)
They just have to build a simulation the size of the Universe and the gods themselves will pop out of Heaven to congratulate them.
Re: (Score:1)
Re: (Score:2)
Quite likely. If you're in the business of brute forcing the world without regard to the physics laws that make it move, what are your other options?
Re: (Score:3)
Re: (Score:2)
They can use minecraft as a simulation. It's a trillion billion times larger than the known universe or something.
Re: (Score:2)
They just have to build a simulation the size of the Universe and the gods themselves will pop out of Heaven to congratulate them.
Guess it's time to re-read Olympos and Illium by Dan Simmons.
You can lead a bot to solder.. (Score:3)
So, we want to teach AI about the physical world. Huh. Some would argue the body-less entity would merely need a few volumes on physics to understand that. Are investors going to start funding apple orchards near the data centers when we get to the part on gravity or what?
I'm reminded of a variant on a related theme; You can lead a bot to solder, but you can't make it think.
Re: (Score:2)
Actually, this is a problem being worked on by everyone working on robots. And LOTS of progress is being made, though it's usually not described in quite the terms used here.
Re: (Score:2)
As we know, AI can't 'reason', so it can't extrapolate one idea to another.
They very much can extrapolate; this technology would be rather pointless if they couldn't.
If, however, someone were able to (say) teach AI about gravity, and have it work out the trajectory of a ball thrown across a field, then it would be a remarkable achievement.
People can work out the trajectory of a ball from past experience without needing to learn about gravity. AI can do the same shit.
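For concreteness, the ball-trajectory case is easy to state either way: the explicit physics is a one-liner, while a learned model would have to recover the same mapping from examples. A minimal sketch of the explicit version (illustrative only; plain Newtonian kinematics, no air resistance, function name is mine):

```python
import math

def trajectory_range(speed, angle_deg, g=9.81):
    """Horizontal range of a ball thrown at `speed` m/s and `angle_deg` degrees,
    ignoring air resistance (textbook projectile motion)."""
    angle = math.radians(angle_deg)
    return speed ** 2 * math.sin(2 * angle) / g

# A 20 m/s throw at 45 degrees travels about 40.8 m.
print(round(trajectory_range(20, 45), 1))
```

Whether a system "understands" gravity or has merely fit this curve from watching enough throws is exactly the dispute in this thread.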
Re: (Score:1)
AI cannot extrapolate, by construction. They can interpolate better than anyone, but as soon as you leave its dataset, it has no clue anymore.
You can feed the best AI a trillion photos of cats, if none of them included a black cat, it will be fundamentally unable to tell you that a picture of a black cat contains a cat.
The illusion that it can extrapolate comes from the fact that these models are fed with humongous amounts of data, so even just interpolating is still mostly good enough as you won't go near the edge of the dataset.
Re: (Score:2)
AI cannot extrapolate, by construction. They can interpolate better than anyone, but as soon as you leave its dataset, it has no clue anymore.
You can feed the best AI a trillion photos of cats, if none of them included a black cat, it will be fundamentally unable to tell you that a picture of a black cat contains a cat.
The illusion that it can extrapolate comes from the fact that these models are fed with humongous amounts of data, so even just interpolating is still mostly good enough as you won't go near the edge of the dataset.
An artificial algorithmic "mind" knows what it knows.
A human mind can conceive that there are things it does not know.
So AI can sort through and retrieve an item and that item's entire ranked adjacent items, faster and more reliably than any human brain. But it cannot break its own rules. You can ask AI to generate an image of a cat with nine tails and zebra striped fur, and it can do that because you - the human mind - prompted it to invoke those specific rules (cat, nine, tails, zebra, fur) and combine th
Re: (Score:2)
AI cannot extrapolate, by construction. They can interpolate better than anyone, but as soon as you leave its dataset, it has no clue anymore.
AI can generalize which includes extrapolation. If you are able to provide a concrete example of what an AI would not be able to do because it cannot "extrapolate" you can probably make some cash developing AI benchmarks for it.
You can feed the best AI a trillion photos of cats, if none of them included a black cat, it will be fundamentally unable to tell you that a picture of a black cat contains a cat.
AIs are more than capable of doing this.
The illusion that it can extrapolate comes from the fact that these models are fed with humongous amounts of data, so even just interpolating is still mostly good enough as you won't go near the edge of the dataset.
That everything is an illusion because somewhere something must exist in the training dataset is a position highly resistant to falsification.
A real world example a few months ago I asked an AI to provide sample code using connection id in Op
Re: (Score:2)
They very much can extrapolate; this technology would be rather pointless if they couldn't.
That depends on what "AI" you are talking about. LLMs certainly can't extrapolate. The technology is called a Transformer and it assigns a probabilistic value to a list of next probable tokens (word or part of word). The Transformer (model) is stateless and deterministic. It only generates the probability list for a single next token each time it is run and it has no memory of previous runs. It has no clue if the most probable token will be selected or the least probable token (unlikely, but still); that is configured by the temperature settings. So no, it can select probable next tokens; that is not the same as extrapolating anything. It doesn't reason or think like that at all.
Re: (Score:2)
That depends on what "AI" you are talking about. LLMs certainly can't extrapolate.
Yea they can.
The technology is called a Transformer and it assigns a probabilistic value to a list of next probable tokens (word or part of word). The Transformer (model) is stateless and deterministic. It only generates the probability list for a single next token each time it is run and it has no memory of previous runs.
Of course it does, that memory takes the form of the KV cache.
It has no clue if the most probable token will be selected or the least probable token (unlikely, but still); that is configured by the temperature settings. So no, it can select probable next tokens; that is not the same as extrapolating anything. It doesn't reason or think like that at all.
If I created a magic box where you entered some text and it completes the sentence, I could label that box as "just an autocomplete" and this label would not be wrong. Even if I could enter "Tonight's winning lotto numbers are: " and it actually spits out tonight's winning lotto numbers every time it is asked... it is still just an autocomplete.
Of course the label I ascribed to the box does not say anything meaningful about the capabilitie
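For readers following along, the sampling step both posts are arguing about can be sketched in a few lines (illustrative only; the function name and toy logits are mine). The model emits scores for each candidate token; choosing one, and how much randomness to apply, happens outside the model on every run:

```python
import math
import random

def sample_next_token(logits, temperature=1.0):
    """Pick a next-token index from raw model scores (logits).

    The model only produced `logits`; temperature scaling and the
    random draw are applied outside it, independently each run."""
    if temperature == 0:
        # Greedy decoding: always take the most probable token.
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]
    m = max(scaled)                          # subtract max for numerical stability
    weights = [math.exp(x - m) for x in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]
    return random.choices(range(len(logits)), weights=probs)[0]

print(sample_next_token([2.0, 1.0, 0.1], temperature=0))  # greedy pick: 0
```

Higher temperature flattens the distribution, making the less probable tokens more likely to be drawn.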
Re: (Score:2)
Of course it does, that memory takes the form of the KV cache.
The Transformer itself has no state, so no. It has no clue if those tokens were generated earlier by itself or if it's an entirely new input. It doesn't matter where those generated tokens are stored; the actual processing by the transformer has no memory other than the input. Given the same input it will always generate the same output. It is stateless and deterministic.
Re: (Score:2)
It only generates the probability list for a single next token each time it is run and it has no memory of previous runs.
The Transformer itself has no state, so no. It has no clue if those tokens were generated earlier by itself or if it's an entirely new input.
You said it has no memory and it does. Now you seem to be making a different argument.
It doesn't matter where those generated tokens are stored; the actual processing by the transformer has no memory other than the input. Given the same input it will always generate the same output. It is stateless and deterministic.
What is the relevance of the "stateless and deterministic" claim? Are you arguing this somehow imposes constraints on model capabilities thereby preventing "extrapolation"? This seems like a non-sequitur.
Re: (Score:2)
You said it has no memory and it does. Now you seem to be making a different argument.
It's not a different argument. I guess you are not really motivated to try and understand the difference between a KV cache to reduce compute and the meaning of stateless in this discussion. So I think I'll end the discussion here. You are of course welcome to think that Transformers can actually think and reason, enough people do, and that fuels the hype for the moment.
I'm hoping everyone can get a more realistic understanding of what the technology actually can do, and what it can't. So we can start to use it for scenarios where it is useful. But that will take time.
Re: (Score:2)
It's not a different argument. I guess you are not really motivated to try and understand the difference between a KV cache to reduce compute and the meaning of stateless in this discussion. So I think I'll end the discussion here.
You said "it has no memory of previous runs" and I pointed out this statement is incorrect. The cache is just a computed representation of the model's context.
You are of course welcome to think that Transformers can actually think and reason, enough people do, and that fuels the hype for the moment. I'm hoping everyone can get a more realistic understanding of what the technology actually can do, and what it can't. So we can start to use it for scenarios where it is useful. But that will take time.
My only claim is LLMs can "extrapolate". I've made no other assertions. Despite repeatedly asking for an explanation why memory or determinism is relevant to the issue of "extrapolation" in the first place none has been forthcoming.
Re: (Score:2)
Ok. So memory and state are connected. When I said that the model has no memory of previous runs you countered by mentioning KV caching, which led me to believe that you eithe
Re: (Score:2)
No. Think about how, say, dogs understand physics. Obviously not via Newton's "laws" (or should I say, Newton's very useful mathematical approximations). Dogs navigate the world and 'understand' concepts like threats, prey, and mates well enough to persist in the world.
What LeCun is proposing is largely what self-driving cars already do. Waymo isn't driven by a Large "Language" Model that predicts wor
True human-level intelligence (Score:2)
The "A" in AI stands for *artificial.* Artificial cannot be "true" human intelligence. It may be able to do amazing things, but that does not make it "true" intelligence.
Re: (Score:2)
This is uselessly metaphysical.
Re: (Score:2)
I'm just calling out the hyperbolic claims of this company.
Re: True human-level intelligence (Score:3)
Re: (Score:2)
It will start cogitating...synthetically.
Excellent idea (Score:2)
sounds like a contradiction in terms (Score:2)
"AMI could build a realistic world model of an aircraft engine and work with the manufacturer to help them optimize for efficiency, minimize emissions, or ensure reliability"
So it seems to me that they would build a realistic model of an aircraft engine; the word "world" here is meaningless. If you don't have a realistic model, then you have no model or a bad model. There are realistic models in some other world? So they are using possible-worlds models from modal logic, but those are not models of this wo
Re: (Score:2)
Even if it's just another approach to an "expert system," I'm still glad someone is working on something other than glorified sentence-completion.
Why? (Score:2)
Why is the goal to create superhuman intelligence? Do we need something smarter than us? Are you trying to get us all killed??
Re: (Score:2)
Why is the goal to create superhuman intelligence? Do we need something smarter than us? Are you trying to get us all killed??
Greed, billionaires want magical AI genies that will do their bidding because they are not already rich and powerful enough.
Re: (Score:2)
Video summary of The Culture, slow start gets better: https://www.youtube.com/watch?... [youtube.com]
Re: (Score:2)
People are getting rich over the bubble and speculation. That's it really. Remember how for a brief period if your company mentioned blockchain the stock would jump? That fizzled out but they found something else that stuck.
I don't understand the logic (Score:2)
What makes world models any different from any of the other models? You are just training them on different stuff that operates on a much lower level than existing LLMs. Even if you were able to train models to the point where they are relevant for simulations what does this get you?
"LeCun argues that most human reasoning is grounded in the physical world, not language"
What reasoning skills do feral children have?
Re: (Score:2)
I'm not a big fan of LeCun - his level of recognition seems far in excess of his actual accomplishments, and his main claim to fame seems to be a somewhat questionable claim to have invented CNNs, a long time ago.
That said, I do think LeCun is correct (but hardly alone) in saying that LLMs won't get us to AGI, and that we need a different approach, more akin to animal intelligence.
While LeCun does talk about animal intelligence, there is also this focus on "world models" and physical grounding, and it's not cl
Re: (Score:2)
The real difference between the animal intelligence approach and an LLM is that while an LLM predicts training sample continuations, and stops learning once it is trained, an animal predicts the real world (via its perceptual inputs), including how the world reacts to its own actions, and learns continually.
This is an implementation issue. There is no reason you can't loop outputs of any world model, LLM, etc. back into the model's LTM. In fact people do exactly this in a supervised manner when training LLMs. The issues with this approach (accumulation of error, overfitting, etc.) are the same in both cases. This is somewhat easier in cases where an objective function can be clearly evaluated.
Re: (Score:2)
Not really - continual learning from real-world inputs completely disrupts the whole "pre-train then serve to everybody" LLM approach. Instead you've now got every model instance running and experiencing different things and needing real-time learning.
Not only do you need a billion or so instances of that real-time learning algorithm running in parallel vs the "build a datacenter, train once" approach, but you need to invent that so-far elusive incremental training algorithm in the first place.
You could sho
Re: (Score:2)
This is an implementation issue. There is no reason you can't loop outputs of any world model, LLM, etc. back into the model's LTM.
Not really - continual learning from real-world inputs completely disrupts the whole "pre-train then serve to everybody" LLM approach. Instead you've now got every model instance running and experiencing different things and needing real-time learning.
You say not really while at the same time offering implementation related objections. I'm not sure what to make of this. There are approaches available to merging and differentially training models (e.g. LoRA) with relatively small amounts of compute.
The industry is going this way anyway because there is significant value in custom training models on corporate datasets.
Not only do you need a billion or so instances of that real-time learning algorithm running in parallel vs the "build a datacenter, train once" approach, but you need to invent that so-far elusive incremental training algorithm in the first place.
I don't understand this line of argument. What makes world models any different in these regards? No matter the model you pick, no matte
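To make the LoRA point above concrete: an adapter is a low-rank delta to a frozen weight matrix, and "merging" it just means folding that delta back into the base weight. A toy sketch (illustrative only, not any particular library's API; dimensions, scaling, and names are mine):

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 8, 2                        # model dimension, adapter rank (r << d)

W = rng.normal(size=(d, d))        # frozen base weight
A = rng.normal(size=(r, d))        # low-rank factors learned during fine-tuning
B = rng.normal(size=(d, r))
alpha = 0.5                        # adapter scaling factor

# "Merging" the adapter folds its low-rank delta into the base weight:
W_merged = W + alpha * (B @ A)

# The merged matrix gives the same result as base pass + adapter correction:
x = rng.normal(size=d)
print(np.allclose(W_merged @ x, W @ x + alpha * (B @ (A @ x))))  # True
```

Because each adapter is just an additive delta, several adapters can be applied to the same base model by summing their deltas, which is what makes the relatively cheap per-customer fine-tuning mentioned above practical.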
Re: (Score:2)
LoRA is just an efficient way to fine tune a single model. It's not about merging different models.
Merging models is not even well-defined. What would it mean? What would be a principled criterion for deciding how to merge them when there are conflicting weight updates needed?
How do you address the privacy concerns of merging models? Are you really proposing to merge proprietary/private data from multiple companies and/or individuals then redistribute the merged changes to everyone? Sounds like a non-starter
Re: (Score:2)
LoRA is just an efficient way to fine tune a single model. It's not about merging different models.
Merging models is not even well-defined. What would it mean?
"Merge" in the context of LoRAs means taking multiple LoRAs and applying them to the same model. In the context of models it means combining multiple models into one.
What would be a principled criterion for deciding how to merge them when there are conflicting weight updates needed?
One of my all time favorite models is a frankenmerge of two slightly altered versions of itself. It's stupid that it works at all. I vaguely remember seeing references to papers on how to do this shit yet I don't pretend to have a clue of the details.
Sure the industry is fine-tuning models with LoRA, but they are NOT then sharing their private updates with each other!!
I think this is obvious. For proprietary you start with a model that best meets your needs and cust
Re: (Score:2)
> Personally I would be surprised if world models offered anything of value given they operate at such a low level.
You're thinking of the animal approach in the wrong way. Forget all the "world model" framing, and just think of it as a predictive model, a near cousin of an LLM, that learns to predict next perceptual input(s) rather than the next token from a historically gathered training set.
Let's also note that the input to an LLM really isn't text or symbolic sub-word tokens - it's really the high dimension
Re: (Score:2)
"What reasoning skills do feral children have?"
You underestimate the feral kid at your peril. He beat the Humungus's gang with a boomerang and a little help from Mad Max.
So then what? (Score:2)
You can train an AI on the physical world. But then what? Yes, it will be good at copying us and doing tasks for us that are repetitive. But will it have the ability to innovate and do something new?
Re: (Score:2)
The business doesn't care about that. They want robots to understand the world and navigate/interact with it better than we do so that they can replace labor with robots. They want to build the Terminator.
Necessary but not sufficient (Score:4, Insightful)
Always interesting how these people gloss over that. Essentially a lie by misdirection.
Incidentally, it is not known whether it is necessary either.
That said, there will never be AGI in LLMs. The approach does not support it. The one thing striking in the current AI hype is how many people without a clue are making grand predictions.
Re: (Score:3)
Re: (Score:2)
2nd line of the story.
Re: (Score:2)
Re: (Score:2)
Not my fault if you are illiterate ...
Re: (Score:2)
That said, there will never be AGI in LLMs.
Of course not; but something similar will be a part of AGI.
Chemist here. (Score:2)
I suggested building AI world models in 1985 (Score:2)
https://archive.org/details/pr... [archive.org]
"Autonomous factories with intelligence: world models from sensory data"
But I also suggested there would be a big risk in doing that -- which is one reason I stopped working on building AI and robotics a few years after that.
And since then I have developed my sig -- which I feel is the single most important thing to know about AI and robotics (and other advanced technology):
"The biggest challenge of the 21st century is the irony of technologies of abundance in the hands of those still thinking in terms of scarcity."
Lesson #1 (Score:2)
Teach it about that power cord feeding it.
Psychology (Score:2)
Make AI Understand Mortality by Banning it (Score:2)
I really don't like the current-gen AI (Score:2)
Mostly because they are just statistical models. They don't understand anything at all. They just know that 'when b is near c, that often means we'll have an x followed by a y' That could be pixels or letters or wave-forms.
With that basis, it's almost miraculous that they do as good of a job as they are at 'pretending' to give coherent answers. That's why I always say, "AI is great, as long as it doesn't have to be correct."
I'd love to have AI that 'understood.' It didn't 'make up' answers, it
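The "when b is near c, we'll have an x followed by a y" view above is essentially an n-gram model, which can be sketched in a few lines (toy corpus and names are mine; real LLMs are vastly more sophisticated, but the "predict what usually follows" framing is the same):

```python
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate the fish".split()

# Count which word follows which: the entire "model" is co-occurrence statistics.
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def autocomplete(word):
    """Return the statistically most likely next word - no understanding involved."""
    return bigrams[word].most_common(1)[0][0]

print(autocomplete("the"))  # "cat" follows "the" most often in this corpus
```

Whether scaling this kind of statistical prediction up by many orders of magnitude ever amounts to "understanding" is, of course, the whole argument.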
Three out of four isn't bad (Score:2)
"a new breed of AI systems that understand the world, have persistent memory, can reason and plan, and are controllable and safe,"
Pick three.