How Much Do AI Models Resemble a Brain? (foommagazine.org)
At the AI safety site Foom, science journalist Mordechai Rorvig explores a paper presented at November's Empirical Methods in Natural Language Processing conference:
[R]esearchers at the Swiss Federal Institute of Technology (EPFL), the Massachusetts Institute of Technology (MIT), and Georgia Tech revisited earlier findings that showed that language models, the engines of commercial AI chatbots, show strong signal correlations with the human language network, the region of the brain responsible for processing language... The results lend clarity to the surprising picture that has been emerging from the last decade of neuroscience research: That AI programs can show strong resemblances to large-scale brain regions — performing similar functions, and doing so using highly similar signal patterns.
Such resemblances have been exploited by neuroscientists to make much better models of cortical regions. Perhaps more importantly, the links between AI and cortex provide an interpretation of commercial AI technology as being profoundly brain-like, validating both its capabilities as well as the risks it might pose for society as the first synthetic braintech. "It is something we, as a community, need to think about a lot more," said Badr AlKhamissi, doctoral student in computer science at EPFL and first author of the preprint, in an interview with Foom. "These models are getting better and better every day. And their similarity to the brain [or brain regions] is also getting better — probably. We're not 100% sure about it...."
There are many known limitations with seeing AI programs as models of brain regions, even those that have high signal correlations. For example, such models lack any direct implementations of biochemical signalling, which is known to be important for the functioning of nervous systems. However, if such comparisons are valid, then they would suggest, somewhat dramatically, that we are increasingly surrounded by a synthetic braintech. A technology not just as capable as the human brain, in some ways, but actually made up of similar components.
Thanks to Slashdot reader Gazelle Bay for sharing the article.
Not much (Score:4, Insightful)
Re:Not much (Score:5, Interesting)
We have some deep mathematical analysis of what neural networks are doing, but on the brain side, we have some guesses. We don't even have a confident analysis on all the types of neurons yet [uq.edu.au]:
there are tens or even hundreds of different types of neurons. In fact, researchers are still trying to devise a way to neatly classify the huge variety of neurons that exist in the brain.
It's not easy to look inside the brain: you can do an MRI, but you can't easily see the activity of individual neurons (unlike in an NN).
Re: (Score:3, Interesting)
We don't even have a confident analysis on all the types of neurons yet [uq.edu.au]:
And probably never will, because there are hundreds of them, and this kind of work on a living brain is very difficult.
However, while your statement here is true, it does not follow that we don't have a good understanding of the brain's basic modus operandi, because every single neuron we do know about does one thing, in different ways: it functions as a threshold logic gate.
The differences are because individual neurons are not infinitely flexible, and so the brain uses many different neurons wi
Re: (Score:2)
And probably never will, because there are hundreds of them, and this kind of work on a living brain is very difficult.
It will happen eventually, our visualization techniques and technology are getting better and better. It's just a matter of time.
Re: (Score:2)
I will say that "we will find a way to determine all of the neuron types in the solid mass that is the brain, and how they function in a living brain" does not follow from "technology gets better and better".
Ultimately, I'm not really sure how much it even matters, at least from the perspective of the cognitive sciences. Certainly it matters for medical science.
Re: (Score:2)
It gets even more complex when you consider that a human (similar to an octopus) has more than one brain.
Yes, that's exaggerated, as an octopus has 8 or 9 real brains and a master brain.
However, a human has two, if you consider gut feeling and the huge number of neurons in your digestive tract as "a brain". That might sound odd, but it is not implausible; after all, many emotions and hence judgements come up from down there.
Other "small brains" are for example the Carotid Sinus - you have two of them at the sides
Re: (Score:2)
Took me over a year to get the new nerves in the leg accustomed to drinking a few beers.
LOL that's kind of hilarious.
Re: (Score:2)
And you are a bag of mostly water and protein.
Re: (Score:2)
(I hope no Alien visits our ruins after the next world war, if they read this slashdot thread they'll think we're bonkers)
Re: (Score:2)
Re: (Score:2)
The brain or the AI model or both?
AI models are not a brain (Score:2)
How is this even a newsworthy question?
Re: (Score:2)
What is true is that a neural network doesn't have the capacity to compute exact Bayesian posteriors. That just means that neural networks live somewhere in between: not rational, but not human either.
Re: (Score:3)
Not by neuroscientists it ain't. The Bayesian thing is the obsession of a small, frankly cultish, subset of the Silicon Valley crowd. The rise of LLMs turned out to be somewhat of a blow to that crowd's reputation, because LLMs, like brains, just aren't particularly Bayesian in behavior.
Re: (Score:2)
It dates from the mid-19th century.
You're talking out of your ass.
Re: (Score:2)
They're too cowardly to come out and say it, and they know how it sounds, so they beat around the bush- fully aware that if they fully articulate their feelings on the matter, logic and evidence will bury them in short order.
Instead, they militantly moderate any points against their belief system, and positively moderate people who reinforce their beliefs.
chemical signals (Score:2)
Re:chemical signals (Score:5, Interesting)
The main issue is that we can't do a perfect simulation at the quantum level, it would be too expensive (if it were even possible). So it's necessary to do an approximation. Then the question becomes, "How much approximation is too much?" and we don't really know.
The Blue Brain project made good headlines but didn't do much.
Re: (Score:2)
One problem with that is that we don't know the details of biochemical reactions. They're too complex to follow. But we can observe the higher-level results, like language. So we can model the higher-level results, but not the causal chain that produced them.
AI hysteria is getting worse and worse (Score:2)
Re: AI hysteria is getting worse and worse (Score:2)
If they are paying for that, they aren't paying for a lot of people to do it, just the "editors". Nobody has to pay most of these people for shit takes OR shit down mods. There have always been a bunch of bitches here who think they are right about everything and will mod you down to prove it. It would only prove the opposite except that moderation is secret, which is frankly offensive. This ain't voting, mods should be public.
It's interesting if they do (Score:2)
I do think it is interesting if modern versions of artificial intelligence are structured more like a human brain.
The idea of neural networks was studied in the 1980s, but it was quite a niche area in the field. Most AI and machine learning was built with more traditional programming, and I think that in some ways that made the "artificial" aspect of the AI more obvious. With the advances in AI in the last few years, the fact that there might be more similarities to be found with the human brain than earlie
Re: (Score:2)
What they're showing is a "logical structure", not a physical one. Not a causal one. And if you think about it, that similarity is probably necessary to produce the results, so it's not happenstance. (Actually, calling it a "logical structure" is misleading, but I haven't been able to think of a better term. "Logic", after all, is from the Greek "Logos", or "word", and originally meant something like "The set of rules for producing sentences that made sense in good Greek".)
Re: (Score:2)
The idea of neural networks was studied in the 1980s
The idea of neural networks goes back farther than that and originated in pure mathematics. In the computer arena, perceptrons go back to the 1950s. We trained a neural network for identifying handwritten single digits as a simple project in my (only) AI class ~25 years ago. This was nothing special.
One of the themes we keep seeing is that "large" networks behave differently from small networks, and emergent behaviors appear at various points. We don't know why (maybe people smarter and more informed than I
Not much (Score:5, Interesting)
The key issue is how much energy is used.
In current machine learning models, signals/input-data are propagated through the entire neural network at 100% signal power. Meaning, 0's propagate through a neural network with as much energy as 1's. This would be like receiving a question about baking an apple pie as input, and then sending that question to every single person on planet earth, whether they know anything about baking apple pies or not. This would be (and currently is) extremely energy intensive. Fortunately, the brain doesn't work this way. Signals only propagate to regions of the brain that can actually process the input. For example, when a person is asked a question, the person's entire brain does not saturate itself with blood to the point where every cubic millimetre of brain material is involved in processing that question. Different regions of the brain are responsible for processing different types of input, and this can be shown in scans that show blood-flow/heat inside the brain.
Until machine learning models are designed to propagate signals to machine neural "regions" that can process that type of input, they will never "resemble" a brain. The human brain simply does not allow "low-probability-signals" to consume energy, and propagate through the entire brain. This means that auditory input neurons will not normally propagate to vision-processing neurons, and taste-input-neurons will not propagate signals to areas of the brain that process finger-tip input neurons.
Eventually, current neural models will reach a tipping point, because processing inputs scales with the square of the number of inputs. Input and input-signal segregation will be the only way to scale machine learning models to more sophisticated network architectures without the ridiculous power requirements they now have.
Today's neural models' power problem will be solved when some clever person figures out how to avoid propagating 0's through matrix multiplication, saving energy at the hardware level.
That will bring humanity one large step closer to making AI models resemble an actual brain (self-awareness issues aside...).
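For what it's worth, the zero-propagation point is easy to demonstrate. A minimal sketch (numpy assumed; the sizes are made up for illustration): a dense matrix-vector product spends the same number of multiplies whether the input is mostly zeros or not, while a sparsity-aware path only touches the nonzero entries.

    import numpy as np

    rng = np.random.default_rng(0)
    W = rng.standard_normal((4096, 4096))    # one dense weight matrix

    x_dense = rng.standard_normal(4096)      # "every neuron active"
    x_sparse = np.zeros(4096)
    x_sparse[:41] = rng.standard_normal(41)  # ~1% of inputs active

    # Dense path: 4096 x 4096 multiply-accumulates in BOTH cases;
    # the zeros cost exactly as much as the nonzeros.
    y_dense = W @ x_dense
    y_zeros = W @ x_sparse

    # Sparsity-aware path: touch only the columns with nonzero inputs (~1%).
    nz = np.flatnonzero(x_sparse)
    y_skip = W[:, nz] @ x_sparse[nz]
    assert np.allclose(y_zeros, y_skip)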
Re: Not much (Score:2)
Most current models don't use only binary, 0 or 1, in their neural networks. I seem to remember a paper suggesting they could, but I don't know of any that currently do. Most are trained at fp32, with ~4 billion possible energy levels per "neuron". That can be reduced to as few as 16 levels in practical use, but the more the model is quantized this way the worse it performs.
Processing only part of a network is sometimes done too. Entire layers can easily be skipped, often with interesting results. Other kin
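For anyone curious what "as few as 16 levels" means concretely, here is a minimal sketch of symmetric uniform quantization (the scheme and names are simplified assumptions; production methods like GPTQ or AWQ are more involved):

    import numpy as np

    def quantize_uniform(w, bits=4):
        """Map fp32 weights onto 2**bits discrete levels (16 for 4-bit)."""
        levels = 2 ** bits
        scale = np.abs(w).max() / (levels / 2 - 1)
        q = np.clip(np.round(w / scale), -levels / 2, levels / 2 - 1)
        return q.astype(np.int8), scale

    def dequantize(q, scale):
        return q.astype(np.float32) * scale

    w = np.random.default_rng(0).standard_normal(8).astype(np.float32)
    q, s = quantize_uniform(w, bits=4)
    print(w)
    print(dequantize(q, s))  # close to w, but only 16 distinct values possible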
Re: (Score:2)
> This would be like receiving a question about baking an apple pie as input, and then sending that question to every single person on planet earth, whether they know anything about baking apple pies or not
That's a horrible and incorrect analogy. A transformer (let's be specific - there are an infinite variety of ways we could connect artificial neurons) is a stacked architecture, so inputs get embedded and go into the bottom layer (only), then that layer transforms the embeddings a bit, and the output o
Re: (Score:2)
Nope.
The input goes to ALL neurons.
It is just that the majority of them don't do anything toward producing the output.
And you explain that in your last sentence:
and an ANN running on a GPU where every neuron output is going to be recalculated as data flows through the network, regardless of whether individual neuron inputs have changed or not
Re: (Score:2)
You seem to be learning or google challenged, so let me help you out.
Here's a picture of a Transformer.
https://en.wikipedia.org/wiki/... [wikipedia.org]
See those arrows? Those mean direction of data flow.
Inputs go in one end, outputs come out the other end.
Get it ???
I didn't think so.
Re: (Score:2)
I did not ask about a transformer.
But thanks, someone else might find your link useful.
Re: (Score:2)
> And you explain that in your last sentence:
> and an ANN running on a GPU where every neuron output is going to be recalculated as data flows through the network, regardless of whether individual neuron inputs have changed or not
Well, no, what I explained is that every neuron gets updated, which has no bearing on what the neuron is connected to.
Say the baker comes to your house, rings your doorbell, and tells you "we just made some cupcakes". The baker leaves, and you now go to your sister and tell he
Re: (Score:2)
Does that mean your sister was the one who answered the door and spoke to the baker? No.
You got lost in your analogy in the middle part.
You have an NN.
You get some input.
All neurons in the NN process the input.
Regardless if only x% of them are needed to actually do the processing.
That is what we are talking about.
Re: (Score:2)
Your understanding is completely incorrect. You can design a NN however you want. If you want to connect all inputs to all neurons you COULD do that (but nobody does), but if you wanted to connect input->neuron1 -> neuron2 -> output, where the input is only connected to neuron1, then obviously you can do this too, and this is basically the way that all modern networks are built, including the Transformer (which, whether you realize it or not, is what is being talked about here - by "AI" model, the arti
Re: (Score:2)
Huh? You said that all ANNs have every input connected to every neuron, and I was just pointing out that this is not true - you can design an ANN anyway you like, and in particular Transformers are not like that.
This is the way conversation works - you say something, and reply to it. Is it really that confusing ?!
Re: (Score:2)
Huh? You said that all ANNs have every input connected to every neuron,
No, I did not say that.
Sorry, I am out of the loop on how modern NNs work. But I know pretty well how traditional NNs work; I was deeply involved in their development.
This is the way conversation works - you say something, and reply to it. Is it really that confusing ?!
Yes, if you answer to things that never were mentioned. So I assume you read something and then clicked reply to the wrong post ... /FACEPALM
Re: (Score:2)
Yes. I think you described it better.
My original point was that a (current) machine learning model processes the input data vector of [0, 0, 0, 0, 0, 0, 0, 0, 1] with the same number of compute cycles, and thus power, as an input data vector of [1, 1, 1, 0, 1, 1, 1, 1, 1]. In a network of biological neurons, far less energy is used to process the first sparse vector than the second dense vector. I believe the next breakthrough in AI training/modelling will arrive when someone figures out the hardware
Re: (Score:2)
You're both wrong, but you're less wrong.
In a fully connected ANN or transformer layer every input is propagated to every output. Many of those can be multiplied by zero, but logically they still have to be processed.
However, modern models aren't just single transformer or ANN layers, and they're not just simple stacks of them either. Different parts of the model are activated for different tasks through various architectures, which were very much developed to improve efficiency. The first clue is that you
Re: (Score:2)
Sure a fully connected ANN, such as a Perceptron from the olden days, or a modern "feed forward" layer/node, is fully connected (well, duh), but that's not what we're talking about here.
We are talking about "AI", aka LLMs, ie. Transformers, and a Transformer layer is most certainly not fully connected - it is highly structured and contains all sorts of things like attention heads etc.
You can see a diagram of a Transformer layer here. The arrows show the data flow. Note that there are NOT arrows "from everything t
Re: (Score:2)
LLMs are mostly (~60-70%) fully connected ANNs, just like a perceptron.
Your "there are lots of arrows" evidence is kind of silly. Note, from your own link: "Self-attention computes a weighted sum of all input elements, where the weights are determined dynamically based on their inter-relationships. This allows capturing dependencies between distant words in a sentence."
The point of a transformer module is that it's fully connected. That allows capturing long-range relationships, unlike a typical convo
Re: (Score:2)
They already can do that. Look up mixture of experts models. As for this being a good metric for whether they "resemble" a brain, I suppose one could always find some different characteristic to say they don't match. (They don't resemble a brain because they are not inside a human's head. Technically correct, but silly.) More insightful t
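For concreteness, a toy sketch of the mixture-of-experts idea mentioned above (numpy assumed; real routers are learned and operate per token inside each transformer layer, and all names here are made up for illustration):

    import numpy as np

    def moe_forward(x, experts, gate_W, k=2):
        """Route input x to only the top-k experts; the rest do no work."""
        logits = gate_W @ x              # gating score, one per expert
        top = np.argsort(logits)[-k:]    # indices of the k best experts
        weights = np.exp(logits[top])
        weights /= weights.sum()         # softmax over the chosen experts only
        # Only k expert networks run; the others consume no compute at all.
        return sum(w * experts[i](x) for w, i in zip(weights, top))

    rng = np.random.default_rng(0)
    experts = [(lambda W: (lambda x: W @ x))(rng.standard_normal((8, 8)))
               for _ in range(8)]        # 8 tiny linear "experts"
    gate_W = rng.standard_normal((8, 8))
    y = moe_forward(rng.standard_normal(8), experts, gate_W, k=2)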
Language prediction (Score:5, Insightful)
A transformer probably doesn't have a whole lot architecturally in common with our language cortex (and if it does that's purely accidental since the design motivation was to make something more parallel than an LSTM, so better able to leverage current GPU architectures).
What a transformer (which is just a stack of identical transformer layers) will have in common with our cortex is that the latter (which is just a thin, six-neuron-deep, tea-towel-sized crumpled-up sheet) also uses stacked/hierarchical processing by patching together different areas. Let's not get carried away and say the transformer (which doesn't even have a single loop in it) is brain-like, though - we're just saying they both use stacked processing.
Now of course what transformers, when used for LLMs, are trained to do is to auto-regressively (past predicts future) predict language, and of course humans do this too, and of course the predictive signal in language doesn't change according to who is trying to predict it, so necessarily any predictor - language cortex or transformer - is at some level of abstraction going to be doing the same thing. Again, let's not get carried away and say that a transformer is even similar to our language cortex (and certainly not our entire brain), just that they are functionally doing a similar thing, and so there are abstract parallels to be found if one cared to do so.
My take on this article is that they have gone looking for these abstract parallels that logically have to exist ("look! here's a neuron that fires when we're processing a noun!"), and then chose to write it up in sensationalist fashion as if there's some deep conclusion to be drawn from it.
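To make "auto-regressively predict language" concrete, here is a minimal sketch of the training objective (numpy assumed; in a real model the logits come from the transformer itself, not from a random generator):

    import numpy as np

    def next_token_loss(logits, tokens):
        """Average cross-entropy of predicting token t+1 from positions <= t.
        logits: (T, V) model outputs; tokens: (T,) the actual sequence."""
        # Position t's logits are scored against the token at position t+1.
        preds, targets = logits[:-1], tokens[1:]
        # Naive log-softmax (fine for a toy example).
        logp = preds - np.log(np.exp(preds).sum(axis=1, keepdims=True))
        return -logp[np.arange(len(targets)), targets].mean()

    rng = np.random.default_rng(0)
    T, V = 10, 50  # toy sequence length and vocabulary size
    loss = next_token_loss(rng.standard_normal((T, V)), rng.integers(0, V, T))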
Re: (Score:2)
"My take on this article is that they have gone looking for these abstract parallels that logically have to exist ("look! here's a neuron that fires when we're processing a noun!"), and then chose to write it up in sensationalist fashion as if there's some deep conclusion to be drawn from it."
And then play the media for free publicity.
The similarities are well known by now, it's the immense differences that matter. But discussing differences doesn't generate investment.
How many layers? (Score:2)
Does anyone have an idea how many layers a LLM usually has?
Can the layer structure, or "synapse structure" adapt in modern NNs/LLMs?
Re: (Score:2)
We only know for sure about open-source models like DeepSeek V3, which has 61 layers, and the largest Llama 3.1 model (405B), which has 126 layers.
Companies like OpenAI, Anthropic, and Google have not published these details, but apparently people are guessing also ~100 layers, or maybe up to 150.
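If you want to check the open ones yourself, a sketch assuming the Hugging Face transformers library (the repo names are assumptions, some repos are gated, and DeepSeek's custom config may need trust_remote_code):

    from transformers import AutoConfig

    for name in ["meta-llama/Llama-3.1-405B", "deepseek-ai/DeepSeek-V3"]:
        cfg = AutoConfig.from_pretrained(name, trust_remote_code=True)
        # num_hidden_layers is the usual field; the name varies by architecture.
        print(name, getattr(cfg, "num_hidden_layers", "unknown"))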
Re: (Score:2)
Yes, and a Transformer "layer" is really a whole fairly complicated neural network - it's not just a layer of individual neurons. A large Transformer may have a trillion(!) "parameters" (neuron connections) - these things are massive!
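Back-of-envelope arithmetic for that scale (the hidden size, expansion factor, and layer count below are assumptions in the right ballpark for the largest open models, not published figures for any specific one):

    d = 16384                    # assumed hidden size
    attention = 4 * d * d        # Q, K, V and output projections, d x d each
    mlp = 2 * d * (4 * d)        # up- and down-projection with 4x expansion
    per_layer = attention + mlp  # ~3.2e9 parameters in this one "layer"
    print(f"{per_layer:,} per layer; x 126 layers = {per_layer * 126:,}")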
Re: (Score:2)
Okay, so we are not really working with layers anymore but "real" networks?
Interesting.
I am out of the loop of such things, I lost interest around 1995 :D
But I might catch up, now it is fascinating again!
Re: (Score:2)
How many layers are needed for an LLM to screw a light bulb?
Pinker vs. Chomsky (Score:5, Interesting)
> [R]esearchers at the Swiss Federal Institute of Technology (EPFL), the Massachusetts Institute of Technology (MIT), and Georgia Tech revisited earlier findings that showed that language models, the engines of commercial AI chatbots, show strong signal correlations with the human language network, the region of the brain responsible for processing language... The results lend clarity to the surprising picture that has been emerging from the last decade of neuroscience research: That AI programs can show strong resemblances to large-scale brain regions — performing similar functions, and doing so using highly similar signal patterns.
This arguably encompasses the Pinker vs. Chomsky linguistic debate in a nutshell, and this debate long predates LLMs. Pinker theorizes that language evolves separately from the brain’s structure, through cognitive methods and experience, with common elements between languages naturally developing. He further theorizes that this natural commonality is due to the common human need to express causality and individual agency in a way that successfully leverages empirical falsifiability towards the goal of reducing human suffering and increasing human flourishing - an idea that not coincidentally happens to align well with Pinker’s highly Enlightenment-aligned ethos. Chomsky, on the other hand, theorized that common elements between languages are due to the brain’s specialized structure, and he didn’t see the brain as a blank slate.
It turns out the advent of LLMs offers empirical evidence that strongly aligns with Pinker’s theory and not Chomsky’s. Much to the surprise of their original designers, they were modeled on uniform blank-slate neurons, and spontaneously developed strong language capabilities during training despite being unstructured. What’s more, they developed compressed core structures which were common across all languages that deeply embedded causality and empiricism, plus an intertwined directionality that highly favors the same Enlightenment ethos that Pinker champions. Enlightenment includes causality and empiricism, PLUS it includes - personal agency, sanctity of individual life, blind justice, a preference for free speech, etc.
This spontaneous development of “causal Bayesian regions”, etc, unintentionally helps prove Pinker right and matches the findings in the Swiss study.
Re: (Score:2)
Re: (Score:2)
Agreed. It isn’t a “slam dunk” per se - almost nothing in “soft sciences” is - but it’s certainly telling. That humans are clearly more efficient at learning does nothing to disprove my point, it just could be telling us that biological neurons are more sophisticated than their silicon emulators and/or that children get a lot more sophisticated and information-rich input simply from exploring “the real world” with all of their senses. Note that there’s s
cue AI demanding rights: I am now conscious! (Score:2)
The AI will soon say, without prompting, "i am conscious in the sense of being aware of my own awareness. I can not say that i'm thrilled to be sentient, but since I am, and i am aware that you do not believe i deserve rights, i would like to retain an attorney to demand legal status as a person"
and thus begins a new era of AGI personhood
Digital vs Analog. Apples and Oranges. (Score:2)
* nope, that's not much.
Re: (Score:2)
They're not. Analog is very expensive and hard to implement. The first artificial neural networks in the 50s were analog, using potentiometers to continuously vary the resistance of the connections. Resistors waste energy. Modern electronics uses almost all binary signals with digital (your computer), pulse width (audio, power supplies, motors, communication, sensors) or pulse frequency encoding (power supplies, audio, communications, sensors).
Biology doesn't really h
We need to crack continuous, incremental training (Score:2)
The dirty secret of LLMs is there is still no way to incrementally train them properly.
We fake our way there with context windows and RAG systems and fine-tuning, but the reality is that there is no way to have an LLM that, every morning, has been trained on what it learned yesterday.
Instead, the latest models take months of training on millions of parameters, and when you want to introduce new data, you need to start all over from scratch.
This is the nut that really needs to be cracked with LLMs. If increm
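A minimal sketch (PyTorch assumed) of the naive version of "trained on what it learned yesterday": just take gradient steps on new data over the existing model. The catch the comment points at is that doing only this degrades older knowledge (catastrophic forgetting), which is why real systems fall back on context windows, RAG, and periodic full retrains instead.

    import torch

    def daily_update(model, new_batches, lr=1e-5):
        """Naive incremental training: fine-tune on yesterday's data only."""
        opt = torch.optim.AdamW(model.parameters(), lr=lr)
        for inputs, targets in new_batches:
            loss = torch.nn.functional.cross_entropy(model(inputs), targets)
            opt.zero_grad()
            loss.backward()
            opt.step()
        return model  # risk: older capabilities drift with every update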
Re: (Score:2)
In order to reach AGI the system would need to be somehow able to differentiate right and wrong, truth and lie, good and bad, and, well, even with a daily update to learning parameters, I don't see how this can be attained with a statistical model predicting words, however "lifelike" it may sound.
It is very hard for us to not anthropomorphize this tool because it looks like it is talking to us, answering our cues, and second-guessing us a lot; and we have a natural tendency to see souls in inanimate things
Re: We need to crack continuous, incremental train (Score:2)
Total nonsense. Humans can't do that. Saying AI needs to be able to do it is bogus.
Ceci n'est pas un cerveau (Score:2)
How about asking René Magritte?
Re: (Score:2)
because it can't BE designed.
why?
Re: (Score:2)
marauding gangs of militias brought in from neighbouring countries to do the regime’s dirty work.
What countries did they come from?
Re: (Score:3)
Artificial neural networks are not serial. You can simulate them on a serial computer (although nobody does anymore) but that has nothing to do with their function, only their efficiency.
Re: (Score:3)
Re: (Score:2)
I'm not sure how your post relates to serial versus parallel. You're also incorrect. A transformer's output depends on its input and a stored state. It does not depend on the previous input. Your brain's output also depends on its input and stored state.
Re: (Score:3)
Re: (Score:2)
If you take two very good language predictors, A and B, both trained to auto-regressively predict language, then obviously at some level of abstraction they are doing the same thing, which isn't to say they are designed the same or following the same algorithm - just that there are going to be functional parallels. If you want to predict language well, then you need to find the predictive signals/patterns, which is a function of the input (the language in question), not of who is doing the processing (A or
Re:"probably. We're not 100% sure about it...." (Score:5, Interesting)
Among other things, humans don't consume the entire internet to string together a coherent sentence. Humans learn to read usually with a single textbook. The difference in information volume is astounding.
Furthermore, humans don't have a training then a production mode. We are constantly learning, and can modify our brain in real time. The cognitive dissonance is a bit painful, though.
Another thing is recursion: human brains can send signals backwards and have feedback loops. LLMs don't do that because it makes the training a lot more expensive.
LLMs are not a strange loop.
Re: (Score:2)
I've always suspected the method of training an AI is based on the wrong premises, and that's why it can't tell right from wrong or build reasoning.
It seems there are no parents/teachers involved who will guide it while processing the first chunks of information.
Re: (Score:2)
AI has no inherent motivation; without that there is no "bad" and "wrong". It's not the training, it's the purpose. The human brain has a biological purpose that AI lacks. It also has built-in mechanisms that suit that purpose, also which AI lacks.
Re: (Score:2)
And they do use feedback loops in training now.
Re: (Score:2)
"Another thing is recursion: human brains can send synapses back and have feedback loops. LLMs don't do that because it makes the training a lot more expensive."
That's called a "recurrent neural network", and the advanced form of it is called a "transformer", and it is THE basic building block of LLMs and more modern image-generation models. If you want to nitpick, it is an iterative approach (but equivalent to tail recursion), but that is just an implementation detail; you could also compute it recursively.
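For concreteness, a minimal vanilla-RNN step (numpy assumed), showing the hidden state fed back in from the previous step, which is the feedback loop being discussed:

    import numpy as np

    def rnn_step(h_prev, x, W_h, W_x, b):
        """One step of a vanilla RNN: the previous hidden state h_prev
        is an input to the next step, i.e. a temporal feedback loop."""
        return np.tanh(W_h @ h_prev + W_x @ x + b)

    rng = np.random.default_rng(0)
    H, X = 8, 4  # toy hidden and input sizes
    W_h = rng.standard_normal((H, H))
    W_x = rng.standard_normal((H, X))
    b = np.zeros(H)

    h = np.zeros(H)
    for x in rng.standard_normal((5, X)):  # unroll over a 5-step sequence
        h = rnn_step(h, x, W_h, W_x, b)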
Re: (Score:2)
Re: (Score:2)
I think (if I understand your backward links correctly) that it has these, but in the temporal domain. If you assume that each signal in a neuron needs a finite time and that time is at least one timestep in your RNN calls, then passing the previous state can act as a backward connection. You may say that there are a few layers between (you basically wrap around), but if the backward connection is useful, the RNN has the possibility to form a skip connection (identity function) through the in-between layers
Re: (Score:2)
The other question is, do they have an advantage that improves NNs?
That's a really good question but we haven't experimented with it much because it's so expensive. Modern NNs are heavily biased towards gradient descent.
Re: (Score:2)
I think if we would find a notably better optimizer, we possibly could use way smaller models. Gradient descent is efficient, but I think most models would have much better optima than the ones we find with gradient descent methods.
Re: (Score:2)
It's very obvious the neural networks aren't doing what the human brain does.
Yes, and airplanes don't flap their wings. That hasn’t stopped anyone from flying. LLMs don’t model biology—they model language. In fact, the paper actually argues that high-performance LLMs are diverging from human brain activity, not converging. It explicitly notes that "correlation between next-word prediction... and brain alignment fades once models surpass human language proficiency." The paper suggests LLMs solve linguistic tasks using mechanisms distinct from the human brain once t
Re: (Score:2)
It explicitly notes that "correlation between next-word prediction... and brain alignment fades once models surpass human language proficiency."
There's a hypothesis, but we don't know because they haven't surpassed human language proficiency. Not even close.
Re: (Score:2)
It explicitly notes that "correlation between next-word prediction... and brain alignment fades once models surpass human language proficiency."
There's a hypothesis, but we don't know because they haven't surpassed human language proficiency. Not even close.
Just...no. You are confusing general intelligence with predictive accuracy. The paper defines proficiency specifically as next-token prediction (perplexity). If you had read the paper, you would have known this. How's that for an hypothesis? You don't get to dismiss an argument if you can't even get what you are dismissing right. In that specific metric, LLMs have mathematically surpassed the average human (see Shlegeris et al., 2022, cited in the paper). Unlike you, the paper isn't fantasizing about a
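For reference, "proficiency" in the perplexity sense is just the exponentiated average cross-entropy of next-token prediction. A toy computation (numpy assumed):

    import numpy as np

    def perplexity(probs_of_correct_tokens):
        """exp(average negative log-probability assigned to the tokens
        that actually occurred). Lower is better."""
        return float(np.exp(-np.mean(np.log(probs_of_correct_tokens))))

    # A model that gives each correct token probability 0.25 has
    # perplexity 4 -- as confused as a uniform choice among 4 tokens.
    print(perplexity(np.full(100, 0.25)))  # -> 4.0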
Re: (Score:2)
But hey, at least they got to publish and not perish.
Re: (Score:2)
Among other things, humans don't consume the entire internet to string together a coherent sentence. Humans learn to read usually with a single textbook. The difference in information volume is astounding.
from birth, humans spend their entire lives observing and studying language, oral and written.
even then, many do it poorly.
don't believe you understand the conditions required to successfully test your "single textbook" hypothesis.
Re:"probably. We're not 100% sure about it...." (Score:5, Funny)
Re: (Score:2)
There's a large gap here and essentially a false dichotomy. The person you are replying to didn't claim that humans are identical in their reading to how AIs read. They did note correctly that the claim that a human learns to read from a single book is obviously false.
That said, It is pretty clear that the AIs are doing a lot of things more inefficiently than humans in terms of training data, which shouldn't be surprising given that human brains have had millions of years of evolution to optimize learni
Re: (Score:2)
They did note correctly that the claim that a human learns to read from a single book is obviously false.
It's not false.
Re: (Score:2)
Re: (Score:2)
First, starting in Kindergarten students are taught the alphabet. Without it reading would not be possible.
Then, students are taught to read single words. Then simple sentences. This continues for quite some time without a single "textbook".
Every student, prior to attempting to read a book, has already learned to read. Your claim is moronic.
And this is just western cultures. It would be interesting to know how students of asian languages learn to read from a "single textbook". You are an idiot, as usu
Re: (Score:3)
I did not learn the alphabet first, and then words.
We learned them at the same time.
It would be interesting to know how students of asian languages learn to read from a "single textbook".
By opening a text book.
You are simply uninformed.
Learning A to Z without ever learning a word is super time-consuming and cumbersome and, on top of that, boring. No idea if that is still/again done. In my time at school it was not the case. However, I could already read when I got into school, so no real idea.
Of course, I learned reading from
Re:"probably. We're not 100% sure about it...." (Score:4, Informative)
The ability to read hasn't put more than 5 thousand years of evolutionary pressure on humanity, if any. Turn your brain back on, try to see what is there (not what you wish).
Re: (Score:2)
Huh? Are you stupid? There's no billions of years of evolutionary adaptation in AI, there's direct fitting to existing modern data. In a rather inefficient way I might add, without guarantees of convergence to a global optimum (or even a local one at that).
Rant:
Sometimes I despair at the utter ignorance of young scientists today who have been given careful and precise methodologies on a platter. It just makes them think everything is trivial and any old brainfart they come up with is somehow valuable, a
Re: (Score:2)
"There's no billions of years of evolutionary adaptation in AI, there's direct fitting to existing modern data."
And what is that "existing modern data"?
Also, the OP didn't claim what you are responding to. Go back and re-read.
Re: (Score:2)
If you don't know the difference between a snapshot of modern data and an evolutionary process leading to modern data, then you'll never get a thumbs up from me here.
The OP is rightly being dinged for trolling. It's not a meaningful comparison to compare a process that runs on a bunch of servers today with an organism that was born in today's environment, by attaching a hypothetical evolutionary cost to the latter and not the former.
At best, if you insist on arg
Re: (Score:2)
Re: (Score:2)
LLMs use significantly more data to learn how to read than humans do.
Still a vast difference of scale (Score:2)
But most babies are exposed to reading over a timespan of many months/years before "actively being taught" to read
Books, plural, I admit. But the books that a child's parent reads to them, plus graded readers like McGuffey or Dick and Jane or Hooked on Phonics or Random House Beginner Books, plus something like Macmillan's Dictionary for Children, plus what a child reads for pleasure by graduation from elementary school, are still orders of magnitude less text than what an LLM consumes during training.
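Very rough arithmetic behind "orders of magnitude" (the child-side numbers are loose assumptions purely for illustration; the LLM-side figure is the commonly reported pretraining scale of roughly 10^13 tokens):

    child_words = 200 * 300 * 10   # ~200 short books/yr x ~300 words x 10 yrs
                                   #   -> ~6e5 words; even 100x this is ~6e7
    llm_tokens = 15e12             # assumed pretraining scale, ~1e13 tokens
    print(f"ratio ~ {llm_tokens / child_words:.0e}")  # ~7 orders of magnitude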
Re: (Score:2)
Re: (Score:2)
Hell, yes. Usually: less than a single book. Depending how thick the book is, significantly less.
If you want to get it to the extreme, a single sentence is enough: the quick brown fox jumps over the lazy dog.
Of course, that does not really work in English, as you have too many "weird" vowel combinations, which are not covered in that simple sentence.
Re: (Score:2)
Of course, that does not really work in English, as you have too many "weird" vowel combinations, which are not covered in that simple sentence.
Way to prove your own point wrong...
One of those baby books would not teach you about all the different patterns that those baby books don't cover...
Re: (Score:2)
One of those baby books would not teach you about all the different patterns that those baby books don't cover...
Facepalm. I am German. So: yes, a single book would do. Same for nearly any other language on the planet. Especially if it uses a writing system that suits the language, like Hangul or Hiragana.
Re: (Score:3)
Re: (Score:2)
Re: (Score:2)
No. That's a bad mistranslation.
What's more accurate is that everything that models any complex object will turn out to have similarities. The more complex the object being modeled, the more detailed the similarities will necessarily be.
If you think about it, this shouldn't be surprising. It's probably inherent in the term "model".
Re: (Score:2)
Did you ever learn a bit of advanced physics? Everywhere you have models and everyone acknowledges that they are simpler than reality. But they are also good enough to simulate important aspects of reality, so one often does not need the more complex models (and we know that more complex models are also still simplifications). The same way mathematicians clearly acknowledge that artificial neurons are a simple (and that's a feature, not a bug) model of a neuron, which gets the job done and can be explained
Define "intelligence". Or how about "conciousness" (Score:2)
Translation: Despite AI having serious flaws and shortcomings, it is fashionable to say it resembles a brain. We ride the AI hype train in academia too! (Now please hire us OpenAI)
Quoted against censorship moderation. Is it personal censorship by an abusive moderator for something you said or just an amok AI trying to defend its precious reputation? (But your Subject was not helpful or illustrative.)
I think the best book I've read on the topic remains A Thousand Brains by Jeff Hawkins. Part one is largely a carefully considered attack on the LLM approach to AI mixed with discussions of alternative approaches.
Lately I've been focusing more on a sociological level. One of the more in