
Generative AI Doesn't Have a Coherent Understanding of the World, MIT Researchers Find (mit.edu) 138

Posted by EditorDavid on Sunday November 10, 2024 @06:34PM from the modelling-citizens dept.

Long-time Slashdot reader Geoffrey.landis writes: Despite its impressive output, a recent study from MIT suggests generative AI doesn't have a coherent understanding of the world. While the best-performing large language models have surprising capabilities that make it seem like the models are implicitly learning some general truths about the world, that isn't necessarily the case. The recent paper showed that Large Language Models and game-playing AI implicitly model the world, but the models are flawed and incomplete.

An example study showed that a popular type of generative AI model accurately provided turn-by-turn driving directions in New York City, without having formed an accurate internal map of the city. Though the model can still navigate effectively, when the researchers closed some streets and added detours, its performance plummeted. And when they dug deeper, the researchers found that the New York maps the model implicitly generated had many nonexistent streets curving between the grid and connecting far away intersections.

This discussion has been archived. No new comments can be posted.

No kidding (Score:5, Insightful)

by alvinrod ( 889928 ) writes: on Sunday November 10, 2024 @06:41PM (#64935477)

LLMs don't have an understanding of anything. They can only regurgitate derivations of what they've been trained on and can't apply that to something new in the same ways that humans or even other animals can. The models are just so large that the illusion is impressive.

- Re: (Score:2)
  
  by gweihir ( 88907 ) writes:
  
  That nicely sums it up. When you pool billions of details with simple connections, you get the illusion of a model. You do not get a model, because that requires abstraction and that requires actual intelligence.
  - So what do humans do? (Score:2)
    
    by goombah99 ( 560566 ) writes:
    
    Or dogs.
    We may be doing much the same thing. All we are actually good at is 3D navigation and language. Everything else, like, say, logic, math, science, is super hard. We actually rely on models precisely because these go beyond intuition. But when it comes to symbolic reading then we exploit our ability to use language. I think AI are doing the same thing. They just lack that intuitive 3D navigation training like our brains evolved. But they clearly have the language part. So they can reason but mayb
- Re: (Score:2)
  
  by Marxist Hacker 42 ( 638312 ) * writes:
  
  I've been saying this for two years now. AI is GIGO- Garbage in, Garbage out. There's been nearly ZERO effort put into accurate model creation.
- Re: (Score:3)
  
  by strikethree ( 811449 ) writes:
  
  Exactly. I have no idea where/why people even thought of the idea of our current version of AI as having ANYTHING relating to the concept of "understanding". Are we living in crazy world here?
  I just realized that most people never examine themselves and their own thinking. It is creeping me out, living with automatons.
- Re: No kidding (Score:3)
  
  by superposed ( 308216 ) writes:
  
  I realized recently that what passes for insight in LLMs is really just "the wisdom of the crowds". It's well known that if you average together enough people's faulty guesses of quantitative information, you often get something close to the truth. (This is also known as the Delphi process.)
  This is essentially what LLMs do. Instead of thinking independently about something, they apply a fancy statistical estimate of what the average person on the internet would say about it. Th
Seriously, did we need a MIT study? (Score:5, Insightful)

by ls671 ( 1122017 ) writes: on Sunday November 10, 2024 @06:41PM (#64935479) Homepage

Seriously, did we need a MIT study to know that?

- Re: (Score:2)
  
  by Krishnoid ( 984597 ) writes:
  
  And if it was MIT, why didn't they first try it out on a map of Boston [app.goo.gl]?
  - Re: (Score:2)
    
    by lockecole2 ( 455419 ) writes:
    
    And if it was MIT, why didn't they first try it out on a map of Boston [app.goo.gl]?
    Probably because the city of NY (Manhattan island especially) is much simpler road-wise than Boston.
    - Re: (Score:2)
      
      by ArmoredDragon ( 3450605 ) writes:
      
      Funny enough I spoke to somebody from Boston who knows how the roads were basically first built by cows, and says that somehow the cows managed to do a better job of planning the roads than whoever designed the ones here in Los Angeles. I still remember when I first moved here, one of my first WTF moments was when I saw a green right turn arrow just above a no right turn sign. Though admittedly that came some time after I realized that they only put the freeway exit number signs after the actual exit.
      - Re:Seriously, did we need a MIT study? (Score:5, Interesting)
        
        by flink ( 18449 ) writes: on Monday November 11, 2024 @12:16AM (#64935947)
        
        Lifetime Bostonian here. This is a common myth. The roads aren't cow paths. They are people paths. The roads follow natural contours that were most convenient for people to walk. They followed natural ridge lines of elevation or the contour of the shoreline. However, over the years, a lot of the elevation was shoveled into the sea to make new land, so neither the original hills nor the original shore that the roads followed are still around, leaving a seemingly nonsensical layout. However, if you look back at old maps the roads make a lot of sense for a person moving under their own power to travel on. The newer areas that were reclaimed from the sea have straighter roads and simpler layouts. There is even a grid in Back Bay with alphabetical street names where there was a large land reclamation project.
        Here's a decent little video [youtu.be] on the subject.
        
        
          - Re: (Score:2)
            
            by martin-boundary ( 547041 ) writes:
            
            Thanks for that, very interesting!
    - Re: (Score:2)
      
      by account_deleted ( 4530225 ) writes:
      
      Comment removed based on user account deletion
- Re:Seriously, did we need a MIT study? (Score:5, Interesting)
  
  by narcc ( 412956 ) writes: on Sunday November 10, 2024 @08:38PM (#64935681) Journal
  
  Apparently. There are a surprising number of people, even in the field, who have what I can only describe as religious beliefs about emergence. It's disturbing.
  
- Re: (Score:2)
  
  by account_deleted ( 4530225 ) writes:
  
  Comment removed based on user account deletion
- Re: Seriously, did we need a MIT study? (Score:2)
  
  by Fons_de_spons ( 1311177 ) writes:
  
  Still, a lot of businessmen are jumping and dancing around their new AI god. I think it needs more than MIT to get them to wind down. So eager to replace those disobedient clumsy humans. They will never learn.
  Hey, I just got an idea! Let's replace them with an AI!
- Re: (Score:3)
  
  by Touvan ( 868256 ) writes:
  
  I came here to say the same. It's obvious based on even a shallow understanding of how the technology works that it doesn't "understand" anything - it's just predicting tokens based on a previous body of text, in a way that generates something that has the appearance of intelligence. It's true "artificial intelligence" in the old sense. I don't understand why so many highly technical people keep missing that.
  - Re: (Score:2)
    
    by ls671 ( 1122017 ) writes:
    
    I call it a sophisticated Bayesian filter, similar to what SpamAssassin uses to determine whether an email is spam or not. Bayesian filters use tokens too.
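
For anyone who has not looked inside a token-based Bayesian filter, a toy version might look something like the sketch below. It is a plain naive Bayes scorer with add-one smoothing over whitespace-split tokens; the function names and the four training messages are made up for illustration, and this is not how SpamAssassin itself is implemented.

```python
import math
from collections import Counter

def train(messages):
    """Count token occurrences per label; `messages` is a list of (text, label)."""
    counts = {"spam": Counter(), "ham": Counter()}
    totals = Counter()
    for text, label in messages:
        counts[label].update(text.lower().split())
        totals[label] += 1
    return counts, totals

def spam_log_odds(text, counts, totals):
    """Log-odds that `text` is spam under naive Bayes with add-one smoothing."""
    vocab = set(counts["spam"]) | set(counts["ham"])
    n_spam = sum(counts["spam"].values())
    n_ham = sum(counts["ham"].values())
    # Start from the prior odds, then add one term per token.
    score = math.log(totals["spam"] / totals["ham"])
    for tok in text.lower().split():
        p_spam = (counts["spam"][tok] + 1) / (n_spam + len(vocab))
        p_ham = (counts["ham"][tok] + 1) / (n_ham + len(vocab))
        score += math.log(p_spam / p_ham)
    return score

# Hypothetical training data, far too small to be useful in practice.
TRAINING = [
    ("win free money now", "spam"),
    ("free prize click now", "spam"),
    ("meeting notes attached", "ham"),
    ("lunch at noon tomorrow", "ham"),
]
counts, totals = train(TRAINING)
```

A positive score from `spam_log_odds("free money", counts, totals)` means the tokens were seen more often in spam than in ham. The filter has no notion of what "money" is; it only has counts.
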
- Re: (Score:2)
  
  by luis_a_espinal ( 1810296 ) writes:
  
  Seriously, did we need a MIT study to know that?
  I'm typing this over lunch, so sorry for any typos:
  Half of my brain agrees with you about asking if we need such studies. For us, it's obvious LLMs do not understand anything at all.
  However, we reach our (correct) conclusion just by inferring from our own understanding of how things work. We argue our conclusions, but we do not demonstrate them, for obvious reasons (it's hard).
  And it is not wrong of us to simply argue rather than demonstrating it. How often do we demonstrate that 2 + 2 is indeed four, or that th
understanding? (Score:5, Insightful)

by dfghjk ( 711126 ) writes: on Sunday November 10, 2024 @06:41PM (#64935483)

More anthropomorphizing neural networks. They don't have "understanding" at all, much less "coherent" understanding.

- Re: (Score:3)
  
  by Koen Lefever ( 2543028 ) writes:
  
  Moreover, neural networks hate to be anthropomorphized.
  the New York maps the model implicitly generated had many nonexistent streets curving between the grid and connecting far away intersections.
  BTW, does anybody have a link to the report from the researchers MIT sent to New York to check out the glitches in the matrix discovered by this neural network?
  - Re: (Score:3)
    
    by AleRunner ( 4556245 ) writes:
    
    Sure no problem
    https://news.mit.edu/2024/foll... [mit.edu]
- Re: (Score:3)
  
  by gweihir ( 88907 ) writes:
  
  "Understanding" is a synonym for "deduction abilities" here, a term too advanced for many people. And yes, LLMs and neural networks have no deduction abilities.
Results by (Score:2)

by Tablizer ( 95088 ) writes:

...CaptObviousGPT
Very few experts ever claimed it had common sense-like reasoning, and those who did usually added caveats to their claims.
And this is different from humans? (Score:5, Insightful)

by ClickOnThis ( 137803 ) writes: on Sunday November 10, 2024 @06:49PM (#64935499) Journal

I have met lots of people who don't have a coherent understanding of the world. This week I watched them ... oh, never mind.

- Re: (Score:2)
  
  by timeOday ( 582209 ) writes:
  
  These studies really would be vastly more interesting if they tested humans on the same tasks.
  - Re: (Score:3, Insightful)
    
    by ClickOnThis ( 137803 ) writes:
    
    These studies really would be vastly more interesting if they tested humans on the same tasks.
    Yeah, not sure how you could do it though. It might not even be apples and oranges, more like apples and Apple computers. The models can handle enormous amounts of data and can be examined to determine their internal structure. To get the same thing from a human, you'd need to test behaviors and ask questions. Lots of questions.
    The interesting thing is that the models started to infer the existence of roads without seeing them. I suppose humans do the same thing! And the models' performance "plummeted" when
  - Re: And this is different from humans? (Score:3)
    
    by NagrothAgain ( 4130865 ) writes:
    
    We have. People have managed to walk, and drive, around NYC for a very long time before the invention of in-car nav systems.
- Your experience is called a delusion (Score:2)
  
  by thesjaakspoiler ( 4782965 ) writes:
  
  You could try to step out of your comfort zone and listen to podcasts of people who don't agree with you or read newspapers with more objective views of the world.
  - Re: (Score:2)
    
    by ClickOnThis ( 137803 ) writes:
    
    You could try to step out of your comfort zone and listen to podcasts of people who don't agree with you or read newspapers with more objective views of the world.
    I do. Perhaps you could join me? Here, these might help:
    https://adfontesmedia.com/ [adfontesmedia.com]
    https://www.snopes.com/ [snopes.com]
    https://ground.news/ [ground.news]
- Re:And this is different from humans? (Score:5, Insightful)
  
  by gweihir ( 88907 ) writes: on Monday November 11, 2024 @01:47AM (#64936051)
  
  True. Only about 20% of all people are accessible to rational argument. That pretty much means the rest has no understanding how things actually work. For example, they do not understand what a fact is or how science works. They think their gut-feeling is as good or better than an expert analysis. They think physical reality cares about their wishes and can be changed by belief. And other utter crap. Or to speak with Charles Stross: "The average person understands nothing."
  
  - Re: (Score:2)
    
    by ClickOnThis ( 137803 ) writes:
    
    This. When I hear someone say "do your own research" -- I cringe. It really means "ignore the experts who have spent their lives studying the problem."
    I fear a revolution is brewing, and those who embrace reason are not going to fare well.
    - Re: And this is different from humans? (Score:2)
      
      by Ceseuron ( 944486 ) writes:
      
      Hitchens Razor. That which can be asserted without evidence can be dismissed without evidence.
      Whenever I hear anybody make a claim about anything and then attempt to substantiate that claim by insisting other people do their own research to arrive at the same conclusion, I immediately dismiss everything they have to say. If you cannot back up your claims with real, testable evidence, then your claims are without merit.
    - Re: (Score:2)
      
      by gweihir ( 88907 ) writes:
      
      I fear a revolution is brewing, and those who embrace reason are not going to fare well.
      We are only in demand when we deliver weapons for the cavemen to kill each other or things for them to buy as personality-prostheses.
- - Re: (Score:2)
    
    by narcc ( 412956 ) writes:
    
    You seem to think that LLMs are like science fiction robots. They are not.
  - Re: (Score:2)
    
    by gweihir ( 88907 ) writes:
    
    This is not about a desire for LLMs to be "perfect". This is about LLMs faking it all the way.
    - Re: (Score:2)
      
      by ClickOnThis ( 137803 ) writes:
      
      This is not about a desire for LLMs to be "perfect". This is about LLMs faking it all the way.
      IMHO, "fake it 'til you make it" describes pretty much any kind of learning with feedback, whether it's done by a machine or a human.
      If an entity can learn to "fake it" extremely well, then I think she/he/it warrants notice. Machines already can "fake it" better than humans in many ways. The list will grow.
      - Re: And this is different from humans? (Score:2)
        
        by ClickOnThis ( 137803 ) writes:
        
        Good question. If you're basing your actions on what you learned, then maybe you're not faking, even if your understanding still isn't coherent.
        I'm reminded of something Richard Feynman said in his lectures on physics. Briefly, it was that a witch doctor may have the wrong theory about a disease, but you still go to him, because he knows the disease.
  - Re:And this is different from humans? (Score:5, Insightful)
    
    by mu22le ( 766735 ) writes: on Monday November 11, 2024 @05:40AM (#64936263) Journal
    
    No one expects LLMs to be all knowing, but the problem is that they have absolutely no self reflection on their knowledge.
    A human will tell you they are not sure when they are not knowledgeable about a certain subject, but they will also insist and try to correct you if they feel confident about what they know and you voice a different view. An LLM will confidently bullshit you until you tell it it's wrong, then it will contradict everything it just said and tell you that you are right, regardless of what is true, or even logical.
    
    - Re: (Score:2)
      
      by allo ( 1728082 ) writes:
      
      Maybe the problem is, that you think the limits need to be expressed in the generated text. The text may say "I'm sure the moon is bigger than the sun", but the disclaimer about the system says "The system may lie". Think of it like you think about other algorithms that may not always be right and you're fine. The text output claiming it is right is not the inventor claiming it is right. The problem is, that people think it is, just because it reads like a human text instead of reading like wrong numbers.
Combine with a logic engine & rule base (Score:2)

by Tablizer ( 95088 ) writes:

I'm wondering if it would be possible to hook it up to the likes of Cyc, a logic engine and common-sense-rules-of-life database. The engine could find the best match between the language model (text) and Cyc models, weighting to favor shorter candidates (smallest logic graph). Generating candidate Cyc models from language models may first require a big training session itself.
I just smell value in Cyc's knowledge-base, there's nothing on Earth comparable (except smaller clones). Wish I could by stock in it
- Typo Corrections: (Score:2)
  
  by Tablizer ( 95088 ) writes:
  
  Corrections:
  "weighing to favor shorter candidates" [No JD jokes, please]
  "Wish I could buy stock in it"
  (Bumped the damned Submit too early)
- Re: (Score:3)
  
  by phantomfive ( 622387 ) writes:
  
  It's amazing how many startups are out there just repeating the same LLM approach with more data, but none (afaik) are trying something like joining it with Cyc. If I were raising billions for an AI startup, I would consider at least trying that as a side project.
  - Re:Combine with a logic engine & rule base (Score:5, Interesting)
    
    by Tablizer ( 95088 ) writes: on Sunday November 10, 2024 @07:27PM (#64935585) Journal
    
    Indeed. With all the investing going into increasingly questionable AI projects you'd think somebody with money would zig when everyone else is zagging to try bagging a missed solution branch/category.
    Reminds me of the Shuji Nakamura story on the invention of a practical blue LED. Red and green LEDs were already commercially viable. Blue was the missing "primary light color" in order to mix to get the full rainbow. Many big co's spent a lot of R&D on blue, but kept failing. Their blue LEDs were just way too dim.
    Zinc selenide (ZnSe) was the most productive technology for LED in the past, but gallium nitride (GaN) had some promising theoretical properties, despite early failures. The rest of the industry felt ZnSe seemed clearly the safer bet, being easier to tame. Shuji decided it was worth exploring GaN instead after so many ZnSe fails for blue. He found a promising incremental improvement and kept at it, working long hours without pay and pissing off his boss, but it eventually paid off, and he won a Nobel.
    Not having a PhD (yet) like most of his colleagues, he was often given the grunt-work of repairing equipment. But this work also taught him how to tweak the crystal-growing machines for new variations. He eventually learned to "play the crystal-growing machine like a piano", getting almost anything he wanted.
    Shuji's a true underdog Nerd Hero.
    There is too much Me-Too-Ism in IT in general. Don't get me started about IT projects ruined by fad chasing, I won't shuddup.
    
    - Re: Combine with a logic engine & rule base (Score:2)
      
      by rkww ( 675767 ) writes:
      
      Please cite sources https://youtu.be/AF8d72mA41M [youtu.be]
    - Re: (Score:2)
      
      by jenningsthecat ( 1525947 ) writes:
      
      Shuji's a true underdog Nerd Hero.
      So true. His story should be required reading throughout the curricula of science and tech courses.
  - Re: (Score:2)
    
    by fuzzyfuzzyfungus ( 1223518 ) writes:
    
    Aside from being novelty-addled herd animals; I think that there's a much stronger cultural affinity for the technology that is all about the fact that you can sometimes get surprisingly plausible outputs from nescience so profound that it would be anthropomorphizing to call it ignorance; than for the technology founded on the hope that if you systematically plug away at knowing enough you might eventually be rewarded by competent outputs.
- Re: Combine with a logic engine & rule base (Score:2)
  
  by LindleyF ( 9395567 ) writes:
  
  I had a similar thought. Expert systems are good at some things; LLMs are good at others. We need to combine them. LLMs are superb at converting unstructured input to structured input. There has to be something there.
- Re: (Score:2)
  
  by Koen Lefever ( 2543028 ) writes:
  
  AlphaGeometry [wikipedia.org] works somewhat as you suggest: an LLM to suggest ideas, and an old-school symbolic inference engine to check if those make sense and lead to a solution.
- Re: (Score:2)
  
  by gweihir ( 88907 ) writes:
  
  My impression was that the Cyc project is mostly considered a failure at this time...
- Re: (Score:2)
  
  by WarlockD ( 623872 ) writes:
  
  I wonder why LLMs don't have some kind of "these are true facts" database they can refer to. I know that the whole point of an LLM is that it contains all that data anyway, but I just think you lose a bit of information when you don't use full floating point numbers. It doesn't feel like you're working with a full deck when you're using 16-bit floats :P
  - Re: (Score:2)
    
    by phantomfive ( 622387 ) writes:
    
    It doesn't feel like you're working with a full deck when you're using 16-bit floats :P
    Wow wait until you realize that for optimization purposes, a lot of these models are rounding to 8-bit integer math (with surprisingly little drop off in quality). https://en.wikipedia.org/wiki/... [wikipedia.org]
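
To make the 8-bit point concrete, here is a rough sketch of the simplest form of the idea: symmetric quantization of a list of weights to integers in [-127, 127] using one shared scale factor. The function names and the sample weights are invented for illustration, and schemes used in practice are generally more elaborate than this.

```python
def quantize_int8(weights):
    """Symmetric quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    # One scale for the whole list; fall back to 1.0 if every weight is zero.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values from the integers and the scale."""
    return [v * scale for v in q]

# Made-up example weights.
weights = [0.82, -1.27, 0.003, 0.5, -0.33]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each restored weight is off by at most half of `scale`, which is why the loss can be small when the weights in a group have similar magnitudes. Very small weights next to very large ones (like `0.003` here) are where the rounding hurts most.
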
Cool approach (Score:5, Interesting)

by phantomfive ( 622387 ) writes: on Sunday November 10, 2024 @06:58PM (#64935525) Journal

Of course, everyone knows these models hallucinate. The question is, what is going on inside the model to make it hallucinate? (Or alternately, what is it doing to be right so often?). Once you can figure out what's going on inside the model, then you can improve it. Actually a lot of work has been done in this area, so they are just adding to it. From the article:
'These results show that transformers can perform surprisingly well at certain tasks without understanding the rules. If scientists want to build LLMs that can capture accurate world models, they need to take a different approach, the researchers say.'
The key thing here is they don't understand the rules. For example, an AI model might make legal chess moves every time, but if you modified the chess board [wikipedia.org] then it would suddenly make illegal moves with the knight. With current AI technology, you would try to "fix" this by including as many possible different chess boards as possible, but that's not how humans think. We know the rules of the knight and recognize that in a new situation, changing the board doesn't change the way the knight moves (but it might). And if you wanted to clarify, you could ask someone, "Do all the pieces still move the same on this new board?", but that is what these researchers did (modified the map of NY with a detour), and it really confused the model.

It is of course obvious that current LLMs do not have human intelligence because they are not Turing complete, but to understand what that means you'd need to have an internal understanding and mental model of what Turing machines are, and LLMs don't have that. :)

- Re: (Score:3, Interesting)
  
  by iAmWaySmarterThanYou ( 10095012 ) writes:
  
  These things are really good at sussing out patterns in (seemingly) random data. It's an extremely useful feature/ability.
  But of course they're lost when the rules and patterns change because they mastered the initial patterns already. They have no ability to think and realize, "oh, this is a new thing" because they don't ever "realize" anything. Change the chess board, the fake-I fails.
  I assume the hallucinations come from finding patterns that really weren't there but only appeared to be based on prev
  - Re: (Score:2)
    
    by phantomfive ( 622387 ) writes:
    
    But of course they're lost when the rules and patterns change because they mastered the initial patterns already.
    That's part of it, but also they have no ability to recognize what changes are "important" and what are not. Change the color of the chess piece from white to ivory and it won't recognize it (unless it has ivory in its training set).
  - - Re: (Score:2)
      
      by techno-vampire ( 666512 ) writes:
      
      And Trump now has 57.99% of the votes in the Electoral College; which number is more important?
- Re: (Score:2)
  
  by WaffleMonster ( 969671 ) writes:
  
  It is of course obvious that current LLMs do not have human intelligence because they are not Turing complete, but to understand what that means you'd need to have an internal understanding and mental model of what Turing machines are, and LLMs don't have that. :)
  You are correct they don't have human intelligence yet wrong about the reason why.
  "Memory Augmented Large Language Models are Computationally Universal"
  https://arxiv.org/pdf/2301.045... [arxiv.org]
  The key thing here is they don't understand the rules. For example, an AI model might make legal chess moves every time, but if you modified the chess board then it would suddenly make illegal moves with the knight. With current AI technology, you would try to "fix" this by including as many possible different chess boards as possible,
  Have you tried asking the AI to convert to a normal chess board before moving?
  but that's not how humans think.
  LLMs obviously don't work like humans.
  - Re: (Score:2)
    
    by phantomfive ( 622387 ) writes:
    
    "Memory Augmented Large Language Models are Computationally Universal"
    Kind of cool approach. There are a lot of ways to augment neural networks to make them Turing complete, but they don't work as well. In this case, the compute cycle in 2.3 (page 5) is actually doing the work of the Turing machine. Without it, the LLM is not a Turing machine. Also the cringe phrase "brute force proof" mentioned on page 12 is not a proof at all, but merely a few test cases. It is not at all rigorous, and almost certainly would fall down under more complete analysis (as mentioned, a lot of mo
    - Re: (Score:2)
      
      by WaffleMonster ( 969671 ) writes:
      
      Kind of cool approach. There are a lot of ways to augment neural networks to make them Turing complete, but they don't work as well. In this case, the compute cycle in 2.3 (page 5) is actually doing the work of the Turing machine. Without it, the LLM is not a Turing machine.
      Turing machines don't exist at all in the real world. All anyone can do is create a machine that if you stipulate lasts forever and has access to an external infinite memory can theoretically act as a Turing machine. Nothing lasts forever and there is no infinite anything. All 2.3 is doing is implementing the role of the interface to the external memory. The processing is being handled by the LLM.
      Try asking a human to run this machine in their head without access to a paper and pencil then report back t
      - Re: (Score:2)
        
        by phantomfive ( 622387 ) writes:
        
        Turing machines don't exist at all in the real world.
        Ok, but LLMs can't count parentheses. Fail. Turn your brain on.
- Re: (Score:2)
  
  by narcc ( 412956 ) writes:
  
  The question is, what is going on inside the model to make it hallucinate?
  
  So-called 'hallucinations' are not errors or mistakes, they are a natural and expected result of how these models function.
  The key thing here is they don't understand the rules.
  They don't understand anything. That's not how they work. They don't operate on facts and concepts, they operate on statistical relationships between tokens.
- Re:Cool approach (Score:5, Interesting)
  
  by martin-boundary ( 547041 ) writes: on Monday November 11, 2024 @03:18AM (#64936151)
  
  Of course, everyone knows these models hallucinate. The question is, what is going on inside the model to make it hallucinate? (Or alternately, what is it doing to be right so often?). Once you can figure out what's going on inside the model, then you can improve it. Actually a lot of work has been done in this area, so they are just adding to it.
  That's a fundamental misunderstanding of these models. The thing that makes them hallucinate is their very nature: they are non-uniform random generators of text. The hallucinations are randomly generated pieces of text. You cannot have an LLM speaking English without hallucinations, ever.
  What are they doing right? Nothing on their own. But when you couple the output with a human who is willing to interact with it until an acceptable result comes out, you get a bias towards acceptable results. But now you have a combined human-LLM, whereas the LLM itself is incapable of the same. And the output depends on how smart (and patient) the human is.
  The question of figuring out what is going on inside these models is scientifically very interesting indeed, but not for the reasons you think. It won't stop the hallucinations. For that you'd have to throw these models out and go back to something like Prolog (look it up if you've never heard of it).
  TL;DR. An LLM is a stochastic parrot. It's in its nature to hallucinate variations on stuff it found on the Internet. You can't get rid of the hallucinations.
  
  - Re: (Score:2)
    
    by mesterha ( 110796 ) writes:
    
    That's a fundamental misunderstanding of these models. The thing that makes them hallucinate is their very nature: they are non-uniform random generators of text. The hallucinations are randomly generated pieces of text. You cannot have an LLM speaking English without hallucinations, ever.
    
    Given that we don't really know how humans or LLMs work, I think it's valid to say that you hallucinated this answer. I guess the important difference is that LLMs often hallucinate things that are obviously checkable f
    - Re: (Score:2)
      
      by martin-boundary ( 547041 ) writes:
      
      We most certainly know how LLMs work. It's not how they work that's interesting, it's what the output looks like as a function of the training set and interactions that's interesting.
      Slashdot used to do car analogies, so here's one: we know how a car works in excruciating detail, but what's actually interesting is what can be done with it.
      Your hallucination analogy is false unfortunately. When human beings are given the exact same training set as an LLM, the outcomes are measurably different.
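
The "non-uniform random generator of text" description earlier in this subthread can be illustrated with a deliberately tiny stand-in: a bigram table built from one sentence, sampled with weights proportional to how often each word followed the previous one. This is only a sketch of weighted next-token sampling. A real LLM conditions on a long context through a neural network rather than a lookup table, and the corpus and function names here are made up.

```python
import random
from collections import Counter, defaultdict

def build_bigram_model(text):
    """Map each token to a Counter of the tokens that followed it."""
    tokens = text.split()
    model = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        model[prev][nxt] += 1
    return model

def generate(model, start, length, rng):
    """Sample up to `length` tokens, each weighted by observed follow counts."""
    out = [start]
    for _ in range(length - 1):
        followers = model.get(out[-1])
        if not followers:
            break  # the last token was never followed by anything
        words = list(followers)
        weights = [followers[w] for w in words]
        out.append(rng.choices(words, weights=weights, k=1)[0])
    return out

# Hypothetical one-line "training set".
CORPUS = "the cat sat on the mat and the dog sat on the rug"
model = build_bigram_model(CORPUS)
sample = generate(model, "the", 8, random.Random(0))
```

Every adjacent pair in `sample` occurred in the corpus, yet the sequence as a whole can be something the corpus never said, such as a dog sitting on the mat. Nothing in the table marks that output as any different from a faithful one.
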
- Re: (Score:2)
  
  by vyvepe ( 809573 ) writes:
  
  It is of course obvious that current LLMs do not have human intelligence because they are not Turing complete, but to understand what that means you'd need to have an internal understanding and mental model of what Turing machines are, and LLMs don't have that. :)
  I think LLMs are likely Turing complete (well in a sense when unbounded memory requirement is not strict). Notice that LLMs take their output as an input. So they have memory within their context window size. Neural networks can approximate any function so they can approximate Turing machine state machine as well. One can emulate random access memory with a log based memory (like Log Structured Filesystems do). You have a memory; you have a state machine. That indicates they are likely Turing complete sans
  - Re: (Score:2)
    
    by AleRunner ( 4556245 ) writes:
    
    That indicates they are likely Turing complete sans unbounded memory requirement.
    Just for any less CS oriented people in the audience I'll point out that this is exactly the same as any other practical system that is considered "Turing complete". Your own computer is "Turing complete" and fully capable of emulating a Turing machine. You can even download Turing machine programs to run one and play with it, but in real life you have a limited amount of memory (e.g. probably 8 GB of RAM on your phone).
    Also "Turing complete" just means "is a normal computer able to solve the problems that w
  - Re: (Score:2)
    
    by phantomfive ( 622387 ) writes:
    
    Notice that LLMs take their output as an input. So they have memory within their context window size.
    That just means it's going to fail the parentheses matching problem [avikdas.com].
    - Re: (Score:2)
      
      by vyvepe ( 809573 ) writes:
      
      Anything which has only a bounded memory will fail when checking some long enough sequence of parentheses. A better way to attack the idea would be to claim that encoding the DFA state (of a TM) in the output token sequence is cumbersome. But still, in such a case, it is about efficiency and not about whether it can be done in principle.
      - Re: (Score:2)
        
        by phantomfive ( 622387 ) writes:
        
        Anything which has only a bounded memory will fail when checking some long enough sequence of parentheses.
        That's...such a horrible misunderstanding of the situation that I don't know how to respond to you. For a parenthesis counting algorithm, all you need is an integer. We're not talking about a lot of memory. LLMs fail catastrophically at counting matching parentheses. This is well covered theoretically; you may be ignorant of that.
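For anyone following along, the "all you need is an integer" check can be sketched in a few lines of Python. This is only an illustration of the counting idea; the function name is made up here and nothing below comes from the linked article.

```python
def parens_balanced(text: str) -> bool:
    """Return True if every '(' in text is closed by a later ')'."""
    depth = 0  # the single integer of state: how many '(' are still open
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a ')' showed up before its matching '('
                return False
    return depth == 0  # nothing may be left open at the end


if __name__ == "__main__":
    print(parens_balanced("(()(()))"))
    print(parens_balanced("(()"))
```

Python integers grow as needed, so the counter never overflows here. With a fixed-width counter, the bounded-memory objection only bites once the nesting depth exceeds what that integer can hold.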
- Re: (Score:2)
  
  by MobyDisk ( 75490 ) writes:
  
  With current AI technology, you would try to "fix" this by including as many possible different chess boards as possible
  First of all, that is one way to "fix" the problem, but certainly not what we would try first. First, you would explain the arrangement of the new board to the AI. But the AI would probably still make mistakes sometimes. But that isn't the only way. I would solve this by telling the AI the rules. But it sounds like we would agree that the AI would still make mistakes. The trouble here is that humans do as well. All this example demonstrates is that humans and AIs think quite similarly.
  , but that's not how humans think
  Humans work two
  - Re: (Score:2)
    
    by phantomfive ( 622387 ) writes:
    
    I would solve this by telling the AI the rules. But it sounds like we would agree that the AI would still make mistakes.
    What exactly does this mean? How exactly would you tell the AI the rules? Introduce something in its training set?
    Turing completeness is not a measure of intelligence.
    The Chomsky hierarchy [wikipedia.org] shows what problems a system theoretically can solve, and what problems it provably can never solve, with a Turing machine being the most capable (known) at type-0.
New discovery. (Score:2)

by backslashdot ( 95548 ) writes:

Anyone who thought this was even possible doesn't have a coherent understanding of the world.
Artificial Intelligence is NOT Actual Intelligence (Score:2)

by Jayhawk0123 ( 8440955 ) writes:

as humans that grow up in the world, we hardly have a coherent understanding of the world, let alone an LLM that is trained on huge data sets and forms patterns to mimic a form of intelligence.
Sounds like the people that wasted their time on this don't have a coherent understanding of the world either. Marketing wank is just that, were they expecting actual Intelligence?
The horrors of being wrong (Score:3)

by WaffleMonster ( 969671 ) writes: on Sunday November 10, 2024 @08:15PM (#64935645)

This is all rather interesting. People create systems inspired by how brains work then they turn around and get all upset it isn't perfect and criticize the system for its failure to magically compile and execute some kind of robust model of how the world works that would enable it to always generate infallible predictions.
On one hand we have people who either hate with a passion or dismiss LLMs outright as cut and paste machines which don't even deserve to be called AI. On the other hand we have people running around comparing them to nuclear weapons and worry about the prospects for the world to be turned into paperclips.
Personally from my own experience I've seen LLMs demonstrate the ability to generalize. I've uploaded documentation for things not in its training set and it was able to apply its experience to answer questions even generating working code in a language it has never seen before. I've used LLMs to base64 decode, perform language translation and figure out simple ciphers albeit sometimes they fuck up. This is demonstrably more than cut and paste.
Humans are highly intolerant of incoherence... if you got home from work to find your sofa floating in the air, or unplugged the blender only for it to keep running, you would become highly agitated. People build a coherent understanding of how the world works, even if those models are fundamentally lacking or misguided, and they get highly agitated when presented with contradictory information. While LLMs don't appear to have any comparable mechanisms, that doesn't mean they are simply cut and paste machines either.

Turing Test (Score:2)

by PPH ( 736903 ) writes:

LLM's will have achieved true intelligence when they answer the question with <Maine_accent>Ya' can't get thea' from hea'</Maine_accent>.
Pointing out that water is wet (Score:2)

by Mirnotoriety ( 10462951 ) writes:

Announcing that the sky is, in fact, blue

Brilliantly deducing that fire is hot.
So my AI girlfriend is a moron? (Score:2)

by thesjaakspoiler ( 4782965 ) writes:

Bummer. She seems pretty good at telling me all things I'd like to hear!
Surprise! (Score:2)

by coopertempleclause ( 7262286 ) writes:

Generative AI doesn't even know how many R's are in strawberry.
- Re: (Score:3)
  
  by phantomfive ( 622387 ) writes:
  
  It does now. As soon as a problem becomes publicized, it gets fixed. If you don't understand the problem, you won't be able to exploit it: you'll just be parroting information like an LLM.
  
  Here is a sample conversation I just had with ChatGPT:
  Me: how many Rs are in Strawberry?
  Chat: The word "strawberry" has three Rs.
  Me: How many Rs are in Srawberry?
  Chat: "Srawberry" has two Rs. If you meant "strawberry," then it has three Rs.
  Nice spellcheck.
- Re: (Score:2)
  
  by allo ( 1728082 ) writes:
  
  Sigh. Ever heard about tokenizing? An AI tokenizes Strawberry, for example, as St-raw-berry and sees something like 0x723-0x681-0x612. If it knows how many r's there are in 0x723-0x681-0x612, it memorized that; it didn't count them, because the token for r (e.g. 0x72) is not part of st-raw-berry. So strawberry has 0 r's in the language of AI.
duh (Score:2)

by grep_rocks ( 1182831 ) writes:

In related news, researchers discover most humans don't have a coherent understanding of the world...
Logan's Run (Score:2)

by eric31415927 ( 861917 ) writes:

Been there; done that.
The central AI in Logan's Run didn't know everything. Enough said.
An AI / LLM is a database (Score:5, Insightful)

by Beeftopia ( 1846720 ) writes: on Monday November 11, 2024 @02:32AM (#64936095)

An AI / LLM is a database you can talk to (query) using natural language. It is an amazing achievement, famously fooling one of Google's own software engineers (Blake Lemoine) into believing the machinery was sentient.
But it's still a database. It's trained for weeks on a dataset to set the weights in the neural network. Tokens [openai.com] from prompts filter through the (static) neural network repeatedly, building a response ("inferring").
The problem is as the tokens cycle through the neural network, building the response by filtering through the weights, it's impossible for a human to know exactly what it's doing - specifically how it's reasoning to come to its conclusions. That's where the field of Explainable AI [google.com] comes in.
To help people get a handle on AI, here's how they're priced - based on tokens:
https://learn.microsoft.com/en... [microsoft.com]
https://help.openai.com/en/art... [openai.com]
A bunch of weights in a neural network (the weights set by weeks of continuous training on a dataset), tokens extracted from prompts, filter through the neural network, building the output. Is the possibility of sentience in there? Consciousness? What's the core action being taken? How exactly is a token response built?
Here's a description from IBM [ibm.com]:
During the training process, these models learn to predict the next word in a sentence based on the context provided by the preceding words. The model does this through attributing a probability score to the recurrence of words that have been tokenized— broken down into smaller sequences of characters. These tokens are then transformed into embeddings, which are numeric representations of this context.
To ensure accuracy, this process involves training the LLM on a massive corpora of text (in the billions of pages), allowing it to learn grammar, semantics and conceptual relationships through zero-shot and self-supervised learning. Once trained on this training data, LLMs can generate text by autonomously predicting the next word based on the input they receive, and drawing on the patterns and knowledge they've acquired. The result is coherent and contextually relevant language generation that can be harnessed for a wide range of NLU and content generation tasks.
Once it's trained, it's trained. The neural network and its weights are set. Then it's time to query / prompt.
It's amazing stuff. That the machinery can do this is astonishing. But it's feeding tokens through a trained, static neural network.
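To make the "train once, then feed tokens through frozen weights" loop concrete, here is a deliberately tiny sketch. It is a bigram counter, not a neural network or a transformer, and every name in it is made up for illustration. The only thing it shares with a real LLM is the shape of the loop: weights fixed after training, and each output token appended to the input for the next step.

```python
from collections import Counter, defaultdict


def train(corpus):
    """Count which token follows which, then freeze counts into probabilities."""
    counts = defaultdict(Counter)
    tokens = corpus.split()  # whitespace split stands in for a real tokenizer
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return {
        prev: {tok: c / sum(nxts.values()) for tok, c in nxts.items()}
        for prev, nxts in counts.items()
    }


def generate(weights, prompt, max_new_tokens):
    """Repeatedly pick the most probable next token; weights never change."""
    out = prompt.split()
    for _ in range(max_new_tokens):
        dist = weights.get(out[-1])
        if not dist:  # token never seen during training: nothing to predict
            break
        out.append(max(dist, key=dist.get))
    return " ".join(out)


if __name__ == "__main__":
    weights = train("the cat sat on the mat the cat sat")
    print(generate(weights, "the", 4))
    print(generate(weights, "dog", 4))
```

Nothing the model is asked at query time changes `weights`, which is the "static" part. A prompt containing a token it never saw simply dead-ends.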
Now... right now it's trained on binary data, audio, video, images, text. Prompts are tokenized from incoming text strings. The technology is in its infancy. Could you program something around this core system to be a decision making platform that could be placed in a robot which could navigate its environment, and make decisions about what to do? I think that's coming. That would require being able to tokenize the world around it. I suspect it would take a vast amount of training data that may be beyond current computing and electrical power capabilities. This is nascent technology and there's a long road ahead.
On the other hand... quantum computing, fusion... these have promise, but engineering limitations limit the ability to realize those promises. So, one needs to have a balanced view. We're just at the beginning though. Relational databases were introduced in the early-mid 70s. This technology has been introduced just now, so who knows what it'll look like in 50 years.
Disclaimer: I'm not remotely an LLM / AI expert. Reading the CACM, thinking about it, but there are
- Re: (Score:2)
  
  by Visarga ( 1071662 ) writes:
  
  > But it's still a database
  
  Yet we could train models to route under all sorts of restrictions and then AI would work out. Of course if you only train the model on the perfectly open map you can't route with obstacles later.
- Re: (Score:2)
  
  by phantomfive ( 622387 ) writes:
  
  How are you different than a database? What is your brain doing that a database can't?
  - Re: (Score:2)
    
    by Beeftopia ( 1846720 ) writes:
    
    How are you different than a database? What is your brain doing that a database can't?
    The first big difference between an LLM and a human is that the LLM is set in stone after training. It is a static filter. It gains no more information. The structure is set. With a human, it can immediately update its own personal knowledge store (database) with new information. It's also able to autonomously decide to do this. "Berries taste good. Red berry taste good, make Gorok feel good. Blue berry make Gorok feel b
Humans would have a hard time too (Score:3)

by Visarga ( 1071662 ) writes: on Monday November 11, 2024 @03:22AM (#64936153)

Navigating a city with dynamic traffic conditions is hard for humans as well. We don't easily route around problem areas with just our heads. Maybe an experienced taxi driver would, but not someone who just goes home-to-work on a standard route, they don't form a detailed city level model.

- Re: (Score:2)
  
  by chas.williams ( 6256556 ) writes:
  
  Only because humans don't have access to enough knowledge about the traffic conditions. Knowing the existing conditions, a human can easily route around the problem areas. Most have driven their routes for years and know all the shortcuts that work and don't work.
- Re: (Score:2)
  
  by SpinyNorman ( 33776 ) writes:
  
  NYC, at least north of Greenwich Village, is based on a grid system of Avenues running north-south, and Streets running east-west. You'd have to be a moron not to be able to route around any road closures.
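For what it's worth, routing around closures on a grid is easy when you have an explicit map, which is the thing the study says the model lacked. A minimal sketch, assuming an idealized grid of intersections with two-way streets (no one-way avenues, no Broadway) and made-up function and parameter names:

```python
from collections import deque


def shortest_route(width, height, start, goal, closed=frozenset()):
    """Breadth-first search over grid intersections.

    closed is a set of frozenset({a, b}) pairs naming blocked street segments
    between adjacent intersections a and b. Returns a list of intersections,
    or None if the goal cannot be reached.
    """
    queue = deque([start])
    came_from = {start: None}
    while queue:
        here = queue.popleft()
        if here == goal:
            path = []
            while here is not None:  # walk the back-pointers to the start
                path.append(here)
                here = came_from[here]
            return path[::-1]
        x, y = here
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            nx, ny = nxt
            if not (0 <= nx < width and 0 <= ny < height):
                continue  # off the map
            if frozenset((here, nxt)) in closed or nxt in came_from:
                continue  # street closed, or intersection already reached
            came_from[nxt] = here
            queue.append(nxt)
    return None


if __name__ == "__main__":
    blocked = {frozenset(((0, 0), (1, 0)))}
    print(shortest_route(3, 3, (0, 0), (2, 0)))
    print(shortest_route(3, 3, (0, 0), (2, 0), closed=blocked))
```

Closing a segment just removes an edge and the search finds the detour, because the map is explicit. A model that only learned sequences of turns has no such edge to remove.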
Welll (Score:2)

by MitchDev ( 2526834 ) writes:

That's what the AIs want us to think........
An AI is just hallucinating all the time (Score:3)

by RobinH ( 124750 ) writes: on Monday November 11, 2024 @08:11AM (#64936395) Homepage

Go on YouTube and find the AI-generated minecraft videos. This is a project that uses AI to generate real-time minecraft based on mouse and keyboard input, trained on actual gameplay. You see very quickly that the output is only generated based on the last frame or series of frames, not on an actual internal representation of the game world. It's kinda trippy, and a lot like a dream. Very odd. It's worth a watch.

- Re: (Score:2)
  
  by allo ( 1728082 ) writes:
  
  But keep in mind, that's a proof of concept for one technique (using n frames + userinput to generate frame n+1) and not meant to be a full game.
  If you'd like to build something on that, you could couple it for example with a simple memory for at least your position in the world. The Counter-Strike demo (look it up, it is cool as well) would benefit from ammo counter and enemy positions. Not all of them need to be neural networks, you would just have as input (n frames, user input, ammo count, enemy positio
In other MIT news, water is wet! (Score:2)

by chas.williams ( 6256556 ) writes:

Did we really need someone to tell us this?
- Re: (Score:2)
  
  by bleedingobvious ( 6265230 ) writes:
  
  It's how the scientific method works.
  "It's self evident/obvious" is how we represent feelings and intuition. Neither has anything to do with measurable data.
  There exists a large cohort of ignorant individuals out there claiming that "AI is becoming self-aware". We don't combat ignorance with more ignorance. We do so with facts.
  Unless you have some other magical method for doing this?
  - Re: (Score:2)
    
    by chas.williams ( 6256556 ) writes:
    
    Trust me, this research won't convince them.
Mimicry vs Intelligence. (Score:2)

by Fly Swatter ( 30498 ) writes:

All this AI stuff is just copy and paste. There is nothing intelligent going on.

Even a Parrot is far ahead of any AI regurgitation machine.
"AI" (Score:2)

by ledow ( 319597 ) writes:

"AI lacks inference and is found to be nothing more than a statistical machine."
Same as every "AI" since the 60's. Except now you no longer have the excuse of not enough training data / not enough processors / not long enough to train on / lack of funding.
Seriously - where are you going to go from here, where we spend billions and years to train on the entire Internet with millions of processors?
Maybe back to the drawing board to make something actually capable of intelligence, I hope.
But humans also don't understand the world (Score:2)

by shanen ( 462549 ) writes:

REALLY disappointed in Slashdot for missing this obvious joke. A clever version would wrap it around some version of the Turing Test. Perhaps saying humans now pass the test because their coherent understanding of the world is even worse than ChatGPT's?
Or maybe a joke working it into the context of the recent election? Incoherent candidate wins again?
Or some kind of quantum mechanical joke on the coherent bit? My reality function collapsed and killed (and ate?) my dogma?
- Re: (Score:2)
  
  by Tony Isaac ( 1301187 ) writes:
  
  Yeah, I think a lot of people think they do. And a lot more people are selling it as if it did.
- Re: (Score:2)
  
  by phantomfive ( 622387 ) writes:
  
  Wow, you should have written less and used the time saved to at least read the article.
