Meta's Latest Large Language Model Survived Only Three Days Online (technologyreview.com) 57

Posted by msmash on Friday November 18, 2022 @01:23PM from the tough-luck dept.

On November 15 Meta unveiled a new large language model called Galactica, designed to assist scientists. But instead of landing with the big bang Meta hoped for, Galactica has died with a whimper after three days of intense criticism. Yesterday the company took down the public demo that it had encouraged everyone to try out. From a report: Meta's misstep -- and its hubris -- show once again that Big Tech has a blind spot about the severe limitations of large language models. There is a large body of research that highlights the flaws of this technology, including its tendencies to reproduce prejudice and assert falsehoods as facts.

Galactica is a large language model for science, trained on 48 million examples of scientific articles, websites, textbooks, lecture notes, and encyclopedias. Meta promoted its model as a shortcut for researchers and students. In the company's words, Galactica "can summarize academic papers, solve math problems, generate Wiki articles, write scientific code, annotate molecules and proteins, and more." But the shiny veneer wore through fast. Like all language models, Galactica is a mindless bot that cannot tell fact from fiction. Within hours, scientists were sharing its biased and incorrect results on social media.

Meta's Latest Large Language Model Survived Only Three Days Online

This discussion has been archived. No new comments can be posted.

Load All Comments

Search 57 Comments Log In/Create an Account

Comments Filter:

Dr Ian Malcolm's rant (Score:1)

by TheStatsMan ( 1763322 ) writes:

I'm reminded of this performance almost daily amid the ecumenical promises of the so-called 'tech leaders'
https://www.youtube.com/watch?... [youtube.com]
I wonder ... (Score:2)

by Kiliani ( 816330 ) writes:

... did it have Pigs In Space! [youtube.com]??
I am too slow for AI... (Score:4)

by gweihir ( 88907 ) writes: on Friday November 18, 2022 @01:41PM (#63061453)

By now these projects die faster than I can read about them...

- Re: I am too slow for AI... (Score:2)
  
  by q_e_t ( 5104099 ) writes:
  
  You need to create some AI to read them for you and summarise.
- Fakebook pulled a Tay. (Score:2)
  
  by stooo ( 2202012 ) writes:
  
  Fakebook pulled a Tay. They were eager to follow MS.
  - Re: (Score:2)
    
    by gweihir ( 88907 ) writes:
    
    Well, yes. If they now could pull a Twitter in addition, that would clearly be a major service to the human race. Although Elon seems tired these days, looks like destroying Twitter is really taking it out of him.
    - Re: (Score:2)
      
      by stooo ( 2202012 ) writes:
      
      Yes, he destroys twitter....to rebuild it at the same time.
      Not so for Facebook.
      - Re: (Score:2)
        
        by gweihir ( 88907 ) writes:
        
        Yes, he destroys twitter....to rebuild it at the same time.
        Hahaha, no. He is cluelessly bumbling about and nothing will be left when he is done. What we get to witness is how abysmally bad a manager he actually is.
        
        Re: (Score:2)
        
        by stooo ( 2202012 ) writes:
        
        Yep, exactly like Tesla, who went bankrupt 25 times in 10 years.
So, it's... (Score:3)

by MoeDrippins ( 769977 ) writes: on Friday November 18, 2022 @01:41PM (#63061455)

> including its tendencies to reproduce prejudice and assert falsehoods as facts.
So it's a politician?

- Comment removed (Score:4, Insightful)
  
  by account_deleted ( 4530225 ) writes: on Friday November 18, 2022 @02:02PM (#63061521)
  
  Comment removed based on user account deletion
  
  - Re:So, it's... [Artifical Stupidity wins again] (Score:3)
    
    by shanen ( 462549 ) writes:
    
    Mod parent funny.
    But here's by reaction from another website:
    [Website] needs combined reactions. Funny and sad in this case.
    In solution terms, I don't think we need such gigantic corporate cancers creating giant problems where no problem was called for. This was a solution in search of a problem, but when people looked more closely, it was quickly obvious that the proposed solution WAS the biggest problem. (And of course this didn't solve Facebook's fake problem. "There ain't no profit big enough.")
    Yes, you need a big project if you're trying to solve a big problem. Sending a rocket to the moon, for example. Social media should NOT be a big project, though there should be a PUBLIC standard for a communications protocol so the small projects can connect to social networks that might be large...
    *sigh* I feel like including yet another description of the NOT-at-all-like-Facebook website that I am still searching for. Will I know it by its "Why?" button?
    P.S. And yes, I'm still wondering who nuked my Facebook account. And why?
Auto-generated content (Score:5, Informative)

by Aero77 ( 1242364 ) writes: on Friday November 18, 2022 @01:41PM (#63061457)

from the article: "A fundamental problem with Galactica is that it is not able to distinguish truth from falsehood, a basic requirement for a language model designed to generate scientific text. People found that it made up fake papers (sometimes attributing them to real authors), and generated wiki articles about the history of bears in space as readily as ones about protein complexes and the speed of light. It’s easy to spot fiction when it involves space bears, but harder with a subject users may not know much about."

- Re: (Score:2, Troll)
  
  by account_deleted ( 4530225 ) writes:
  
  Comment removed based on user account deletion
  - Re: (Score:2)
    
    by IWantMoreSpamPlease ( 571972 ) writes:
    
    Depends on how well it's written. If you are claiming you are the new Messiah then using modern language is one thing, but if you claim to have found an undiscovered trove of dead see scrolls, then your command of those languages and writing style from back then better be top notch.
    On the other hand, given the current level of intelligence of the average voter in the US...you could write it out in crayon and sign them "Thomas Jefferson", and no one would think they were any less authentic.
  - Re: (Score:2)
    
    by Potor ( 658520 ) writes:
    
    Even in science the notion of truth is not clear, but rather is begged, usually through some version of correspondence theory.
- Re: (Score:2)
  
  by tsqr ( 808554 ) writes:
  
  A fundamental problem with Galactica is that it is not able to distinguish truth from falsehood
  Sounds like a lot of people I've run into over the years.
Special cases and "fixers" (Score:3)

by drinkypoo ( 153816 ) writes: <drink@hyperlogos.org> on Friday November 18, 2022 @01:43PM (#63061463) Homepage Journal

Because stable diffusion (&c) can't count, you often need to manually fix hands. It also can't do faces reliably, so there are ML face fixers that repair faces. Weird that there's two face fixers and no hand fixers, but probably face fixing was studied a lot more so there were techniques in the can ready to go.
Well, these things can't fact check, so obviously you need to do that manually. So what's needed next to make this almost useful is a ML fact checker, presumably. But using the current approach is always going to produce bad results.
This does highlight a real problem in scientific research though... some of it is bullshit. Even if you train only on peer-reviewed papers you'll still get a lot of crap.

- Re: Special cases and "fixers" (Score:2)
  
  by LindleyF ( 9395567 ) writes:
  
  So you need to combine stable diffusion with an expert system. Give the AI a logical side of the brain and a creative side......hey, wait a minute....
- Re: (Score:3)
  
  by sinij ( 911942 ) writes:
  
  This does highlight a real problem in scientific research though... some of it is bullshit.
  If peer-review process that involves multiple subject matter experts failed to detect the problem, why do you think your proposed solution to use fact-checkers would do any better? Mandatory dataset disclosure and replication studies is the only approach I could see producing better results. Also, it would be a good idea to mitigate imperatives to cheat (i.e., publish or perish, "independent" testing labs dependent on pharma for revenue) or at least take hard look at how we deal with actual conflicts of int
  - Re: (Score:3)
    
    by drinkypoo ( 153816 ) writes:
    
    If peer-review process that involves multiple subject matter experts failed to detect the problem, why do you think your proposed solution to use fact-checkers would do any better?
    I presume there is already an arms race between the people trying to detect bullshit, and the people trying to bypass the bullshit detectors, so I acknowledge that the situation is complicated. Still, that's how things are always going to be as long as we have people with unmet needs (emotional, economic, or other) which probably means effectively forever.
    Mandatory dataset disclosure and replication studies is the only approach I could see producing better results.
    I feel like we generally need that. But one approach would be to build some biases into the system, which of course you have to carefully [re]evaluate to
    - Re: (Score:2)
      
      by sinij ( 911942 ) writes:
      
      If peer-review process that involves multiple subject matter experts failed to detect the problem, why do you think your proposed solution to use fact-checkers would do any better?
      I presume there is already an arms race between the people trying to detect bullshit, and the people trying to bypass the bullshit detectors
      Sadly, not what I have seen. There is a disincentive to criticize papers that cite your work or support your model/pet theory. As such, by being strategic with your choice of suggested peer reviewers you can get an easy ride. Certainly some people do take it seriously and put the required work to critically review, but as with any honor-based system, it is collapsing in modern society.
      - Re: (Score:2)
        
        by alvinrod ( 889928 ) writes:
        
        You shouldn't be able to choose your peer reviewers. The journal that's publishing the article should perform proper due diligence to try and ensure that they aren't asking anyone who may be predisposed to give sloppy research a pass.
        
        Re: (Score:2)
        
        by sinij ( 911942 ) writes:
        
        In most cases you provide a list of 10 candidates and the editor picks few from that list.
X is racist, a translation: (Score:2)

by RightwingNutjob ( 1302813 ) writes:

X is ambiguous. Therefor it can be denounced, and some will believe the denunciation.
The creators of X have failed to grease the correct palms. Therefor X will be denounced so that the creators of X, Y, and Z remember who's in charge and won't repeat their mistake.
Clearly (Score:2)

by smooth wombat ( 796938 ) writes:

this was no Battlestar [imdb.com] and Adama wasn't in command.
Searle was right (Score:2)

by TheMiddleRoad ( 1153113 ) writes:

There's no mind here.
- Re: Searle was right (Score:2)
  
  by Anonymouse Cowtard ( 6211666 ) writes:
  
  Galactica is a mindless bot that cannot tell fact from fiction
  It's American, right? Sounds like sentience to me.
Bots like this aren't bad (Score:4, Interesting)

by MpVpRb ( 1423381 ) writes: on Friday November 18, 2022 @01:52PM (#63061487)

They're TOO good at mimicking flawed human behavior. We don't like our reflection in an accurate mirror

- Re: (Score:2)
  
  by timeOday ( 582209 ) writes:
  
  Neopuritans in the press and the public have made a sport of building up the preconception that these large language models are supposed to be oracles of truth and virtue, and then purposely eliciting the opposite from them as some sort of triumph. I wonder if search engines would have been allowed to develop in this social environment, they certainly enable you to find web pages that are bad and or incorrect.
- Re: (Score:3)
  
  by Retired Chemist ( 5039029 ) writes:
  
  Exactly, real people have issues distinguishing between fact and falsehood. See the debates on climate change and vaccines for further information. Why should be expect a bot to do better. It generally takes someone familiar with a subject to take the distinction and they are not always accurate. When I was actively reading the literature, I would look at the journal (some are more careful than others), the authors (although given the big names that have been implicated in research fraud that is not a g
Galactica vs Tay (Score:2)

by awwshit ( 6214476 ) writes:

When is the Galactica vs Tay debate? I'm definitely going to tune in.
- Re: (Score:2)
  
  by ayesnymous ( 3665205 ) writes:
  
  I would upvote this if I could.
Kudos for testing in public (Score:2)

by mveloso ( 325617 ) writes:

How many "AI" products never get tested and put into production?
A lot.
Kudos to the Meta team for having the courage to fail publicly. They're also gathering data sets that nobody else has in the process.
snake oil and the wrong track (Score:5, Insightful)

by Walt Dismal ( 534799 ) writes: on Friday November 18, 2022 @02:05PM (#63061535)

As an AI researcher, I have said over and over much of this is snake oil and mere pretense. Although neural net technology has made lots of flashy progress in some areas, it is on the wrong track for creating minds. The idea of scanning millions of raw sources then training from that is that it completely lacks ability to do the important cultural views of minds needed to extract real meaning through interpretation relative to a culture or to a specific mind. Google despite its vaunted achievements still doesn't get it.

- Re: (Score:3)
  
  by Pinky's Brain ( 1158667 ) writes:
  
  It's just extremely advanced cargo culting, but it could perhaps still be good enough for games. If you can live with the fact that the first time it gets prompted to say something politically incorrect and you don't pull it off the market you will get cancelled and your ability to receive credit card payments will be blocked.
  OpenAI only survives by not letting their stuff be used for pretty much anything of consequence, quickly shutting down any uses the moment the inevitable politically incorrect use happ
  - Re: (Score:3)
    
    by sinij ( 911942 ) writes:
    
    It is interesting that you bring up cancel culture, which is a type of unintentional behavior in human collective consciousness that has roots in social shunning. Only shunning which is adaptive local behavior doesn't scale up when you massively network it with social media. So humanity on the whole still has to figure out how to deal with this, so it is not surprising that when you train AI on a dataset that predates the phenomena it is unable to deal with it effectively.
    - - Re: (Score:2)
        
        by sinij ( 911942 ) writes:
        
        Now it seems obvious that a system encouraging constant posturing would shift towards a moralistic pissing contest.
        If Musk manages to blow up Twitter, he should get nominated for Nobel Peace prize.
- Re: (Score:2)
  
  by sinij ( 911942 ) writes:
  
  it is on the wrong track for creating minds.
  Thankfully. I would rather not go extinct.
- Re: (Score:2)
  
  by shanen ( 462549 ) writes:
  
  Is this a reference to How to Create a Mind by Kurzweil? I'm just about done with it. I largely dislike him and therefore tend to disagree with him, but he makes a lot of good points in this one.
  - Re: (Score:2)
    
    by Walt Dismal ( 534799 ) writes:
    
    Was not a reference to Ray but now that you bring him up, I will pick some bones. Ray has been quite wrong in some things and his predictions. I do research into minds, human and AI. What I know is that the big players like Google, Meta, OpenAI are somewhat doing the wrong things. Although brains are neural networks, going at mind creation from the neuron side first is the wrong way to go. We have to go from the functional side, identifying the mechanisms of thought, then mechanizing those functions. What
    - Re: (Score:2)
      
      by shanen ( 462549 ) writes:
      
      Hmm... I don't want to poke where I shouldn't, but I think the focus on patents and money is part of the problem and it is driving much of the research into rat holes. Sometimes even into rabbit holes. But it does sound like you might find his book worth your time. I think he is taking some of the same positions that you are.
      However, if your motivations sound too commercial, I think some of his are too survivalist. Almost a primitive focus on individual survival?
      Did just think of a more neutral example in a
- Word soup (Score:4, Interesting)
  
  by Dan East ( 318230 ) writes: on Friday November 18, 2022 @05:45PM (#63062151) Journal
  
  Basically it's word soup, and attempting to ascribe meanings to words and create associations between them. At the end of the day the only knowledge within the system are language-specific grammatical rules and patterns, and some very indirect bonding of word meanings to the real world. Just because the routines can spit out whole sentences that are grammatically correct, and often combine words in ways that make sense pertaining to a specified topic, does not mean it has any "understanding" or knowledge whatsoever of a given subject.
  It's like the differences between eyeballing a circle and estimating its area, versus using a formula to calculate the area, versus knowing the mathematics and proofs required to produce a formula for calculating the area of a circle. Those are different levels of knowledge beyond the superficial linguistics of merely being conversational about a circle and its area.
  
The cancer you created bits you in the ass (Score:3)

by Opportunist ( 166417 ) writes: on Friday November 18, 2022 @02:26PM (#63061587)

"A fundamental problem with Galactica is that it is not able to distinguish truth from falsehood, a basic requirement for a language model designed to generate scientific text. People found that it made up fake papers (sometimes attributing them to real authors), and generated wiki articles about the history of bears in space as readily as ones about protein complexes and the speed of light. It’s easy to spot fiction when it involves space bears, but harder with a subject users may not know much about"
In other words, business as usual in the Metastasis. You created a world of fake news where reality or truth matters little and the most exciting and most impressions-generating story is king.
You made your bed. Now lie in it.

Bias (Score:2)

by Pinky's Brain ( 1158667 ) writes:

It's not perfect so the only reasonable response to expect would be researchers to crawl over each other to score browny points for bashing Meta, why the fuck did Meta even try? They need to stop engaging with most media and pretty much all outside researchers, there is fuck all useful they can get from them.
It's more a politically correct imperative to bash Meta than the Catholic church.
- Re:Bias (Score:5, Interesting)
  
  by marcle ( 1575627 ) writes: on Friday November 18, 2022 @03:10PM (#63061697)
  
  I think you're missing the point. Meta promoted this as a way to assist in writing scientific papers, but instead, it generates falsehoods. It even makes up citations that don't exist. This is not only useless, it's dangerous. The last thing science needs is more misinformation being circulated.
  And Yann Lecun, Meta's chief AI scientist, is defending it. That's what I don't understand. Something that is not only unfit for purpose, but actively harmful to the scientific community. Mr. Lecun is clearly first of all a Meta employee, and only second of all a scientist.
  
  - Re: (Score:2)
    
    by Pinky's Brain ( 1158667 ) writes:
    
    It's just an advanced pattern matching sentence completer, just letting it go it's merry way from start to finish is something you can let it do for fun, not the expected usage model. It's not dangerous, no one of consequence is going to be fooled if you just let it go on without checking the output. Ohh you can submit bullshit to vanity publishing and piss in the sea of piss, what a catastrophe!!!
    "This tool is to paper writing as driving assistance is to driving. It won’t write papers automatically f
So artificial human intelligence... (Score:3)

by hsthompson69 ( 1674722 ) writes: on Friday November 18, 2022 @03:06PM (#63061683)

...is apparently just as biased as natural human intelligence.
Truth is found through the ruthless application of skepticism to one's own deeply cherished ideals. We narrow the area where truth must be through falsification.
If there is anything that proves that Bayesian statistical methods don't lead to truth, this should be it. A massive statistical analysis does not lead to truth, if you don't have a method for dealing with the demarcation problem, like Popper had.

They don't mean real scientists (Score:3)

by Shaitan ( 22585 ) writes: on Friday November 18, 2022 @03:08PM (#63061691)

This is talking about social science aka pseudoscientific garbage beloved by astrology fans.

If the data supports somethingâ¦ (Score:1)

by snapsnap ( 5451726 ) writes:

They shouldnâ(TM)t fight against it. Thatâ(TM)s anti-science.
All those ass-kissers were (Score:1)

by Tablizer ( 95088 ) writes:

...too chicken to tell Zuck it's a boondoggle.
Orbit first, not moonshots (Score:2)

by Tablizer ( 95088 ) writes:

Meta is trying for too many home-runs rather than incremental base hits. They could have instead focused on improving medical diagnostics and mistake detection which only have to suggest leads and possible oddities, not make an almost-finished product.
A similar big-or-nothing project is the Metaverse where they focused on 3D headsets instead of a more approachable "flat" monitor version. [slashdot.org]
Did it actually work? (Score:2)

by mkwan ( 2589113 ) writes:

From the article, it sounds like some activists found some outputs they didn't like. But that always happens.
More important, did it work in the majority of the use-cases? I'm actually curious. Does anyone know?
And why would it fail if trained on peer-reviewed papers?
I am not surprised (Score:1)

by RUs1729 ( 10049396 ) writes:

The stuff having to do with AI (and quantum computing) from for-profit companies these days is mostly hype and vapor. This is just another example thereof.
Human intelligence (Score:2)

by The Evil Atheist ( 2484676 ) writes:

including its tendencies to reproduce prejudice and assert falsehoods as facts.
So it's stooping to human level intelligence.

There may be more comments in this discussion. Without JavaScript enabled, you might want to turn on Classic Discussion System in your preferences instead.

Dr Ian Malcolm's rant (Score:1)

I wonder ... (Score:2)

I am too slow for AI... (Score:4)

Re: I am too slow for AI... (Score:2)

Fakebook pulled a Tay. (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

So, it's... (Score:3)

Comment removed (Score:4, Insightful)

Re:So, it's... [Artifical Stupidity wins again] (Score:3)

Auto-generated content (Score:5, Informative)

Re: (Score:2, Troll)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Special cases and "fixers" (Score:3)

Re: Special cases and "fixers" (Score:2)

Re: (Score:3)

Re: (Score:3)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

X is racist, a translation: (Score:2)

Clearly (Score:2)

Searle was right (Score:2)

Re: Searle was right (Score:2)

Bots like this aren't bad (Score:4, Interesting)

Re: (Score:2)

Re: (Score:3)

Galactica vs Tay (Score:2)

Re: (Score:2)

Kudos for testing in public (Score:2)

snake oil and the wrong track (Score:5, Insightful)

Re: (Score:3)

Re: (Score:3)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Word soup (Score:4, Interesting)

The cancer you created bits you in the ass (Score:3)

Bias (Score:2)

Re:Bias (Score:5, Interesting)

Re: (Score:2)

So artificial human intelligence... (Score:3)

They don't mean real scientists (Score:3)

If the data supports somethingâ¦ (Score:1)

All those ass-kissers were (Score:1)

Orbit first, not moonshots (Score:2)

Did it actually work? (Score:2)

I am not surprised (Score:1)

Human intelligence (Score:2)

Related Links Top of the: day, week, month.

Slashdot Top Deals