Meta's Latest Large Language Model Survived Only Three Days Online (technologyreview.com) 57
On November 15, Meta unveiled a new large language model called Galactica, designed to assist scientists. But instead of landing with the big bang Meta hoped for, Galactica has died with a whimper after three days of intense criticism. Yesterday the company took down the public demo that it had encouraged everyone to try out. From a report: Meta's misstep -- and its hubris -- show once again that Big Tech has a blind spot about the severe limitations of large language models. There is a large body of research that highlights the flaws of this technology, including its tendencies to reproduce prejudice and assert falsehoods as facts.
Galactica is a large language model for science, trained on 48 million examples of scientific articles, websites, textbooks, lecture notes, and encyclopedias. Meta promoted its model as a shortcut for researchers and students. In the company's words, Galactica "can summarize academic papers, solve math problems, generate Wiki articles, write scientific code, annotate molecules and proteins, and more." But the shiny veneer wore through fast. Like all language models, Galactica is a mindless bot that cannot tell fact from fiction. Within hours, scientists were sharing its biased and incorrect results on social media.
Dr Ian Malcolm's rant (Score:1)
I'm reminded of this performance almost daily amid the ecumenical promises of the so-called 'tech leaders'.
https://www.youtube.com/watch?... [youtube.com]
I wonder ... (Score:2)
... did it have Pigs In Space! [youtube.com]??
I am too slow for AI... (Score:4)
By now these projects die faster than I can read about them...
Re: I am too slow for AI... (Score:2)
Fakebook pulled a Tay. (Score:2)
Fakebook pulled a Tay. They were eager to follow MS.
Re: (Score:2)
Well, yes. If they now could pull a Twitter in addition, that would clearly be a major service to the human race. Elon seems tired these days, though; it looks like destroying Twitter is really taking it out of him.
Re: (Score:2)
Yes, he destroys Twitter... to rebuild it at the same time.
Not so for Facebook.
Re: (Score:2)
Yes, he destroys Twitter... to rebuild it at the same time.
Hahaha, no. He is cluelessly bumbling about and nothing will be left when he is done. What we get to witness is how abysmally bad a manager he actually is.
Re: (Score:2)
Yep, exactly like Tesla, which went bankrupt 25 times in 10 years.
So, it's... (Score:3)
> including its tendencies to reproduce prejudice and assert falsehoods as facts.
So it's a politician?
Re:So, it's... (Score:4, Insightful)
Re:So, it's... [Artificial Stupidity wins again] (Score:3)
Mod parent funny.
But here's my reaction from another website:
[Website] needs combined reactions. Funny and sad in this case.
In solution terms, I don't think we need such gigantic corporate cancers creating giant problems where no problem was called for. This was a solution in search of a problem, but when people looked more closely, it was quickly obvious that the proposed solution WAS the biggest problem. (And of course this didn't solve Facebook's fake problem. "There ain't no profit big enough.")
Yes, you need a big project if you're trying to solve a big problem. Sending a rocket to the moon, for example. Social media should NOT be a big project, though there should be a PUBLIC standard for a communications protocol so the small projects can connect to social networks that might be large...
*sigh* I feel like including yet another description of the NOT-at-all-like-Facebook website that I am still searching for. Will I know it by its "Why?" button?
P.S. And yes, I'm still wondering who nuked my Facebook account. And why?
Auto-generated content (Score:5, Informative)
Re: (Score:2, Troll)
Re: (Score:2)
Depends on how well it's written. If you are claiming you are the new Messiah, then using modern language is one thing, but if you claim to have found an undiscovered trove of Dead Sea Scrolls, then your command of those languages and the writing style of the period had better be top notch.
On the other hand, given the current level of intelligence of the average voter in the US... you could write them out in crayon, sign them "Thomas Jefferson", and no one would think they were any less authentic.
Re: (Score:2)
Re: (Score:2)
A fundamental problem with Galactica is that it is not able to distinguish truth from falsehood
Sounds like a lot of people I've run into over the years.
Special cases and "fixers" (Score:3)
Because Stable Diffusion (&c) can't count, you often need to fix hands manually. It also can't do faces reliably, so there are ML face fixers that repair faces. Weird that there are two face fixers and no hand fixers, but face fixing was probably studied a lot more, so there were techniques in the can ready to go.
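The generate-then-repair loop is simple enough to sketch. Here's a minimal stubbed outline in Python; the function names are made up for illustration, standing in for a diffusion pipeline and a face restorer like GFPGAN or CodeFormer, not any real API:

def generate_image(prompt):
    # stub: a real version would run a diffusion pipeline here
    return {"prompt": prompt, "pixels": None}

def detect_faces(image):
    # stub: a real version would run a face detector and return boxes
    return [(10, 10, 64, 64)]

def restore_face(image, box):
    # stub: a real version would crop the face, restore it, paste it back
    return image

img = generate_image("portrait of a scientist, photorealistic")
for box in detect_faces(img):
    img = restore_face(img, box)
# no equivalent pass exists for hands, hence the manual fixes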
Well, these things can't fact-check, so obviously you need to do that manually. So what's needed next to make this almost useful is an ML fact checker, presumably. But the current approach is always going to produce bad results.
This does highlight a real problem in scientific research though... some of it is bullshit. Even if you train only on peer-reviewed papers you'll still get a lot of crap.
Re: Special cases and "fixers" (Score:2)
Re: (Score:3)
This does highlight a real problem in scientific research though... some of it is bullshit.
If a peer-review process that involves multiple subject matter experts failed to detect the problem, why do you think your proposed solution of using fact-checkers would do any better? Mandatory dataset disclosure and replication studies are the only approach I could see producing better results. Also, it would be a good idea to mitigate the imperatives to cheat (i.e., publish or perish, "independent" testing labs dependent on pharma for revenue), or at least take a hard look at how we deal with actual conflicts of interest.
Re: (Score:3)
If a peer-review process that involves multiple subject matter experts failed to detect the problem, why do you think your proposed solution of using fact-checkers would do any better?
I presume there is already an arms race between the people trying to detect bullshit, and the people trying to bypass the bullshit detectors, so I acknowledge that the situation is complicated. Still, that's how things are always going to be as long as we have people with unmet needs (emotional, economic, or other) which probably means effectively forever.
Mandatory dataset disclosure and replication studies are the only approach I could see producing better results.
I feel like we generally need that. But one approach would be to build some biases into the system, which of course you have to carefully [re]evaluate to
Re: (Score:2)
If a peer-review process that involves multiple subject matter experts failed to detect the problem, why do you think your proposed solution of using fact-checkers would do any better?
I presume there is already an arms race between the people trying to detect bullshit, and the people trying to bypass the bullshit detectors
Sadly, not what I have seen. There is a disincentive to criticize papers that cite your work or support your model/pet theory. As such, by being strategic with your choice of suggested peer reviewers you can get an easy ride. Certainly some people do take it seriously and put the required work to critically review, but as with any honor-based system, it is collapsing in modern society.
Re: (Score:2)
Re: (Score:2)
X is racist, a translation: (Score:2)
X is ambiguous. Therefore it can be denounced, and some will believe the denunciation.
The creators of X have failed to grease the correct palms. Therefore X will be denounced so that the creators of X, Y, and Z remember who's in charge and won't repeat their mistake.
Clearly (Score:2)
Searle was right (Score:2)
There's no mind here.
Re: Searle was right (Score:2)
Galactica is a mindless bot that cannot tell fact from fiction
It's American, right? Sounds like sentience to me.
Bots like this aren't bad (Score:4, Interesting)
They're TOO good at mimicking flawed human behavior. We don't like our reflection in an accurate mirror.
Re: (Score:2)
Re: (Score:3)
Galactica vs Tay (Score:2)
When is the Galactica vs Tay debate? I'm definitely going to tune in.
Re: (Score:2)
Kudos for testing in public (Score:2)
How many "AI" products get put into production without ever being tested in public?
A lot.
Kudos to the Meta team for having the courage to fail publicly. They're also gathering data sets that nobody else has in the process.
snake oil and the wrong track (Score:5, Insightful)
Re: (Score:3)
It's just extremely advanced cargo culting, but it could perhaps still be good enough for games, if you can live with the fact that the first time it gets prompted into saying something politically incorrect and you don't pull it off the market, you will get cancelled and your ability to receive credit card payments will be blocked.
OpenAI only survives by not letting their stuff be used for pretty much anything of consequence, quickly shutting down any uses the moment the inevitable politically incorrect use happens.
Re: (Score:3)
Re: (Score:2)
Now it seems obvious that a system encouraging constant posturing would shift towards a moralistic pissing contest.
If Musk manages to blow up Twitter, he should be nominated for the Nobel Peace Prize.
Re: (Score:2)
it is on the wrong track for creating minds.
Thankfully. I would rather not go extinct.
Re: (Score:2)
Is this a reference to How to Create a Mind by Kurzweil? I'm just about done with it. I largely dislike him and therefore tend to disagree with him, but he makes a lot of good points in this one.
Re: (Score:2)
Re: (Score:2)
Hmm... I don't want to poke where I shouldn't, but I think the focus on patents and money is part of the problem and it is driving much of the research into rat holes. Sometimes even into rabbit holes. But it does sound like you might find his book worth your time. I think he is taking some of the same positions that you are.
However, if your motivations sound too commercial, I think some of his are too survivalist. Almost a primitive focus on individual survival?
Did just think of a more neutral example in a
Word soup (Score:4, Interesting)
Basically it's word soup: the system ascribes meanings to words and creates associations between them. At the end of the day, the only knowledge within it is language-specific grammatical rules and patterns, plus some very indirect bonding of word meanings to the real world. Just because the routines can spit out whole sentences that are grammatically correct, and often combine words in ways that make sense for a specified topic, does not mean the system has any "understanding" or knowledge whatsoever of the subject.
It's like the differences between eyeballing a circle and estimating its area, versus using a formula to calculate the area, versus knowing the mathematics and proofs required to produce a formula for calculating the area of a circle. Those are different levels of knowledge beyond the superficial linguistics of merely being conversational about a circle and its area.
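To make "word soup" concrete, here is a toy bigram sampler in Python. Real models are incomparably larger and use learned representations rather than a lookup table, but the epistemic situation is the same: the program only knows what tends to follow what.

import random
from collections import defaultdict

# "train": record which word follows which in a tiny corpus
corpus = ("the model generates text . the model predicts the next word . "
          "the next word follows the previous word .").split()
follows = defaultdict(list)
for a, b in zip(corpus, corpus[1:]):
    follows[a].append(b)

# "generate": walk the table, picking a plausible next word each time
word, out = "the", ["the"]
for _ in range(12):
    word = random.choice(follows[word])
    out.append(word)
print(" ".join(out))  # locally fluent, globally meaningless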
The cancer you created bites you in the ass (Score:3)
"A fundamental problem with Galactica is that it is not able to distinguish truth from falsehood, a basic requirement for a language model designed to generate scientific text. People found that it made up fake papers (sometimes attributing them to real authors), and generated wiki articles about the history of bears in space as readily as ones about protein complexes and the speed of light. It’s easy to spot fiction when it involves space bears, but harder with a subject users may not know much about"
In other words, business as usual in the Metastasis. You created a world of fake news where reality or truth matters little and the most exciting and most impressions-generating story is king.
You made your bed. Now lie in it.
Bias (Score:2)
It's not perfect, so the only reasonable response to expect would be researchers crawling over each other to score brownie points for bashing Meta. Why the fuck did Meta even try? They need to stop engaging with most media and pretty much all outside researchers; there is fuck all useful they can get from them.
These days it's more of a politically correct imperative to bash Meta than the Catholic Church.
Re:Bias (Score:5, Interesting)
I think you're missing the point. Meta promoted this as a way to assist in writing scientific papers, but instead, it generates falsehoods. It even makes up citations that don't exist. This is not only useless, it's dangerous. The last thing science needs is more misinformation being circulated.
And Yann LeCun, Meta's chief AI scientist, is defending it. That's what I don't understand: he's defending something that is not only unfit for purpose, but actively harmful to the scientific community. Mr. LeCun is clearly first of all a Meta employee, and only second of all a scientist.
Re: (Score:2)
It's just an advanced pattern-matching sentence completer; letting it run its merry way from start to finish is something you do for fun, not the expected usage model. It's not dangerous: no one of consequence is going to be fooled if you just let it go on without checking the output. Ohh, you can submit bullshit to vanity publishing and piss in the sea of piss, what a catastrophe!!!
"This tool is to paper writing as driving assistance is to driving. It won’t write papers automatically f
So artificial human intelligence... (Score:3)
...is apparently just as biased as natural human intelligence.
Truth is found through the ruthless application of skepticism to one's own deeply cherished ideals. We narrow the area where truth must be through falsification.
If there is anything that proves that Bayesian statistical methods don't lead to truth, this should be it. A massive statistical analysis does not lead to truth if you don't have a method for dealing with the demarcation problem, as Popper did.
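To spell the point out: Bayes' rule,

P(H \mid D) = \frac{P(D \mid H)\,P(H)}{P(D)}

only redistributes belief across the hypotheses H you put in to begin with. If the true explanation is not in that set, no quantity of data D will conjure it, which is exactly the demarcation work Popper's falsification criterion was doing.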
They don't mean real scientists (Score:3)
This is talking about social science aka pseudoscientific garbage beloved by astrology fans.
If the data supports something… (Score:1)
They shouldn't fight against it. That's anti-science.
All those ass-kissers were (Score:1)
...too chicken to tell Zuck it's a boondoggle.
Orbit first, not moonshots (Score:2)
Meta is trying for too many home runs rather than incremental base hits. They could have instead focused on improving medical diagnostics and mistake detection, which only have to suggest leads and possible oddities, not deliver an almost-finished product.
A similar big-or-nothing project is the Metaverse where they focused on 3D headsets instead of a more approachable "flat" monitor version. [slashdot.org]
Did it actually work? (Score:2)
From the article, it sounds like some activists found some outputs they didn't like. But that always happens.
More important, did it work in the majority of the use-cases? I'm actually curious. Does anyone know?
And why would it fail if trained on peer-reviewed papers?
I am not surprised (Score:1)
Human intelligence (Score:2)
including its tendencies to reproduce prejudice and assert falsehoods as facts.
So it's stooping to human-level intelligence.