China's AI Matches Anthropic in Cybersecurity, Causing Worry Over US Restrictions (msn.com) 57
Chinese AI systems "have matched the performance of Anthropic's powerful model Mythos in some cybersecurity scenarios," reports the Wall Street Journal.
They call it "a development poised to reset the global tech race and pressure the White House in its overhaul of U.S. AI policy." Security researchers said that a new AI model, released this month by China's Zhipu AI, also known as Z.ai, can match the latest U.S. models when it comes to finding security bugs, although it still lags behind Anthropic's and OpenAI's products in other tasks. Overall, the capability gap between top U.S. models and those built by Chinese companies has narrowed significantly, and use of Chinese AI systems has surged as businesses seek to rein in runaway costs. A host of companies, including Microsoft, are weighing how they can offer Chinese models on their platforms, a development that is set to alter the balance of power among tech companies...
Unlike models from Anthropic or OpenAI, Zhipu's GLM-5.2 is open-weight. That means it can be downloaded and run on hardware operated by anybody and can be modified and used without supervision. Open-weight models are ideal for users who want unfettered access to systems they control, but they are also ideal for hackers, who can run them in the shadows. GLM-5.2 has ranked as one of the 10 most-used AI models, according to data from OpenRouter, a company that provides access to more than 400 AI models. In some benchmarking tests, according to the cybersecurity company Semgrep, GLM-5.2 bested Anthropic's Claude Opus 4.8 model, which was released in May. When given further instructions, Opus 4.8 and GLM-5.2 can match Mythos in bug-finding ability, according to researchers...
"Banning Fable while selling chips China needs to develop its own version is a gift to China," said Saif Khan, a distinguished technology fellow at the Institute for Progress think tank who worked on export restrictions in the Biden administration. The U.S. needs to maximize the use of Mythos and comparable models to harden its cyber defenses while it can, he added. Among the Mythos 5 and Fable 5 users that had lost access before Friday's decision to restore Mythos 5 access for some trusted entities: the National Security Agency, which had been testing the tools and found them impressive in trials, according to people familiar with the matter... "It is incentivizing companies across the globe to use cheaper but very capable Chinese open-weight models, while at the same time undermining the U.S. AI industry," said Niels Provos, a researcher who led security teams at Google and Stripe. "I don't understand it."
Thanks to long-time Slashdot reader schwit1 for sharing the article.
Open Source Wins Again (Score:5, Interesting)
If you're closed source and expect to make back trillions in AI infrastructure investment by charging people usage fees, they're going to use your competitor's free and unencumbered product instead.
Re:Open Source Wins Again (Score:4, Interesting)
To be fair, you need at least 256GB of RAM just to run the 2-bit version of this model. Most people aren't going to be able to do that at home.
But yeah, the Chinese government is willing to throw lots of money at building AI models and giving them away, so Western companies are screwed.
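For a rough sense of where a RAM figure like that comes from, weight storage scales as parameter count times bits per weight. The sketch below is back-of-the-envelope only: the 700-billion parameter count is a hypothetical placeholder (the thread does not state GLM-5.2's size), and the estimate ignores KV cache, activations, and quantization overhead.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights alone, in decimal gigabytes."""
    bytes_total = num_params * bits_per_weight / 8  # 8 bits per byte
    return bytes_total / 1e9


# HYPOTHETICAL parameter count, chosen only to illustrate the arithmetic.
ASSUMED_PARAMS = 700e9

for bits in (2, 4, 8, 16):
    print(f"{bits:>2}-bit weights: {weight_memory_gb(ASSUMED_PARAMS, bits):.0f} GB")
```

Under that assumed size, 2-bit weights alone would take about 175 GB, which is why a machine in the 256GB class is the practical floor once runtime overhead is added.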
Re:Open Source Wins Again (Score:5, Insightful)
To be fair, you need at least 256GB of RAM just to run the 2-bit version of this model. Most people aren't going to be able to do that at home.
But yeah, the Chinese government is willing to throw lots of money at building AI models and giving them away, so Western companies are screwed.
Another way of looking at this: if Western companies are screwed, hardware prices return to planet Earth, where more people are able to run this stuff at home. Three years ago the cost of 512GB DDR5 was less than the cost of a single 4090 GPU today.
Re: (Score:2)
Chinese companies (not the government) are doing what they always do. Refine the technology, and make it affordable. Get it running on lower end hardware, leverage the massive amount of cheap and clean renewable energy they have, push for volume over premium pricing.
They keep doing it in different industries and most Western companies seem to only be capable of whining about it, rather than competing.
Re: (Score:2)
True, you aren't going to be running this at home. But then no one runs the SOTA models at home. You can run GLM-5.2 on z.ai's hardware, using a subscription plan [z.ai] similar to the plans offered by Anthropic and OpenAI. The most notable difference is: it's 1/2 the price.
Re: (Score:2)
The problem with providers like z.ai is running into compliance problems and corporate paranoia. As an Australian company, the boss is paranoid enough about even letting the Americans access our cloud data let alone the Chinese who have always been
Re: (Score:2)
If you are that paranoid, then your only solution is running local models. They do run OK'ish on a modern MacBook. The twist is: all the good ones are Chinese. And the MacBook is made in China too.
I'm Australian too as it happens, and my boss has just discovered the joys of vibe coding. The ra
Re:Open Source Wins Again (Score:5, Interesting)
Indeed. Common good vs. some assholes getting even richer. That said, there is an open Swiss model (Apertus) as well.
Re: Open Source Wins Again (Score:1)
If you associate anything about the CCP with for-common-good, you really need to catch up on a lot of history.
Re: (Score:2)
What an insightless comment. Well done, you disqualified yourself.
firmware test battery (Score:3)
So, when are we going to require a battery of AI-driven security tests against connected device firmware before it can be certified and sold in the USA?
Linux removed use of strncpy() recently - https://chessman7.substack.com... [substack.com]
Lol (and yay open source) (Score:5, Interesting)
Re:Lol (and yay open source) (Score:5, Insightful)
Indeed. Business people seeing a chance to get rich immediately lose all cognitive power to see reality.
Good (Score:4, Interesting)
Can the AI release the Epstein files already?
Re: (Score:3)
Re: (Score:1)
Here you go: https://huggingface.co/mraderm... [huggingface.co]
It's times like these (Score:5, Funny)
With such sensitive and new geopolitical, technological and socioeconomic issues to deal with that we elected such a responsible group of thoughtful individuals to guide us through these situations. I am sure they are giving the proper consideration and delicate balance this requires.
Re:It's times like these (Score:4, Interesting)
Indeed. We desperately need a king but, thanks to Our Democracy, all we get are clowns.
Re: (Score:1)
Uh huh and who is in your eyes the king we all need right now
Re: It's times like these (Score:3)
Re: (Score:2)
No idea, but the crown is lying in the gutter just waiting for someone to pick it up.
Re:It's times like these (Score:5, Insightful)
More evidence that the biggest threat to America and its values is not immigrants but white conservatives.
Re: (Score:2)
We desperately need a king
We already have one piece of shit with syphilis calling the shots, so we already have a king, and it's not working the fuck out is it?
Can I... (Score:2)
Can I IPO with an open source AI and drink OpenAI's milkshake?
Business Plan? Chinese are trying? (Score:2)
I would much rather hand my AI subscription money to a company that released their model weights.
Better yet it would be nice if they shared their annotated data sets and training recipes as well.
However I am not sure how an organization that "gave everything away" like that would survive.
Right now it would have to be based on having massive compute and renting it out for profit that is put back into model development.
The Chinese companies either have a large side gig, like Alibaba, Bytedance, and Xiaomi sim
It doesn't (Score:2)
I like free and local LLMs, but they do not match the frontier models of Anthropic, OpenAI or Google yet.
The take here is either uninformed, sensationalist or fear mongering (possibly to ban Chinese LLM). GLM is good, but nowhere near Mythos/Fable or even latest Opus.
Re: (Score:2)
FYI: It's somewhere between Opus 4.8 and Fable 5 at 1/10 the price [gptbased.com]
Re: (Score:2)
If you want to try out GLM 5.2 for free right now, OpenCode is hosting a version of it that they call "Big Pickle." It's free to use but be warned that they are using prompts for model training. It's as good as Anthropic's Sonnet, but definitely not close to Opus.
Re: (Score:2)
It's 28 Elo points stronger than Opus 4.8 according to the LMArena web dev leaderboard. Here's another link [arena.ai]
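To put a 28-point gap in perspective, the standard Elo formula converts a rating difference into an expected head-to-head win rate. This is a minimal sketch that assumes the leaderboard's scores behave like classic Elo ratings on the usual 400-point scale; the function name is illustrative, and the leaderboard's own scoring method may differ.

```python
def elo_expected_score(rating_diff: float) -> float:
    """Expected score of the higher-rated side under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** (-rating_diff / 400.0))


# A 28-point lead, the gap quoted in the comment above.
print(f"{elo_expected_score(28):.3f}")  # prints 0.540
```

Under that assumption, a 28-point lead corresponds to being preferred in roughly 54% of pairwise comparisons, which is a modest edge and not a decisive one.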
Re: (Score:2)
My practical experience using it for coding tasks says it's definitely not.
Re: (Score:3)
I like free and local LLMs, but they do not match the frontier models of Anthropic, OpenAI or Google yet.
The take here is either uninformed, sensationalist or fear mongering (possibly to ban Chinese LLM). GLM is good, but nowhere near Mythos/Fable or even latest Opus.
What is the objective basis of your statement? Do you reject benchmark results? If so what have you used in their place to make such a determination?
Re: (Score:2)
Personally not that much, but I've followed the LLM community, and since the release people seem to agree that it is NOT Opus level, even though it is a good model.
Benchmarks are a complicated topic, that's the reason why there are so many of them. In particular some models are "benchmaxxed" and some people are doing idiotic stuff like claiming their 1.5B model is Opus level, because they trained on a niche topic and claim they beat Opus on that topic (probably still don't, but some amateurs just have a large e
Re: (Score:2)
Addendum: I think from time to time you get GLM for free on OpenRouter (I think 20 requests per day, or 100 once you have deposited some money). So you can test it without buying ten more GPUs ;)
Maybe a lesson from history! (Score:4, Insightful)
So what does that mean? (Score:3)
Is China really good at this or is Anthropic actually pretty pathetic?
My guess would be the latter, I may be wrong.
Re:So what does that mean? (Score:5, Insightful)
In truth, neither. China isn't that good yet but is moving quickly. Anthropic is still at the top of the game for now but won't be there forever, and they insist on keeping the models locked up and proprietary. Sooner or later the more open models will win. And hopefully we'll finally get accessible hardware to run them locally.
Re:So what does that mean? (Score:5, Insightful)
Re: (Score:1)
wow there's some deeply flawed premises here that have nothing to do with nationality
You think that the kids who won those olympiads are now all AI researchers? You think a pissing contest for tiger parents produces competent researchers in the first place?
Re: (Score:2)
If you actually care about the competition, there are sites which run objective tests, like LiveBench [livebench.ai] or OpenRouter [openrouter.ai]. OpenRouter's is probably more interesting because it has data on actual model use, so it can go beyond just model performance. The data on what people are using as a function of price and performance per task is rather illuminating, with the caveat that most usage goes directly through model providers' plans, so the data is just from people who switch models frequently enough to not just po
Re: (Score:2)
I do not care about benchmark results. These always get gamed and often are pure fantasies as a result. Benchmarks can only work when you need to verify conformance against some standards. They are worse than useless to compare performance.
Re: (Score:2)
If you reject objective tests and the subjective opinion of experts, how do you evaluate? Or is the whole point just to throw out any evidence that doesn't align with your opinion?
Just another BS claim (Score:2)
Re: (Score:2)
People using models undoubtedly still remember DeepSeek, since Flash is still #1 in token use on OpenRouter [openrouter.ai], with about double the use of Anthropic's most popular model, and Pro is about tied. And that's only one of the cheap "good enough" Chinese models.
If Z.AI's new model is as good as it seems while being 1/5 the price, discounting the threat of open-weight models seems like a bad idea.
No need to worry! (Score:2)
No need to worry: there is something called phishing, and it was giving anyone who wanted it all the access they could ever ask for long before clankers started finding actual security holes!
The AI bubble is IP light, always has been... (Score:1)
Joke of a model (Score:2)
I tried to like GLM-5.2 but it is just... not good. It is at least 6 months behind frontier models, probably more.
For example just yesterday I gave it a prompt to implement a mathematical computation. The code it wrote took over 45 minutes to run. I asked Opus 4.8 the same exact prompt and its code ran in about 15 seconds on the same machine.