DeepSeek's First Reasoning Model R1-Lite-Preview Beats OpenAI o1 Performance (venturebeat.com) 16
An anonymous reader quotes a report from VentureBeat: DeepSeek, an AI offshoot of Chinese quantitative hedge fund High-Flyer Capital Management focused on releasing high performance open source tech, has unveiled the R1-Lite-Preview, its latest reasoning-focused large language model, available for now exclusively through DeepSeek Chat, its web-based AI chatbot. Known for its innovative contributions to the open-source AI ecosystem, DeepSeek's new release aims to bring high-level reasoning capabilities to the public while maintaining its commitment to accessible and transparent AI. And the R1-Lite-Preview, despite only being available through the chat application for now, is already turning heads by offering performance nearing and in some cases exceeding OpenAI's vaunted o1-preview model.
Like that model released in September 2024, DeepSeek-R1-Lite-Preview exhibits "chain-of-thought" reasoning, showing the user the different chains or trains of "thought" it goes down to respond to their queries and inputs, documenting the process by explaining what it is doing and why. While some of the chains/trains of thought may appear nonsensical or even erroneous to humans, DeepSeek-R1-Lite-Preview appears on the whole to be strikingly accurate, even answering "trick" questions that have tripped up other, older, yet powerful AI models such as GPT-4o and Anthropic's Claude family, including "how many letter Rs are in the word Strawberry?" and "which is larger, 9.11 or 9.9?"
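For reference, the two "trick" questions quoted in the summary have unambiguous answers that are easy to check mechanically. The following is a minimal sketch, not anything from DeepSeek or VentureBeat; the helper names `count_letter` and `larger` are made up for illustration, and the second question is assumed to mean a comparison of decimal numbers (not software version strings).

```python
from decimal import Decimal


def count_letter(word: str, letter: str) -> int:
    """Count occurrences of a letter in a word, ignoring case."""
    return word.lower().count(letter.lower())


def larger(a: str, b: str) -> str:
    """Return whichever decimal string represents the larger number."""
    # Decimal avoids binary floating-point surprises when comparing.
    return a if Decimal(a) > Decimal(b) else b


if __name__ == "__main__":
    print(count_letter("Strawberry", "r"))  # 3
    print(larger("9.11", "9.9"))            # 9.9
```

Under that assumption the answers are 3 and 9.9; read as version numbers, 9.11 would come after 9.9, which is one reason the second question trips up models.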
Totally useless (Score:3)
I asked it about Tiananmen Square and it immediately said that it was a forbidden topic of discussion. When I asked it what topics were forbidden, it refused to tell me and was deliberately vague, reasoning that even listing the topics would be tantamount to discussing them or informing people about them, which would be against the interests/desires of the government.
There is no point in using a Chinese AI, as you will never be able to discuss anything that the Communist Party doesn't want you to talk about. For fun I asked it its opinion of Xi, and it also refused to give any answer at all.
Totally useless. I don't care how well you reason if you refuse to talk about anything important.
Re: (Score:2)
Re: (Score:2)
Ope! it's official now (Score:2)
$157B (Score:2)
OpenAI is valued at $157B because it's commonly believed they have the secret sauce.
Re: (Score:2)
"Believed" is the right term here. Essentially, people are hallucinating.
Not reasoning (Score:2)
Re: (Score:2)
Makes sense.
But let's see.
Re:what is reasoning? (Score:2)
If we think about what we do when "reasoning", it is very similar to applying what we have learned over a very long chain of information-gathering and sorting-out moments, which is what we call learning. Learning is nothing other than structuring our neural network. The whole thing looks quite similar to what these AI systems actually do.
Re: (Score:2)
That may be what you and about 80% of the human race do. But, get this, only about 20% of the human race has actual real reasoning ability outside of trivial things. The rest just fake it and believe some crap they want to believe.
If an LLM performs on the level of an average human then that does not mean it can reason.
Re: (Score:2)
"empty verbalism" covers it nicely. That is exactly what LLMs do and are designed to do.
"Reasoning". Suuuuure. (Score:2)
This is just another clever fake. This thing does not have reasoning ability. It will impress those weak of mind though.
Re: (Score:2)
This is just another clever fake. This thing does not have reasoning ability. It will impress those weak of mind though.
And a pocket calculator doesn't have calculation ability, it just produces the correct answer through some clever algorithms encoded in microchips? No one actually cares whether a random Slashdot user wants to call LLM problem-solving ability "reasoning", as long as the system actually produces correct results, at least most of the time.