DeepSeek's First Reasoning Model R1-Lite-Preview Beats OpenAI o1 Performance (venturebeat.com) 24

Posted by BeauHD on Wednesday November 20, 2024 @05:30PM from the would-you-look-at-that dept.

An anonymous reader quotes a report from VentureBeat: DeepSeek, an AI offshoot of Chinese quantitative hedge fund High-Flyer Capital Management focused on releasing high performance open source tech, has unveiled the R1-Lite-Preview, its latest reasoning-focused large language model, available for now exclusively through DeepSeek Chat, its web-based AI chatbot. Known for its innovative contributions to the open-source AI ecosystem, DeepSeek's new release aims to bring high-level reasoning capabilities to the public while maintaining its commitment to accessible and transparent AI. And the R1-Lite-Preview, despite only being available through the chat application for now, is already turning heads by offering performance nearing and in some cases exceeding OpenAI's vaunted o1-preview model.

Like that model released in September 2024, DeepSeek-R1-Lite-Preview exhibits "chain-of-thought" reasoning, showing the user the different chains or trains of "thought" it goes down to respond to their queries and inputs, documenting the process by explaining what it is doing and why. While some of the chains/trains of thought may appear nonsensical or even erroneous to humans, DeepSeek-R1-Lite-Preview appears on the whole to be strikingly accurate, even answering "trick" questions that have tripped up other, older, yet powerful AI models such as GPT-4o and Anthropic's Claude family, including "how many letter Rs are in the word Strawberry?" and "which is larger, 9.11 or 9.9?"

This discussion has been archived. No new comments can be posted.

Totally useless (Score:4, Interesting)

by GeekBoy ( 10877 ) writes: on Wednesday November 20, 2024 @05:57PM (#64960851)

I asked it about Tiananmen Square and it immediately said that it was a forbidden topic of discussion. When I asked it what topics were forbidden, it refused to tell me and purposely decided to be vague, reasoning that even listing the topics would be tantamount to discussing them or informing people about them, which would be against the interests/desires of the government.
There is no point in using a Chinese AI, as you will never be able to discuss anything that the Communist Party doesn't want you to talk about. For fun I asked it its opinion of Xi, and it also refused to give any answer at all.
Totally useless. I don't care how well you reason if you refuse to talk about anything important.

- Re: (Score:3)
  
  by aldousd666 ( 640240 ) writes:
  
  Pliny (github user elder_plinus) has a nice jailbreak prompt you can use to get around that in his jailbreak repo. It's still great for math and science and coding even without breaking it though.
- - Re: (Score:2)
    
    by GeekBoy ( 10877 ) writes:
    
    must be after that uni engineering degree and all those classes in philosophy of logic, I lack any understanding of predicate logic universal (x) or existential (x) quantifiers. But I'm sure you'll be happy to tell me what they mean to demonstrate how intellectually superior you are, pedantic troll.
- Re: (Score:2, Troll)
  
  by martin-boundary ( 547041 ) writes:
  
  You just got a bad hallucination that one time. I asked it about Trump. It answered "wanker". Based on this, I trust it completely.
- Re: (Score:2)
  
  by etash ( 1907284 ) writes:
  
  Indeed, if you ask it in English it will refuse. I asked it in another language and it answered. Here is its own translation of the previous Greek reply:
  
  "
  
  translate the above two answers of yours to english pls
  
  Certainly! Here are the translations of the two previous answers:
  
  First Answer:
  
  "The Tiananmen Square in Beijing's Forbidden City became internationally known in 1989 when a large protest by students and citizens took place, advocating for democracy and freedom in China. The protest ended in

Ope! it's official now (Score:2)

by aldousd666 ( 640240 ) writes:

Once this news hits Slashdot, it's official, and OpenAI will fire back with their release forthwith.

$157B (Score:2)

by bill_mcgonigle ( 4333 ) * writes:

OpenAI is valued at $157B because it's commonly believed they have the secret sauce.
- Re: (Score:3)
  
  by gweihir ( 88907 ) writes:
  
  "believed" is the right term here. Essentially, people are hallucinating.
  - Re: $157B (Score:1)
    
    by elcor ( 4519045 ) writes:
    
    Like the chat bot they worship

Re: (Score:2)

by account_deleted ( 4530225 ) writes:

Comment removed based on user account deletion
- Re: (Score:2)
  
  by angel'o'sphere ( 80593 ) writes:
  
  Makes sense.
  But let's see.
- Re:what is reasoning? (Score:2)
  
  by Framboise ( 521772 ) writes:
  
  If we think about what we do when "reasoning", it is very similar to applying what we have learned over a very long chain of information-gathering and sorting-out moments, which is what we call learning. Learning is nothing other than structuring our neural network. The whole looks quite similar to what these AI systems actually do.
  - Re: (Score:1)
    
    by gweihir ( 88907 ) writes:
    
    That may be what you and about 80% of the human race do. But, get this, only about 20% of the human race has actual real reasoning ability outside of trivial things. The rest just fake it and believe some crap they want to believe.
    If an LLM performs on the level of an average human then that does not mean it can reason.
- Re: (Score:2)
  
  by gweihir ( 88907 ) writes:
  
  "empty verbalism" covers it nicely. That is exactly what LLMs do and are designed to do.

"Reasoning". Suuuuure. (Score:1)

by gweihir ( 88907 ) writes:

This is just another clever fake. This thing does not have reasoning ability. It will impress those weak of mind though.
- Re: (Score:3)
  
  by Bumbul ( 7920730 ) writes:
  
  This is just another clever fake. This thing does not have reasoning ability. It will impress those weak of mind though.
  And a pocket calculator doesn't have calculation ability, it just produces the correct answer through some clever algorithms encoded in microchips? No one actually cares whether a random Slashdot user wants to call LLM problem-solving ability "reasoning" if the system actually produces correct results, at least most of the time.
  - Re: (Score:2)
    
    by gweihir ( 88907 ) writes:
    
    LLMs do not have a problem solving ability. At least get the basics right.
    - Re: (Score:2)
      
      by Bumbul ( 7920730 ) writes:
      
      LLMs do not have a problem solving ability. At least get the basics right.
      Did an LLM steal your wife or what is behind your tireless effort?
      
      You are free to have your own definition for "reasoning" or "problem solving", but do not try to force those definitions onto others; it just messes up the conversation. See here regarding problem solving: https://deepmind.google/discov... [deepmind.google]
      - Re: (Score:2)
        
        by gweihir ( 88907 ) writes:
        
        I am using the standard definitions. You hallucinate something that is not there. This says bad things about you.
        
        - Re: (Score:2)
          
          by Bumbul ( 7920730 ) writes:
          
          You hallucinate something that is not there. This says bad things about you.
          Sorry about the wife part, then.
  - Re: "Reasoning". Suuuuure. (Score:1)
    
    by home-electro.com ( 1284676 ) writes:
    
    You are random.
