Meta Releases Llama 3 AI Models, Claiming Top Performance

Posted by msmash on Thursday April 18, 2024 @12:40PM from the intensifying-race dept.

Meta debuted a new version of its powerful Llama AI model, its latest effort to keep pace with similar technology from companies like OpenAI, X and Google. The company describes Llama 3 8B and Llama 3 70B, containing 8 billion and 70 billion parameters respectively, as a "major leap" in performance compared to their predecessors.

Meta claims that the Llama 3 models, trained on custom-built 24,000 GPU clusters, are among the best-performing generative AI models available for their respective parameter counts. The company supports this claim by citing the models' scores on popular AI benchmarks such as MMLU, ARC, and DROP, which attempt to measure knowledge, skill acquisition, and reasoning abilities. Despite the ongoing debate about the usefulness and validity of these benchmarks, they remain one of the few standardized methods for evaluating AI models. Llama 3 8B outperforms other open-source models like Mistral's Mistral 7B and Google's Gemma 7B on at least nine benchmarks, showcasing its potential in various domains such as biology, physics, chemistry, mathematics, and commonsense reasoning.

TechCrunch adds: Now, Mistral 7B and Gemma 7B aren't exactly on the bleeding edge (Mistral 7B was released last September), and in a few of the benchmarks Meta cites, Llama 3 8B scores only a few percentage points higher than either. But Meta also makes the claim that the larger-parameter-count Llama 3 model, Llama 3 70B, is competitive with flagship generative AI models including Gemini 1.5 Pro, the latest in Google's Gemini series.

Llama? (Score:5, Funny)

by TWX ( 665546 ) writes: on Thursday April 18, 2024 @12:52PM (#64405348)

Here's a llama there's a llama,
and another little llama,
Fuzzy llama, Funny llama,
llama, llama, Duck.

Don't sit on this bench(mark.) (Score:4, Insightful)

by fyngyrz ( 762201 ) writes: on Thursday April 18, 2024 @01:01PM (#64405372) Homepage Journal

I'll be impressed when one of these ML engines is sophisticated enough to be able to say "I don't know" instead of just making up nonsense by stacking probabilistic sequences; also it needs to be able to tell fake news from real news. Although there's an entire swath of humans who can't do that, so it'll be a while I guess. That whole "reality has a liberal bias" truism ought to be a prime training area.
While I certainly understand that the Internet and its various social media cesspools are the most readily available training ground(s), it sure leans into the "artificial stupid" thing.

- Re: (Score:2)
  
  by gweihir ( 88907 ) writes:
  
  That would be a major achievement. But LLMs cannot do it. Hallucination is baked-in.
  - Re:Don't sit on this bench(mark.) (Score:4, Interesting)
    
    by fyngyrz ( 762201 ) writes: on Thursday April 18, 2024 @01:41PM (#64405482) Homepage Journal
    
    > LLMs cannot do it. Hallucination is baked-in.
    LLMs alone definitely can't do it. LLMs, however, seem (to me, speaking for myself as an ML developer) to be a very likely component in an actual AI. Which, to be clear, is why I use "ML" instead of "AI", as we don't have AI yet. It's going to take other brainlike mechanisms to supervise the hugely flawed knowledge assembly that LLMs generate before we even have a chance to get there. Again, IMO.
    I'd love for someone to prove me wrong. No sign of that, though. :)
    
    - Re: (Score:3)
      
      by thegarbz ( 1787294 ) writes:
      
      > Which, to be clear, is why I use "ML" instead of "AI", as we don't have AI yet.
      All you're doing is confusing the rest of the world, which uses the term AI as a synonym for ML and has coined a new term, AGI, for what you're talking about. Language evolves. You can either evolve with it and be understood, or be labelled as a pointless pedant refusing to use a term in common use because ... I don't know, stubbornness, I guess. No one has ever conveyed meaning through stubbornness.
      - Re: (Score:2)
        
        by gweihir ( 88907 ) writes:
        
        In this case, it is more language getting corrupted. There are already the first attempts to break the term "AGI" as well and make it mean no more than what used to be called "automation". Some assholes wanting to sell things and many idiots that do not understand that words have meaning and need to be reasonably stable in their meaning to remain useful.
        
        Re: (Score:2)
        
        by thegarbz ( 1787294 ) writes:
        
        > In this case, it is more language getting corrupted.
        Every change looks like corruption in the eyes of people who don't like it.
        
        Oh, well, change :) (Score:2)
        
        by fyngyrz ( 762201 ) writes:
        
        > Every change looks like corruption in the eyes of people who don't like it.
        And corruption looks like evolution to some people.
        Personally, I'm in favor of words meaning as much of the same thing over time as possible. It enhances communication and understanding. If you need a new meaning, you either need a new word or you need to explain yourself at a bit more length. Lest you "decimate" (cough) the listener's/reader's understanding... you get me?
    - Re: (Score:2)
      
      by gweihir ( 88907 ) writes:
      
      > I'd love for someone to prove me wrong. No sign of that, though. :)
      No sign of anybody proving you right either. The claim that AGI is even possible is an extraordinary one at this time, with not even simple evidence available. No, the physicist belief system does not count as "evidence". What would count is a credible theory, but all we have is automated deduction and that gets bogged down in complexity so fast it is not useful for any general applications as it just takes too long to deduce anything. In theory it could though, but the physical limitations of this universe
    - Re: (Score:2)
      
      by Xarius ( 691264 ) writes:
      
      This is the direction things are going. I work on business applications for this sort of thing, and RAG (Retrieval-Augmented Generation) systems are along these lines.
      When a question is asked, the agent first searches a body of knowledge for relevant snippets/chunks. These all have metadata, e.g. factual confidence score, original data source, how old it is etc. Then this is filtered/ranked/etc.--once you have a reasonable looking set of info, then you pass that to an LLM to synthesise a concise response hea
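
The retrieve, filter, rank, then synthesise flow described in this comment can be sketched roughly as follows. This is only an illustration under loud assumptions: the `Chunk` fields, the word-overlap scoring (real systems typically use embedding search), the thresholds, the sample corpus, and the `llm` callable are all hypothetical stand-ins, not any particular product's API.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Chunk:
    text: str
    source: str        # original data source (metadata)
    confidence: float  # factual confidence score in [0, 1] (metadata)
    age_days: int      # how old the snippet is (metadata)


def overlap(question: str, chunk: Chunk) -> int:
    """Toy relevance score: number of lowercase words shared with the question."""
    return len(set(question.lower().split()) & set(chunk.text.lower().split()))


def retrieve(question: str, corpus: List[Chunk], min_conf: float = 0.5,
             max_age_days: int = 365, top_k: int = 2) -> List[Chunk]:
    # 1. Search the body of knowledge for relevant snippets.
    hits = [c for c in corpus if overlap(question, c) > 0]
    # 2. Filter on metadata (hypothetical thresholds).
    kept = [c for c in hits
            if c.confidence >= min_conf and c.age_days <= max_age_days]
    # 3. Rank by relevance, then by confidence.
    kept.sort(key=lambda c: (overlap(question, c), c.confidence), reverse=True)
    return kept[:top_k]


def answer(question: str, corpus: List[Chunk], llm: Callable[[str], str]) -> str:
    chunks = retrieve(question, corpus)
    if not chunks:
        # Nothing trustworthy was found, so do not ask the model to guess.
        return "I don't know."
    context = "\n".join(f"[{c.source}] {c.text}" for c in chunks)
    prompt = (f"Answer using only this context:\n{context}\n\n"
              f"Question: {question}")
    # 4. Pass the vetted snippets to the LLM to synthesise a response.
    return llm(prompt)


# Illustrative data only; `echo_llm` just returns the prompt it is given.
corpus = [
    Chunk("Llama 3 comes in 8B and 70B parameter sizes", "meta-blog", 0.9, 1),
    Chunk("llama wool is warm", "forum", 0.2, 30),
    Chunk("Mistral 7B was released last September", "techcrunch", 0.8, 200),
]
echo_llm = lambda prompt: prompt
known = answer("what sizes does llama 3 come in", corpus, echo_llm)
unknown = answer("who won the cup", corpus, echo_llm)
```

In this sketch the "I don't know" path lives in the retrieval layer rather than in the model: if no snippet survives the filter, the LLM is never called.
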
- Re: (Score:2)
  
  by schneidafunk ( 795759 ) writes:
  
  I'm waiting for AI to beat captcha / recaptcha. I expect it will be pretty soon and I'm not sure what the new way to 'prove you are human' is going to be.
- Re: (Score:2)
  
  by WaffleMonster ( 969671 ) writes:
  
  > I'll be impressed when one of these ML engines is sophisticated enough to be able to say "I don't know" instead of just making up nonsense by stacking probabilistic sequences
  For factual questions, it is relatively easy to discern through iterative prompting most of the time, but this comes at a higher overhead cost.
- Re: (Score:2)
  
  by forgotten_my_nick ( 802929 ) writes:
  
  > sophisticated enough to be able to say "I don't know"
  They absolutely know what data it is being trained on.
  Any company telling you otherwise is lying.
  The reason they won't say is that it can put them in the position of having to remove some of the data, or opens up the "training on other people's work" issue.
  That's why they say "publicly available sources" as their source. YouTube videos, for example.
  It's going to be a big thing soon. Some companies are already getting ahead of it. For example IBM's models they
Still not better than GPT-4? (Score:2)

by WaffleMonster ( 969671 ) writes:

Will be interesting to try Mixtral 8x22B and llama3 70B... going to wait a few weeks for censorship removal, tuning and (franken)merges.
A llama3 400B would be crazy, looks meaningfully better than 70B from the evals but high cost... I hope they release it but that would be like a 200 GB model quantized and less than a token/sec saturating a quad channel system.
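
The "200 GB" and "less than a token/sec" figures can be checked with a back-of-the-envelope estimate. The sketch below assumes things the comment does not state: 4-bit quantization, a dense model whose weights are all read once per generated token, and quad-channel DDR5-4800 with 8-byte channels at theoretical peak bandwidth. The function names are illustrative.

```python
def model_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Weight storage in GB: billions of parameters * bits per weight / 8."""
    return params_billion * bits_per_weight / 8


def peak_bandwidth_gb_s(channels: int, mega_transfers_per_s: float,
                        bus_bytes: int = 8) -> float:
    """Theoretical peak memory bandwidth in GB/s."""
    return channels * mega_transfers_per_s * bus_bytes / 1000


def max_tokens_per_s(size_gb: float, bandwidth_gb_s: float) -> float:
    """Upper bound when every token requires streaming all weights once."""
    return bandwidth_gb_s / size_gb


size = model_size_gb(400, 4)           # assumed 4-bit quantization
bandwidth = peak_bandwidth_gb_s(4, 4800)  # assumed quad-channel DDR5-4800
rate = max_tokens_per_s(size, bandwidth)
print(f"{size:.0f} GB, {bandwidth:.1f} GB/s, at most {rate:.2f} tokens/s")
```

Under these assumptions the bound comes out below one token per second, in line with the estimate above; real throughput would be lower still, since peak bandwidth is rarely sustained.
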
- Re: (Score:3)
  
  by Rei ( 128717 ) writes:
  
  Meh, LLaMA is right out for me because of its license. MistralAI makes great products and releases them with genuine open licenses.
Llama...odd reaction on my part. (Score:2)

by Petersko ( 564140 ) writes:

I don't hold on to much. I'm not somebody who generally gets nostalgic. Well, that's not completely true... I'm over 50 so everything old is great and most things new are terrible... But "Llama" is one of those rare terms that I'm surprised to see come up as a product label. After all, I remember clearly when the Llama had its ass whipped.
Llama 3. (Score:2)

by Motleypuss ( 10291831 ) writes:

[joke?] But does it whip the llama's arse? [/joke?]
