Google Unveils Two New AI Chips For the 'Agentic Era' (cnbc.com)
Google announced two new tensor processing units (TPUs) for the "agentic era," with separate processors dedicated to training and inference. "With the rise of AI agents, we determined the community would benefit from chips individually specialized to the needs of training and serving," Amin Vahdat, a Google senior vice president and chief technologist for AI and infrastructure, said in a blog post. Both chips will become available later this year. CNBC reports: After years of producing chips that can both train artificial intelligence models and handle inference work, Google is separating those tasks into distinct processors, its latest effort to take on Nvidia in AI hardware. [...] None of the tech giants are displacing Nvidia, and Google isn't even comparing the performance of its new chips with those from the AI chip leader. Google did say the training chip enables 2.8 times the performance of the seventh-generation Ironwood TPU, announced in November, for the same price, while performance is 80% better for the inference processor.
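Taken at face value, and with "same price" held constant, those multipliers map directly onto perf-per-dollar. A trivial sketch in Python (only the 2.8x and 80% figures come from the article; the rest is arithmetic):

    # Perf-per-dollar implied by the quoted figures, relative to Ironwood.
    # At the same price, the performance ratio equals the perf/$ ratio.
    training_speedup = 2.8    # "2.8 times the performance" (training chip)
    inference_speedup = 1.8   # "80% better" means 1.8x, not 0.8x (inference chip)
    for name, s in (("training", training_speedup), ("inference", inference_speedup)):
        print(f"{name} chip: {s:.1f}x performance -> {s:.1f}x perf/$ vs. Ironwood")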
Nvidia said its upcoming Groq 3 LPU hardware will draw on large quantities of static random-access memory, or SRAM, which is used by Cerebras, an AI chipmaker that filed to go public earlier this month. Google's new inference chip, dubbed TPU 8i, also relies on SRAM. Each chip contains 384 megabytes of SRAM, triple the amount in Ironwood. The architecture is designed "to deliver the massive throughput and low latency needed to concurrently run millions of agents cost-effectively," Sundar Pichai, CEO of Google parent Alphabet, wrote in a blog post.
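For a sense of what 384 megabytes of on-chip SRAM buys an inference chip, here is a rough back-of-the-envelope in Python. The transformer shape and cache precision are illustrative assumptions, not TPU 8i specifications:

    # How much decode context fits in 384 MB of on-chip SRAM?
    # Model dimensions below are hypothetical, not TPU 8i specs.
    SRAM_BYTES = 384 * 1024**2

    layers, kv_heads, head_dim = 32, 8, 128   # assumed transformer shape
    bytes_per_elem = 1                        # assumed 8-bit KV cache
    kv_bytes_per_token = 2 * layers * kv_heads * head_dim * bytes_per_elem  # K and V

    tokens = SRAM_BYTES // kv_bytes_per_token
    print(f"{kv_bytes_per_token} bytes/token -> ~{tokens:,} cached tokens on-chip")

Even under these generous assumptions that is only a few thousand cached tokens, which suggests the SRAM holds hot working state (KV cache, activations) close to the compute while weights still stream from off-chip memory.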
Re: (Score:2, Interesting)
ah, not available, except to rent a time slice? fuck off, then.
In the context of building ML models, renting a virtual farm may be the better solution. You think the GPU upgrade cycle is bad? Wait until you try to keep up with AI-level products. :-)
Accelerating the ML model on your PC or Mac is a very different thing.
Re: (Score:3)
Apple doesn't have devices with enough RAM to challenge Nvidia.
Apple also has no credibility in servers, after they got into them, then left, then got into them again, then left again. Nobody wants to be rugpulled.
Re: (Score:1)
Apple doesn't have devices with enough RAM to challenge Nvidia.
Again, I am referring to local ML model execution and acceleration. Apple Watches have done impressive on-board processing using ML models.
also has no credibility in servers, ...
Not what I referred to.
Re: (Score:2)
The vast majority of LLM processing is done in the cloud, and any AMD laptop has the functionality to run LLMs, plus probably expandable memory, so if you can afford the RAM you can run larger models than with Apple. Nobody cares yet. Maybe eventually.
Re: (Score:2)
Apple doesn't have devices with enough RAM to challenge Nvidia.
Again, I am referring to local ML model execution and acceleration. Apple Watches have done impressive on-board processing using ML models.
AI/ML encompasses a wide range of use cases. The Apple use cases are generally less demanding client tasks that just have to work and be good enough. Those use cases are very different from what the Nvidia GPUs and Google TPUs are addressing.
Re: (Score:2)
The compute performance of Apple Silicon is vastly inferior to a mid-range discrete. Its bandwidth isn't great in comparison, either.
So, in terms of GB-of-VRAM-to-GB-of-VRAM, Apple Silicon is worse than any discrete you're likely to have for ML purposes.
However, they've got something you can't get on a discrete: 128GB of VRAM in a laptop, and 512GB of VRAM in a desktop.
This changes the equation, because it means your Ap
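To put the poster's capacity point in numbers, a quick sketch of which quantized weight sets fit in that much unified memory. Model sizes and quantization levels are illustrative, and this counts weights only (no KV cache or OS overhead):

    # Which quantized model weights fit in 128 GB (laptop) or 512 GB (desktop)?
    for params_b in (8, 70, 405):        # model sizes in billions of parameters
        for bits in (16, 8, 4):          # common quantization levels
            gb = params_b * bits / 8     # 1e9 params * (bits/8) bytes ~ params_b * bits/8 GB
            if gb <= 128:
                verdict = "fits in the 128 GB laptop"
            elif gb <= 512:
                verdict = "needs the 512 GB desktop"
            else:
                verdict = "fits in neither"
            print(f"{params_b}B @ {bits}-bit ~ {gb:.0f} GB: {verdict}")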
NVidia + Google + Cerebras moving to SRAM (Score:4, Insightful)
SRAM has never been built at this scale, afaik. Cerebras was ahead of the curve here, building wafer scale SRAMs years ago. The penalties of DRAM (even with HBM) are now so severe that everyone is taking the gloves off and building mighty SRAMs. This has always been possible in theory, but the high cost never justified it.
The impact on semiconductor fab demand is significant. SRAM cells are much larger than DRAM cells (a six-transistor cell versus one transistor and a capacitor), so the same capacity takes considerably more silicon die area.
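A rough illustration of that die-area cost, using ballpark bitcell figures; both cell areas below are order-of-magnitude assumptions, not vendor data:

    # Raw bitcell area for 384 MB of SRAM vs. DRAM (ballpark figures only).
    CAPACITY_BITS = 384 * 1024**2 * 8

    cells_um2_per_bit = {
        "SRAM": 0.021,   # ~6T bitcell on a leading logic node (assumed)
        "DRAM": 0.002,   # ~1T1C bitcell (assumed)
    }
    for name, um2_per_bit in cells_um2_per_bit.items():
        mm2 = CAPACITY_BITS * um2_per_bit / 1e6   # um^2 -> mm^2
        print(f"{name}: ~{mm2:.0f} mm^2 of bitcell area for 384 MB")

Roughly an order of magnitude more area per bit, which is why on-chip SRAM at this capacity translates into real fab demand.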
Also, the training vs. inference split Google is baking into actual hardware is a big deal: it's the reality that training and inference are very distinct workloads asserting itself, which has been obvious to anyone who hasn't been drinking too much NVidia Kool-Aid. There is a future where costly, general-purpose GPU-like devices aren't actually necessary for operating LLMs.
Re: (Score:3)
Also, the training vs. inference split Google is baking into actual hardware is a big deal: it's the reality that training and inference are very distinct workloads asserting itself, which has been obvious to anyone who hasn't been drinking too much NVidia Kool-Aid. There is a future where costly, general-purpose GPU-like devices aren't actually necessary for operating LLMs.
The training/inference requirements are indeed different in significant ways. If the TPU 8i had been available a few years ago, it would have seriously affected Nvidia sales. However, the 8i is just now in the process of becoming available and is also not a small or cheap module. It's also being introduced at the same time that Nvidia is introducing its own inference devices. Due to the market timing and other factors, it remains to be seen how much it affects Nvidia's market.
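One concrete way to see why the two workloads want different silicon is arithmetic intensity. A minimal sketch, assuming a single fp16 weight matrix and ignoring activation traffic (the shapes are illustrative):

    # FLOPs per byte of weight traffic for one matrix multiply (illustrative).
    d_in, d_out = 8192, 8192
    weight_bytes = d_in * d_out * 2          # fp16 weights read from memory

    def intensity(batch_tokens):
        flops = 2 * batch_tokens * d_in * d_out   # multiply-accumulate count
        return flops / weight_bytes

    print(f"training-style batch (4096 tokens): {intensity(4096):,.0f} FLOPs/byte")
    print(f"single-token decode: {intensity(1):.1f} FLOPs/byte")

In this toy model the intensity is just the batch size: large training batches keep the math units busy, while single-token decode is dominated by moving weights, which is exactly where on-chip SRAM and a dedicated inference design would pay off.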
Just wait... (Score:2)
Just wait until the bubble bursts, and everyone starts removing anything 'AI' from their devices (as much as possible), and stops using it because they're sick of the hallucinations or how it's baked into everything... I'll keep using my Galaxy S9 (and Win10 Enterprise LTSC, and my not-smart 18-year-old plasma TV) until it totally dies (only used Bixby like 5 times... mostly just seeing if it's worth using).
I don't need "Clod" to generate my Arduino code for me... I'll look up stuff on my own. I can type out
Two chips? Training for me but not for thee. (Score:2)