
Apple Working To Cram Massive Gemini Model Into iPhone To Power New Siri (arstechnica.com) 40

Apple is reportedly working to shrink Google's Gemini models enough to power parts of a long-delayed AI-enhanced Siri on iPhones. But despite Apple's best efforts to run the AI locally, "the iPhone's Gemini makeover will lean heavily on Google and Nvidia in the cloud," reports Ars Technica. That could complicate Apple's privacy-first AI messaging, especially if more complex Siri requests are routed through Google infrastructure and Nvidia's encrypted cloud-computing platform. Ars Technica reports: After inking the Google deal, Apple apparently got to work distilling Google's giant cloud-based Gemini models. Distillation is a process in which a small, less resource-intensive model learns to mimic a large, expensive one. With enough time, this can reliably transfer useful capabilities while pruning less important weights from the model. That may enable Siri to handle some tasks with private local compute, but a cloud component looks inevitable.

Processing users' AI data in the cloud could be a problem for Apple. At WWDC, the company will probably promote its years of experience designing chips and how well that positions it for AI. However, The Information claims that Apple has struggled to even get Google's massive undistilled Gemini models running on its custom Private Cloud Compute infrastructure, which is built on M-series Mac chips.

When the smarter Siri rolls out, it will probably route more complex tasks to Google's cloud infrastructure instead of Apple's, but it won't be running on Google TPUs. Apple has reportedly signed a deal with Nvidia to use its Confidential Computing platform for this purpose. Confidential Computing keeps data encrypted on Nvidia GPUs while it's being processed in the cloud, which could help Apple claim it's still sensitive to user privacy concerns. It might even retain its own Private Cloud Compute branding for the system.

The iPhone probably won't tell you which version of Gemini is handling individual Siri requests. Device makers designing hybrid systems that rely on local and cloud-based AI like to talk about making the experience feel "seamless." There might be clues, though.
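
The distillation step described in the summary can be illustrated with a toy sketch. This is not Apple's or Google's actual pipeline, which the article does not detail; it only shows the commonly used idea of scoring a small "student" model by how closely its output distribution matches a large "teacher" model's. The function names, the temperature value, and the example logits are all assumptions made up for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's.

    Hypothetical helper: zero when the student matches the teacher exactly,
    larger the further the student's predictions drift from the teacher's.
    """
    p = softmax(teacher_logits, temperature)  # teacher (large model)
    q = softmax(student_logits, temperature)  # student (small model)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Made-up logits over three candidate next tokens.
teacher = [4.0, 1.0, 0.5]
close_student = [3.5, 1.2, 0.4]
far_student = [0.5, 1.0, 4.0]

print(distillation_loss(teacher, close_student))
print(distillation_loss(teacher, far_student))
```

In real training, the student's weights would be adjusted to push this loss down across a large set of inputs; the sketch covers only the scoring step.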


  • by ChunderDownunder ( 709234 ) on Saturday May 30, 2026 @04:04AM (#66166578)

    That's when RAM shortages are supposed to subside.

    If you're currently selling netbooks with only 8 Gig, how much RAM will a Gemini iPhone realistically require?

    • Depends. Current leading LLM technologies are super bloated and inefficient. When 2028 arrives the resource requirements for a natural language user interface bot may be a lot less. The software side requirements can certainly be finalized a few months before launch. On the other hand, hardware designs have a long lead time and must be locked in much sooner. On the other hand, data can be shipped to a beefy server and processed in the cloud, so the hardware doesn't really need a lot of resources.
      • Isn't DeepSeek already twice as efficient?

      • by McLoud ( 92118 )

        Depends. Current leading LLM technologies are super bloated and inefficient. When 2028 arrives the resource requirements for a natural language user interface bot may be a lot less. The software side requirements can certainly be finalized a few months before launch. On the other hand, hardware designs have a long lead time and must be locked in much sooner. On the other hand, data can be shipped to a beefy server and processed in the cloud, so the hardware doesn't really need a lot of resources.

        Don't hold your breath. If the model becomes more efficient, they will likely use the freed-up capacity to increase the model's capabilities rather than make a compact version of the existing capabilities.

    • how much RAM will a Gemini iPhone realistically require

      Depends on what you want to do with it. If you want to generate realistic images of whatever you want locally, while answering every question in the universe, I suggest you get a phone with at least 96GB of RAM. If, on the other hand, you are running small local models that do specific tasks and offload the rest to an internet search, you can run that AI model on an iPhone 3GS if so desired.

      Not every AI is the same.

    • by allo ( 1728082 )

      I think Chrome's Gemini Nano needs about 2 GB. I haven't tested it yet. I would also expect Apple to distill to a size similar to Gemma E2B or E4B, which is also what Google uses for on-device AI (it uses exactly those models but, as far as I know, with added MTP layers).

    • I've seen estimates around 12 GB of RAM required for some Gemini models.

      • by allo ( 1728082 )

        The large Gemma models already need about 24 GB (or offloading to CPU). Gemini is way larger.

  • ... phone dead.

    Siri, I need help!

    • Yeah, it will be funny when you realize you can't swap your e-sim if your phone is dead either.
      • Yeah, it will be funny when you realize you can't swap your e-sim if your phone is dead either.

        Why would you do that? No, seriously, why? I want to follow the train of thought here for a moment. Your phone just died; what benefit do you get from swapping the SIM?

        Why carry a whole second phone when a simple charging bank would do?
        If you have another phone, why not just get a second SIM for it?
        If you are borrowing a phone, I'm sure the person will let you briefly use their data.
        If you need to make a phone call, then just use the other person's phone.

        Heck if you have a second phone, just connect the two tog

        • Read carefully what he wrote: swapping an e-sim.
          An e-SIM

          A phone can hold a nearly endless number of e-SIMs. Perhaps you delete one and install a new one, but that hardly counts as swapping.

          While I am historically a heavy Apple user, I am not a "fan boi". So I made a joke about the well-known fact that Apple phones historically have short battery life (the main reason I won't buy any in the foreseeable future - my chat apps work just the same on Android, so the "imaginary" better OS is completely irrelevant for me.)

          And:

  • Since when does a hardware manufacturer own the servers? Or which particular Nvidia model is Apple interested in running on? Or why would Google need anything but its own TPU chips, which nowadays can do training, let alone inference? I thought "get paid for mentioning Nvidia" was a conspiracy theory, but here we go again...
    • Since when does a hardware manufacturer own the servers?

      Since about a couple years ago, from what I gather. Soon to be the standard, if they have their way.

  • by bsdetector101 ( 6345122 ) on Saturday May 30, 2026 @05:46AM (#66166626)
    Plus don't use Siri much. It's bad enough now when you do a Google search and it has a small disclaimer that results may not be accurate !!!!
    • So... buy another phone? I don't understand what you're worried about here, are you suggesting that every company should suit precisely your needs?

    • There are still lots of dumbphones on the market. https://www.dumbphones.org/ [dumbphones.org]

      And best of all, they are really, really cheap. And they don't require cellular data.

      Unfortunately, you can still be tracked, even if you have a dumbphone, so if tracking is your issue, you're out of luck.

    • Plus don't use Siri much. It's bad enough now when you do a Google search and it has a small disclaimer that results may not be accurate !!!!

      To be fair, it arguably should always have had that disclaimer. They avoided needing it by saying they were just providing links and it was on those web sites and their users to deal with any inaccuracies... but everyone knew that most users just took whatever the top link said as definitive truth.

  • by Pinky's Brain ( 1158667 ) on Saturday May 30, 2026 @06:11AM (#66166630)

    With progressive layer by layer distillation Apple can make aggressive changes in architecture, all while letting Google take all the blame for the piracy.

    I think there is a lot of potential to improve architectures for local, beyond MoE and what Apple "pioneered" with LLM in a Flash (the low rank predictor approach was actually first described in a paper from 2013 they didn't cite). Google's spark transformer for instance is already far more elegant than MoE and low rank predictors, beyond that there is also unexplored potential of forced temporal coherence in the active set.

    Only Apple and Tiny AI are likely to truly push sparsity in production. Going beyond MoE with sparsity and being forced to accept low single digit percentage compute utilisation during training on NVIDIA's expensive HBM based GPUs is too counter-intuitive for most researchers to accept, even if they really should.

  • Apple-Google-iNvidia(sic)

    Move along.

  • by residue09 ( 1329429 ) on Saturday May 30, 2026 @06:57AM (#66166658)
    I don't want any of this.
    • Try a dumbphone, there are still lots of them available, for as low as $20. https://www.dumbphones.org/ [dumbphones.org]

      Unfortunately, you can still be tracked, even if you're on a dumbphone. So if tracking is your issue, you're out of luck. But they DO let you make and receive calls, take voicemails, and text. For those who don't want sophisticated phones, it's perfect!

  • Imagine a Beowulf cluster of those ;)

  • So dumb. This will push iPhone users towards Google phones in droves.
    • Agree. More than this, Apple has dumped too much complexity into new phones with the latest OS.

      Example, my 17 pro is pretty big and heavy, so you end up gripping it every time you pick it up. But with the extra buttons on the sides you end up engaging something you didn't want. So then you menu-dive into system settings just to turn off extra buttons.

      I feel like I am fighting with this thing - Jobs used to say he was proud of the things Apple didn't do. Man, those days are gone.
      • by Phact ( 4649149 )

        i feel like that's been the problem with every phone i've ever had: apple or android, my last android was an HTC with big light-touch buttons on the side.

      • Example, my 17 pro is pretty big and heavy, so you end up gripping it every time you pick it up. But with the extra buttons on the sides you end up engaging something you didn't want. So then you menu-dive into system settings just to turn off extra buttons.

        My kids call me a boomer when that happens to me. And yeah, it happens.

        Though to be fair, I actually really like the side button -- the one on the lower right that is touch sensitive. I use it for activating and using the camera. I just ALSO sometimes activate it when reading in landscape mode. Oops.

  • So if you're using Android, you're using Google. And if you're using Apple, you're using Google.

    Seems Google is in the dominant position here.

  • by grasshoppa ( 657393 ) on Saturday May 30, 2026 @09:30AM (#66166762)

    Are consumers really clamoring for this?

    • Did you ask an AI ?

      "
      No, consumers are not clamoring for AI in iPhones.

      While Apple, Google, and Samsung have spent billions marketing generative AI as the definitive reason to upgrade, multiple industry surveys and sales figures show a massive disconnect between tech company hype and actual consumer behavior.
      Data reveals that the "AI revolution" on smartphones is largely supply-driven by Silicon Valley, rather than demanded by everyday users.

      The Real Upgrade Drivers vs. AI Hype

      According to market tracking fr
      • Spot on!

        Oh wait,...

        Is that answer now slop or not?

      • by ceoyoyo ( 59147 )

        Half the posts: AI is dumb, do not want, is anybody actually asking for this?

        Other half: Siri is dumb, it's Apple's worst product, it's why I use Android.

  • Apple will fail.

  • Now Apple can flatten my battery and use all my RAM and bandwidth to make me part of a distributed AI service it can sell! Hurrah!
