Raspberry Pi's New Add-on Board Has 8GB of RAM For Running Gen AI Models (theverge.com) 49

An anonymous reader shares a report: Raspberry Pi is launching a new add-on board capable of running generative AI models locally on the Raspberry Pi 5. Announced on Thursday, the $130 AI HAT+ 2 is an upgraded -- and more expensive -- version of the module launched last year, now offering 8GB of RAM and a Hailo 10H chip with 40 TOPS of AI performance.

Once connected, the Raspberry Pi 5 will use the AI HAT+ 2 to handle AI-related workloads while leaving the main board's Arm CPU free for other tasks. Unlike the previous AI HAT+, which focused on image-based AI processing, the AI HAT+ 2 comes with onboard RAM and can run small gen AI models like Llama 3.2 and DeepSeek-R1-Distill, along with a series of Qwen models. You can train and fine-tune AI models using the device as well.

  • Since all shortcomings of the very large language models popular these days are much more pronounced and abundant in the not-so-large language models that fit in 8GB, I wonder what use cases this is meant for. I could understand why somebody would want to run a small AI upscaler or image recognition model on a Raspberry Pi... but LLMs?
    • Does seem a bit small for any gen AI I know of. 16GB seems to be the minimum. Can you use multiple HATs, perhaps, to expand the RAM? Perhaps useful for computer vision/audio.
      • Should be enough for Frigate, which is about the only practical use I've ever found for a TPU.

        Downside is that you're then stuck running your NVR on a Pi.

    • by blackomegax ( 807080 )
      I run deepseek on my 4060 8gb and it's been great.
    • by Xenx ( 2211586 )

      I could understand why somebody would want to run a small AI upscaler or image recognition model on a Raspberry Pi... but LLMs?

      I'm sure there are a few use cases, but the thing that comes to mind for me right now is something like Home Assistant.

      • This, among other things, is a good use case. I've got an 8GB 1070 in my home NAS/media server. It runs an Ollama instance that's used by Karakeep and Home Assistant, as well as GPU transcoding for Jellyfin and machine learning for Immich. Some of the "AI" stuff is pretty cool and useful, when it doesn't involve sending all your personal data to The Cloud.

        LLM integration in Home Assistant is really nice for building cloudless voice assistants -- HA's native pipeline works, but requires very specific phrasing.

    • I have a dream of running my own personal "google home" from my basement - I want it to turn on and off lights, maybe adjust the thermostat using voice commands. I also want it to access a few web pages and be able to answer questions (via voice) regarding their contents: local weather, stock prices, maybe Wikipedia. No reporting back to the mothership, because I am the mothership. This might finally be the right size to accomplish that.
    • LLMs don't need to be large to be useful. Large models shine at generative tasks where you ask for a story, but small LLMs find their niche in contextual search, translation, OCR, and in many cases at the *input* side of whatever it is you are trying to achieve.

      You can also get very small models if you restrict the application. E.g. if you need basic inference the model can be small. If you need reasoning the model can also be small if your source space is small.

      AI is more than LLMs, and LLMs a

      • by dfghjk ( 711126 )

        As if people are training models for the job, and will do so specifically for tasks running on a Pi.

    • by Hadlock ( 143607 )

      For voice assistants it's helpful for it to be local. It turns out that 98% of commands fall into about 10-12 commands (set a timer for 5 minutes, turn on/off the lights, what time is it, what's today's date, turn on/off the TV, turn on/off the lights in another room). The device catalogs all these requests and then makes a list of the top ~30 requests, and if a request matches something on the list with ~0.85 confidence it doesn't even go to the LLM -- it just runs the command. That's how you get the instant response.
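The fallback scheme described above -- match the utterance against a cached list of top commands and only fall through to the LLM below a confidence threshold -- can be sketched in a few lines. The command list, action names, and the 0.85 cutoff here are illustrative assumptions, with plain string similarity standing in for whatever matcher a real assistant would use:

```python
# Sketch: answer common commands from a cached list, fall back to the LLM
# otherwise. TOP_COMMANDS and the 0.85 threshold are illustrative.
from difflib import SequenceMatcher

TOP_COMMANDS = {
    "set a timer for 5 minutes": "timer.start",
    "turn on the lights": "lights.on",
    "turn off the lights": "lights.off",
    "what time is it": "clock.say_time",
}

def route(utterance: str, threshold: float = 0.85):
    """Return (action, score) for a close cached match,
    or None to signal a fallback to the LLM."""
    utterance = utterance.lower().strip()
    best_action, best_score = None, 0.0
    for cmd, action in TOP_COMMANDS.items():
        score = SequenceMatcher(None, utterance, cmd).ratio()
        if score > best_score:
            best_action, best_score = action, score
    if best_score >= threshold:
        return best_action, best_score
    return None  # below threshold: hand the request to the LLM

print(route("turn on the light"))  # near match, handled locally
print(route("write me a haiku"))   # no match, goes to the LLM
```

Keeping this matcher on-device is what makes the common case feel instant; only the long tail pays the LLM's latency.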

    • Here is an example of a project that could probably be done on this unit: https://www.youtube.com/watch?... [youtube.com]
    • by allo ( 1728082 )

      Pis were conceived as a learning platform; people just (ab)use them to build all kinds of smart devices. You can learn programming with simple Python exercises and maybe pygame on a Pi and get quick results. Now you can take your first steps with a local LLM without having to buy a $300 graphics card.
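Those first steps usually mean talking to a locally running server such as Ollama over its documented HTTP API. A minimal sketch, assuming Ollama is installed and a model like `llama3.2` has already been pulled; it only builds the request, with the actual network call left commented out:

```python
# Build a request for Ollama's /api/generate endpoint on a local machine.
# Assumes an Ollama server at the default port 11434 with llama3.2 pulled.
import json
import urllib.request

def build_request(prompt: str, model: str = "llama3.2") -> urllib.request.Request:
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )

req = build_request("Explain GPIO in one sentence.")
# With a server actually running you would send it like this:
#   with urllib.request.urlopen(req) as r:
#       print(json.loads(r.read())["response"])
print(req.full_url)
print(json.loads(req.data)["model"])
```

The same payload works against any model name you have pulled, which is what makes swapping between the small models mentioned in the summary a one-line change.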


  • by SubmergedInTech ( 7710960 ) on Thursday January 15, 2026 @05:17PM (#65927528)

    Given the general view of AI and LLMs (especially on /.), they should have called it the AI Supplementary Storage HAT.

    Or, ASSHAT for short.

    • People on slashdot who are luddites are hilarious to me. AI is the next stage of human evolution (as soon as we can integrate it into our brains), and yet they resist.

      Reminds me of how VR/AR is a logical step toward cybernetics, and yet they resist. The real beta tests for that Ghost in the Shell cybernetic utopia future were Google Glass etc., but the beta testers were called glassholes, when all they were was visionaries a few decades too early to a future that is coming.
      • I'd be using AR glasses right now if they were not created by the big platforms as just another way to make you, (me) the consumer, the product. To surf the net like the Major using her cyberbrain and a few virtual and physical agents we'd need a much larger leap in understanding the mammalian brain. I don't trust Elmo to develop a safe brain/computer interface.

      • People on slashdot who are luddites are hilarious to me. AI is the next stage of human evolution (as soon as we can integrate it into our brains), and yet they resist. Reminds me of how VR/AR is a logical step to cybernetics, and yet they resist. The real beta tests for that, ghost in the shell cybernetic utopia future, were google glass etc, but the beta testers were called glassholes, when all they were, were visionaries who were a few decades too early to a future that is coming.

        No. These people were glassholes because the only thing they enabled was recording people for the purpose of large companies somehow monetizing it.

        One is not a luddite for shunning shit tech.

      • Not sure there are that many Luddites here, we just want it on our terms. And in many instances that means being unwilling to move forward with a new technology if it means surrendering our privacy.
      • Oh, it's evolution alright.... Darwinism specifically.

        Seriously, the human brain already has a thing that hallucinates grand successes / benefits from crap, and that can be trained to do far more useful things for a fraction of the cost to operate. Replacing that thing with an AI is a downgrade. Although, I'm sure that for some people, the inability to say no and perfect ability to manipulate the output in a predictable way is the entire point.

        Once men turned their thinking over to machines in the hope that this would set them free. But that only permitted other men with machines to enslave them.

      • Fuck it, double post:

        ghost in the shell cybernetic utopia future

        What part of Ghost in the Shell is a utopia!?

        Is your head on straight? Seriously, this is a world in which the police can force you to smile as they throw you into a cell. A place where people's memories are constantly manipulated by viruses, any random person can suddenly start shooting government officials because they opened the wrong set of files in the correct order, and children can be abducted by the government, have their identities overwritten, and be given to a bunch of senior citizens.

        • Depending on the specific installment it is not a utopia, but its Japan is still considered a better place to live than pretty much anywhere else.

    • Funny as your comment is, the irony is that this can't run general models. The hardware will limit you to running special purpose models, and special purpose AI models are actually really frigging good at doing various things.

      They just get no love in the media because it's not fancy to hear how we solve problems with small AI models when OpenAI is in an arms race to see who reaches 10 trillion first.

  • Is there a demand for these at all? Seems like they're making the product before there is a market.
    • They're chasing a fad. Their hope is that there are enough of their customers chasing this same fad that they can make a profit off of them buying what sounds to be an essentially worthless product. (And that's even if you're willing to grant that "full-size" LLMs are worthwhile.)

      Maybe they do; maybe they don't. But as one of their customers, I personally resent this diversion of resources from more worthwhile projects in any case.

    • Is there a demand for these at all? Seems like they're making the product before there is a market.

      I can't speak to language models but for things like object detection for surveillance systems these are very popular. 40 TOPS is lots of inferencing power, considering a Google Coral has 4. Hailo already has smaller models, 12 and 25 TOPS I think. And there are Jetsons, and Memryx, and others.

      I'm just looking to get my feet wet with this shortly and am researching my options, of which there are quite a few, so I would say yes there definitely is a market.
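A back-of-envelope sense of that inferencing headroom: divide sustained ops per second by the per-frame cost of a detection model. The ~10 GOPs-per-frame figure and the 30% real-world utilization factor below are illustrative assumptions, not benchmarks:

```python
# Rough frames-per-second headroom from an accelerator's TOPS rating.
# gops_per_frame (~a small YOLO-class detector) and the utilization
# factor are illustrative assumptions, not measured numbers.
def max_fps(tops: float, gops_per_frame: float = 10.0,
            utilization: float = 0.3) -> float:
    return tops * 1e12 * utilization / (gops_per_frame * 1e9)

for name, tops in [("Hailo 10H", 40.0), ("Google Coral", 4.0)]:
    print(f"{name}: ~{max_fps(tops):.0f} frames/sec of headroom")
```

Even with conservative utilization, 40 TOPS leaves an order of magnitude more room than a Coral for multi-camera object detection.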

      • >> I would say yes there definitely is a market

        I'm interested in it but the article says:
        "Jeff Geerling found that a standalone Raspberry Pi 5 with 8GB of RAM generally outperformed the AI HAT+ 2 across the supported models."

        • Unsurprising.

          This mistake is made constantly: an NPU can't accelerate a model that is bandwidth-bound rather than compute-bound. The Pi 5 uses a single channel of LPDDR4X; the HAT uses a single channel of LPDDR4. The Pi 5 is going to outperform it unless the model is compute-bound, which basically rules out any SLM/LLM.
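The bandwidth argument above can be made concrete: generating each token streams the entire model's weights from RAM, so memory bandwidth divided by model size caps tokens per second regardless of TOPS. The bandwidth and model-size numbers below are rough assumptions for single-channel LPDDR4 vs. LPDDR4X, not measured figures:

```python
# Bandwidth-bound ceiling on autoregressive generation: every token reads
# all weights once, so tokens/sec <= bandwidth / model size. The GB/s and
# model-size values are rough assumptions, not measurements.
def max_tokens_per_sec(bandwidth_gb_s: float, model_gb: float) -> float:
    return bandwidth_gb_s / model_gb

MODEL_GB = 2.0  # e.g. a ~3B-parameter model at 4-5 bit quantization
for name, bw in [("AI HAT+ 2 (LPDDR4, est.)", 12.8),
                 ("Pi 5 (LPDDR4X, est.)", 17.1)]:
    print(f"{name}: ~{max_tokens_per_sec(bw, MODEL_GB):.1f} tokens/sec ceiling")
```

Under these assumptions the Pi 5's faster memory wins on its own, which is consistent with the benchmark result quoted above.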
  • by GoRK ( 10018 ) on Thursday January 15, 2026 @06:18PM (#65927698) Journal

    Seeing as this consumes the only PCIe port on the device, you can't use NVMe storage in conjunction with it. That makes the whole thing far less useful, since all of the other storage options for the Pi are dogshit.

    • You're using this for inference output; storage isn't the factor here. You don't need much I/O for this -- the models get loaded into RAM. The Pi isn't a general-purpose computer. If you're using this, you're doing something that very, VERY likely has no need for fast non-volatile storage.

  • How does this Pi/HAT add-on compare to the BeagleBone AI-64 (https://www.beagleboard.org/boards/beaglebone-ai-64), if anyone knows?
