When and why you might need the Raspberry Pi AI HAT+ 2

Our friends at Hailo wrote this article about how to make the most of the Raspberry Pi AI HAT+ 2, pinpointing some of their favourite generative AI use cases.

The Raspberry Pi AI HAT+ 2 is the official generative AI PCIe add-on for Raspberry Pi 5, released on 15 January 2026. It pairs a Hailo-10H AI accelerator capable of up to 40 TOPS of inference performance (INT4) with 8GB of dedicated on-board LPDDR4X memory, enabling local vision and small generative AI workloads on one of the most popular single-board computers ever made.

This hardware combination is designed to enable efficient on-device generative AI while allowing the AI HAT+ 2 to operate within edge device requirements: low power consumption, no cloud connectivity, low latency, and maximum data privacy. However, as with any embedded hardware, performance trade-offs matter: edge devices are limited in memory, compute resources, and power budget (typically single-digit watts).

For this reason, generative AI applications that require general world awareness, continuous learning, or conversations based on extensive context and knowledge-heavy reasoning are better suited to run in the cloud. For latency-sensitive, privacy-critical, knowledge-confined applications, the new AI HAT+ 2 is an ideal fit.

Let’s break down when and where the AI HAT+ 2 is most powerful, and why it’s not just another niche gadget.

Where the AI HAT+ 2 really excels

The AI HAT+ 2 is strongest when running workloads that are compute-heavy up front, rather than workloads that are dominated by token-by-token (TBT) generation. In practice, this means it shines when you need the Raspberry Pi’s CPU to be available and responsive while running generative AI applications with the following profiles:

  1. Fast execution of encoders — when turning a visual, audio, or text input into a prompt embedding
  2. Short time to first token (TTFT)* — when interactivity and user experience are critical
  3. Large prefill — when the input context is larger than the output response
  4. Multi-stage pipelines — when sequential processing is needed, in which the output of one model becomes the input of the next

*Example TTFT benchmark figures for a 96-token prefill (the CPU figure measured using llama.cpp):

Model                 | Raspberry Pi 5 CPU | Hailo-10H
Qwen2.5-1.5B (INT4)   | 2039 ms            | 320 ms
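TTFT is simple to measure yourself: time how long the first streamed token takes to arrive. Below is a minimal, runnable sketch using a simulated model; real figures would come from a streaming llama.cpp call, and both `measure_ttft` and `simulated_model` are illustrative helpers, not part of any library:

```python
import time
from typing import Iterable, List, Tuple

def measure_ttft(token_stream: Iterable[str]) -> Tuple[float, List[str]]:
    """Return time-to-first-token in ms for a lazy token stream, plus the tokens."""
    start = time.perf_counter()
    it = iter(token_stream)
    first = next(it)                       # blocks until the first token arrives
    ttft_ms = (time.perf_counter() - start) * 1000.0
    return ttft_ms, [first, *it]

def simulated_model(prefill_s: float, tokens: List[str]):
    """Stand-in for a real streaming LLM call: one long prefill delay
    before the first token, then fast token-by-token decode."""
    time.sleep(prefill_s)                  # models the prefill phase
    yield from tokens

ttft, toks = measure_ttft(simulated_model(0.05, ["Package", " delivered", "."]))
print(f"TTFT: {ttft:.0f} ms, output: {''.join(toks)!r}")
```

The same `measure_ttft` wrapper works unchanged around any generator that yields tokens, which is how streaming inference APIs typically present their output.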

Ideal use cases

Vision-language models (VLMs)

VLMs map naturally to the AI HAT+ 2’s strengths, as the image encoder is a high-compute stage that generates compact token embeddings as output. The Hailo-10H accelerator enables event triggering, logging, indexing, captioning, and smart searching with free text, using a 2B-parameter model that would be prohibitively slow to run on the Raspberry Pi’s CPU alone.

We can think of countless applications in home security and surveillance, such as turning off your alarm when your package is being delivered and notifying you once the delivery is complete, or sending you a log of meaningful pet-monitoring events at the end of each day. The AI HAT+ 2 is also ideal for security and monitoring applications in industries like quality assurance, healthcare, and industrial automation.
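A pipeline in the spirit of the monitoring examples above can be sketched in a few lines. The captioner here is a stub returning canned captions (a real system would run the VLM's image encoder and decoder on the accelerator), and the `EVENTS` table and frame format are illustrative assumptions:

```python
def caption_frame(frame) -> str:
    """Stand-in for a ~2B-parameter VLM captioner.
    A real pipeline would encode the frame and decode a short caption."""
    return frame["caption"]  # stubbed: frames carry a precomputed caption

# Watched keywords mapped to event labels (illustrative).
EVENTS = {"package": "delivery", "dog": "pet activity"}

def detect_events(frames) -> list:
    """Caption each frame and log any caption matching a watched keyword."""
    log = []
    for frame in frames:
        caption = caption_frame(frame)
        for keyword, event in EVENTS.items():
            if keyword in caption.lower():
                log.append((frame["time"], event, caption))
    return log

frames = [
    {"time": "09:14", "caption": "A courier leaves a package at the door"},
    {"time": "12:02", "caption": "An empty porch"},
    {"time": "15:40", "caption": "A dog chews a cushion on the sofa"},
]
for when, event, caption in detect_events(frames):
    print(f"{when} [{event}] {caption}")
```

The structure is the point: the expensive captioning stage runs once per frame, while the cheap filtering and logging stages stay on the CPU.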

Voice to action

Another strong application of the AI HAT+ 2 is a local voice-to-action agent, combining high-compute inference with relatively low-bandwidth interaction. These workflows often rely on a large prefill step, i.e. processing a big, changing input context before generating a short response, which can be much slower on the Raspberry Pi’s CPU alone. This is particularly useful for agents that continuously ingest fresh data (including sensor readings, device states, logs, schedules, and recent events) and then respond locally with a short command or action.

The full sequential pipeline first converts free speech to text using a Whisper-class model, after which a small LLM handles intent understanding, decision-making, and natural free-text interaction, triggering real-world actions locally and reliably. This architecture enables agentic AI and physical AI at the edge by supporting larger Whisper models for improved accuracy, delivering low-cost, responsive, privacy-preserving, real-time voice control for a seamless user experience.
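The shape of that pipeline is easy to sketch. Both model stages below are stubs: the transcriber passes text through, and the "LLM" is a keyword rule. In a real system they would be a Whisper-class model and a small local LLM; all names here are illustrative:

```python
def transcribe(audio) -> str:
    """Stand-in for a Whisper-class speech-to-text model."""
    return audio  # stubbed: we pass the utterance as text directly

def build_prompt(device_states: dict, utterance: str) -> str:
    """The large, changing prefill: fresh device state plus the user's request."""
    state = "\n".join(f"{name}: {value}" for name, value in device_states.items())
    return f"Device states:\n{state}\n\nUser: {utterance}\nAction:"

def decide_action(prompt: str) -> str:
    """Stand-in for a small local LLM mapping the prompt to one short command."""
    if "light" in prompt.lower():
        return "lights_on" if " on" in prompt.lower() else "lights_off"
    return "noop"

states = {"living_room_light": "off", "thermostat": "20C"}
utterance = transcribe("turn the living room light on")
action = decide_action(build_prompt(states, utterance))
print(action)  # → lights_on
```

Note that the prompt is rebuilt from scratch on every request because device state keeps changing, which is exactly why fast prefill matters more than fast token-by-token decoding here.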

There are endless applications here too. For example, local voice to action enables natural, touchless control of devices, eliminating the need to navigate between elaborate menus and submenus or flip through tedious manuals. Another example application is intuitive wayfinding and navigation in public spaces, such as shopping centres, airports, and campuses, where users can state what they want to do rather than the exact location they need to find (e.g. “Where can I buy sunglasses?”, “Where can I get lunch?”, or “How do I reach my gate?”). In robotics and industrial systems, voice to action can facilitate more responsive human–machine interactions and more seamless cooperation.

Advanced vision applications

When it comes to demanding vision workloads, the AI HAT+ 2 enables a step change in performance. Its high compute power and efficient on-device execution translate directly into large performance gains — as much as 100% faster than the previous Raspberry Pi AI HAT+.

The Hailo-10H chip accelerates large convolutional neural networks (CNNs) and transformer-based vision models, including CLIP, zero-shot detection, and high-capacity object detectors, enabling richer perception without increasing bandwidth or power. This makes it possible to build physical AI systems that combine multiple vision stages — detection, embedding, semantic matching, and reasoning — entirely at the edge, unlocking more capable and responsive applications in home automation, security, robotics, retail, industrial automation, and more. With no cloud connectivity, no data leaves the device, and there are no network lags or costs.
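The detection, embedding, and semantic-matching chain can be illustrated with a toy version. The encoder below is a bag-of-words stand-in for a CLIP-style model (the vocabulary and captions are made up); only the cosine-matching logic carries over to a real system:

```python
import math

def embed(text: str) -> list:
    """Stand-in for a CLIP-style encoder: a tiny bag-of-words vector.
    On the AI HAT+ 2 this stage would run on the Hailo-10H."""
    vocab = ["person", "door", "package", "car", "dog"]
    words = text.lower().split()
    return [float(words.count(w)) for w in vocab]

def cosine(a, b) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def best_match(detection_caption: str, queries: list) -> str:
    """Semantic matching stage: pick the free-text query closest to a detection."""
    v = embed(detection_caption)
    return max(queries, key=lambda q: cosine(v, embed(q)))

queries = ["a person at the door", "a car in the driveway", "a dog in the garden"]
print(best_match("person holding a package near the door", queries))
# → a person at the door
```

Swapping the toy `embed` for a real image/text encoder turns this into free-text search over camera events, with everything staying on the device.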

Play to its strengths

The Raspberry Pi AI HAT+ 2 is at its most powerful when certain strengths are harnessed for the right applications. Some examples include:

  1. Free text operation without cloud dependency → offline home automation and robotics
  2. Small language outputs for event triggering, captioning, and summarisation on top of real-time vision → home security
  3. Air-gapped generative summarisation of logs and sensor data → secure industrial monitoring
  4. Natural speech and zero-queue interaction with information agents → information kiosks

Bottom line: Don’t ask your toaster for history lessons…

The Raspberry Pi AI HAT+ 2 isn’t designed to compete with cloud inferencing; large LLMs will always run better where compute and memory are effectively unconstrained. However, for edge scenarios that value privacy, offline operation, low latency, and low power consumption, it unlocks real capabilities that weren’t feasible on the Raspberry Pi platform before, with or without the original AI HAT+.

You will make the best use of it when you need to run tightly scoped, on-device generative tasks alongside vision or real-world sensor input, particularly when the alternative is cloud dependency or far larger and more expensive hardware.

The robust Hailo Community has thousands of active developers, and the upcoming integrations with Frigate and Home Assistant make the AI HAT+ 2 the most attractive option for anyone looking to take their first steps in physical AI and home automation.

16 comments

Anders: I have tried to get one of these but there does seem to be a supply problem.

Jose Ramirez: I want to switch from Shinobi NVR to Frigate and for that reason bought the AI HAT+ 2. I hope support for this hardware will come, as this will be an important use case for it.

Jordan Rhymes: This is amazing! I can’t help but think this was inspired by OpenHome, the AI hat that lets you do local voice on a Raspberry Pi. Maybe they worked with the OpenHome team on this?

Barry von Tobel: Have you tried Voice Assistant yet? It’s a ‘satellite’ microphone and HA (Yellow in my case) processes the TTS with some help from Nabu cloud.

Simon: Any plans to support llama.cpp?

Lars: Any ETA on the Frigate / Home Assistant integrations?

Rodolfo Briguez: How can I buy one?

Helen Lynn: Head over to the AI HAT+ 2 page and hit “Buy now” to see Raspberry Pi Approved Resellers in your region.

Anders: Love your optimism Lynn, but there doesn’t appear to be any available from any supplier for me. I’m on all the notification mailing lists. It’s been like that for some weeks now. Any news on when some more will be available?

Anders: Sorry Helen, I had just woken up. I didn’t mean to use your surname like that.

Helen Lynn: No worries :) More stock should be on its way to resellers in a few weeks.

Sascha Leib: Been there, tried that … unfortunately it is out of stock wherever I look.

crumble: It has RAM. Many people buy that stuff because its price may rise, so they help to increase the shortage and the price keeps rising.

But there is hope for a perfect storm. Add flash, so we are not forced to load the model over one thin PCIe lane. All of us will have 2-n times the fun ;)

Nick Ledwith: I dunno why I need one, but I’m sure I’ll think of a reason.

photobooth near me: I need this in my life. Where can I buy this in Ottawa, Canada?

Oswin Joseph Ziervogel: I bought an AI HAT+ (Hailo-8, 26 TOPS) right before this released.
