AI software
This page provides instructions for running AI models powered by Hailo NPUs on Raspberry Pi 5. The Hailo NPU is an AI accelerator chip designed to run neural networks; instead of Raspberry Pi’s CPU doing the AI work, the NPU handles it more efficiently.
You can connect a Hailo NPU to your Raspberry Pi 5 using either:
-
Raspberry Pi AI Kit, which consists of an M.2 HAT+ with a pre-installed Hailo-8L NPU.
-
Raspberry Pi AI HAT+ with either an on-board Hailo-8L NPU or an on-board Hailo-8 NPU.
-
Raspberry Pi AI HAT+ 2 with an on-board Hailo-10H NPU.
|
Note
|
All three of these options allow you to run vision AI models on a Raspberry Pi 5. However, the AI Kit is no longer in production and so, for new designs, we recommend using an AI HAT+ or AI HAT+ 2. |
AI HAT+ 2 additionally allows you to run Generative AI (GenAI) models. With AI HAT+ 2, you can:
-
Run Large Language Models (LLMs). If you have an AI HAT+ 2 and want to run both vision AI models and LLMs, follow both sets of instructions: Run vision AI models on Raspberry Pi 5 and Run LLMs on Raspberry Pi 5 (AI HAT+ 2 only).
-
Run Vision-Language Models (VLMs) and other GenAI tasks. For instructions, see Hailo’s GitHub repository: hailo-apps.
Hardware prerequisites
Running AI models requires a Raspberry Pi 5 with a 64-bit Raspberry Pi OS installation (Trixie) and one of the following Hailo AI accelerator (NPU) options:
-
A Raspberry Pi AI HAT+ or AI HAT+ 2 (recommended), both of which have an on-board Hailo module. For more information about these accessories, see AI HATs.
-
A Raspberry Pi AI Kit, which includes an M.2 HAT+ with a Hailo-8L AI accelerator pre-installed. For more information about the AI Kit, see AI Kit.
|
Note
|
The AI Kit is no longer in production; for new designs, we recommend using an AI HAT+ or AI HAT+ 2. |
If you want to run vision AI models, you also need a supported camera, such as Raspberry Pi Camera Module 3. We recommend attaching your camera before attaching your AI hardware. For instructions, see Install a Raspberry Pi Camera. Don't reconnect your Raspberry Pi to power after attaching the camera, because it must stay disconnected from power for the next step.
Then, depending on your AI hardware, follow the instructions for attaching either an AI HAT (HAT+ or HAT+ 2) or AI Kit to your Raspberry Pi 5.
Next, follow the instructions in Software prerequisites to enable PCIe Gen 3.0 (AI Kit only), install the required dependencies, and verify that everything has been set up correctly.
Software prerequisites
Before running vision AI models or GenAI models on your Raspberry Pi 5, you must configure the required software. Broadly, this consists of the following tasks, performed in order:
-
Enable PCIe Gen 3.0 (AI Kit only). Manually configure the PCIe interface to allow the Hailo NPU to communicate at full speed. This is only necessary for the AI Kit.
-
Update Raspberry Pi OS. Ensure that your Raspberry Pi OS packages are fully up to date.
-
Install required dependencies. Install the necessary software dependencies that allow the operating system (OS) and applications to detect, communicate with, and run AI models on the Hailo NPU.
-
Reboot and verify. Verify that your AI hardware is correctly detected and ready to use.
Enable PCIe Gen 3.0 (AI Kit only)
If you’re using an AI Kit, we highly recommend that you enable PCIe Gen 3.0. You can skip this for AI HAT+ and AI HAT+ 2 because the setting is automatically applied.
By default, Raspberry Pi 5 uses Gen 2.0 (5 GT/s) speeds on its PCIe interface. To achieve the best performance for your NPU, use one of the following approaches to enable Gen 3.0 (8 GT/s) speeds:
-
Enable Gen 3.0 on your Raspberry Pi 5 from the configuration command-line interface (CLI) tool (raspi-config).
-
Update the configuration file (config.txt) to enable the PCIe interface on your Raspberry Pi 5 to operate at PCIe Gen 3.0 speeds.
For more information about this setting, see PCIe Gen 3.0.
raspi-config
First, open the Raspberry Pi configuration CLI; in the Raspberry Pi Terminal, run the following command:
$ sudo raspi-config
Then, from the configuration CLI, complete the following steps:
-
Select Advanced Options > PCIe Speed.
-
Select Yes to enable PCIe Gen 3.0 mode.
-
Select Finish to exit the configuration CLI.
-
Reboot your Raspberry Pi for your changes to take effect. You can do this from the Terminal with sudo reboot.
config.txt
First, open your configuration file (/boot/firmware/config.txt) as the root user. Then:
-
Add the following line to the config.txt file:
dtparam=pciex1_gen=3
-
Reboot your Raspberry Pi for these settings to take effect:
$ sudo reboot
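If you prefer to script the config.txt change, the following is a minimal sketch rather than an official tool. The function name enable_pcie_gen3 is hypothetical. It appends the setting only when the exact line isn't already present, and it assumes that the end of the file isn't inside a conditional section that excludes Raspberry Pi 5.

```shell
# Hypothetical helper (not part of Raspberry Pi OS): adds the PCIe Gen 3.0
# setting to the given config file unless the exact line already exists.
enable_pcie_gen3() {
    local config_file="$1"
    local setting="dtparam=pciex1_gen=3"
    # -x matches whole lines only; -F treats the setting as a literal string.
    if grep -qxF "$setting" "$config_file"; then
        echo "already set"
        return 0
    fi
    # Make sure the file ends with a newline so the setting starts on its own line.
    if [ -n "$(tail -c 1 "$config_file")" ]; then
        echo >> "$config_file"
    fi
    echo "$setting" >> "$config_file"
    echo "added"
}
```

To apply it, call enable_pcie_gen3 /boot/firmware/config.txt from a script run with sudo, and then reboot as described above.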
Update Raspberry Pi OS
Ensure that the Raspberry Pi 5 is running Raspberry Pi OS Trixie with the latest software installed, and that it has the latest Raspberry Pi firmware:
$ sudo apt update
$ sudo apt full-upgrade -y
$ sudo rpi-eeprom-update -a
$ sudo reboot
For more information, see Update software and Update the bootloader configuration.
Install required dependencies
After updating your Raspberry Pi with the latest Raspberry Pi software and firmware, install the following dependencies, which are required to use the NPU:
-
The Hailo kernel device driver and firmware.
-
HailoRT middleware software.
-
Hailo Tappas core post-processing libraries.
How you install these dependencies depends on the AI hardware you’re using. Choose the appropriate installation option for your AI hardware:
|
Note
|
The AI Kit and AI HAT+ require a different package (hailo-all) to AI HAT+ 2 (hailo-h10-all). These packages can’t co-exist.
|
To install the required dependencies for the AI Kit or AI HAT+, open the Raspberry Pi Terminal and run the following commands:
$ sudo apt install dkms
$ sudo apt install hailo-all
To install the required dependencies for AI HAT+ 2, open the Raspberry Pi Terminal and run the following commands:
$ sudo apt install dkms
$ sudo apt install hailo-h10-all
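Because the two packages can't co-exist, a setup script might pick the package from a single hardware setting. The following sketch only illustrates that mapping; the function name hailo_meta_package and the hardware labels (ai-kit, ai-hat-plus, ai-hat-plus-2) are hypothetical, while the package names come from this section.

```shell
# Hypothetical helper: maps a hardware label to the package named in this section.
hailo_meta_package() {
    case "$1" in
        ai-kit|ai-hat-plus) echo "hailo-all" ;;
        ai-hat-plus-2)      echo "hailo-h10-all" ;;
        *)
            echo "unknown hardware: $1" >&2
            return 1
            ;;
    esac
}
```

For example, sudo apt install dkms "$(hailo_meta_package ai-hat-plus-2)" would install the AI HAT+ 2 package.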
Reboot and verify
After installing the required dependencies, you must reboot your Raspberry Pi 5. You can do this from the Raspberry Pi Terminal using the following command:
$ sudo reboot
When your Raspberry Pi 5 has finished booting back up again, run the following command to check that everything is running correctly:
$ hailortcli fw-control identify
If you see output similar to the following, you’ve successfully installed the NPU and its software dependencies:
Executing on device: 0000:01:00.0
Identifying board
Control Protocol Version: 2
Firmware Version: 4.17.0 (release,app,extended context switch buffer)
Logger Version: 0
Board Name: Hailo-8
Device Architecture: HAILO8L
Serial Number: HLDDLBB234500054
Part Number: HM21LB1C2LAE
Product Name: HAILO-8L AI ACC M.2 B+M KEY MODULE EXT TMP
AI HAT+ and AI HAT+ 2 might show <N/A> for Serial Number, Part Number and Product Name. This is expected and doesn’t affect functionality.
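If you want to check the result from a script instead of reading it by eye, one option is to extract a single field from the identify output. The following is a minimal sketch; the function name hailo_device_arch is hypothetical, and it assumes that the output keeps the "Device Architecture:" label shown in the example above.

```shell
# Hypothetical helper: reads `hailortcli fw-control identify` output on stdin
# and prints the value of the "Device Architecture" field.
# Returns a non-zero status if the field is missing.
hailo_device_arch() {
    awk -F': ' '
        /Device Architecture:/ {
            sub(/[ \t\r]+$/, "", $2)   # strip trailing whitespace
            print $2
            found = 1
        }
        END { exit !found }
    '
}
```

Used as hailortcli fw-control identify | hailo_device_arch, this prints HAILO8L for the example output in this section.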
Additionally, you can run dmesg | grep -i hailo to check the kernel logs, which is expected to output something like the following:
[ 3.049657] hailo: Init module. driver version 4.17.0
[ 3.051983] hailo 0000:01:00.0: Probing on: 1e60:2864...
[ 3.051989] hailo 0000:01:00.0: Probing: Allocate memory for device extension, 11600
[ 3.052006] hailo 0000:01:00.0: enabling device (0000 -> 0002)
[ 3.052011] hailo 0000:01:00.0: Probing: Device enabled
[ 3.052028] hailo 0000:01:00.0: Probing: mapped bar 0 - 000000000d8baaf1 16384
[ 3.052034] hailo 0000:01:00.0: Probing: mapped bar 2 - 000000009eeaa33c 4096
[ 3.052039] hailo 0000:01:00.0: Probing: mapped bar 4 - 00000000b9b3d17d 16384
[ 3.052044] hailo 0000:01:00.0: Probing: Force setting max_desc_page_size to 4096 (recommended value is 16384)
[ 3.052052] hailo 0000:01:00.0: Probing: Enabled 64 bit dma
[ 3.052055] hailo 0000:01:00.0: Probing: Using userspace allocated vdma buffers
[ 3.052059] hailo 0000:01:00.0: Disabling ASPM L0s
[ 3.052070] hailo 0000:01:00.0: Successfully disabled ASPM L0s
[ 3.221043] hailo 0000:01:00.0: Firmware was loaded successfully
[ 3.231845] hailo 0000:01:00.0: Probing: Added board 1e60-2864, /dev/hailo0
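The two lines that matter most in this log are the firmware load message and the final line that registers /dev/hailo0. As a rough sketch, a script could look for both. The function name hailo_driver_ready is hypothetical, and the message strings are taken from the example log above, so they might differ for other driver versions or NPUs.

```shell
# Hypothetical helper: reads kernel log lines on stdin and succeeds only if
# the firmware load message and a /dev/hailo device node both appear.
hailo_driver_ready() {
    local log
    log="$(cat)"
    case "$log" in
        *"Firmware was loaded successfully"*) ;;
        *) return 1 ;;
    esac
    case "$log" in
        *"/dev/hailo"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example: dmesg | grep -i hailo | hailo_driver_ready && echo "Hailo driver ready".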
Run vision AI models on Raspberry Pi 5
This section provides guidance for setting up a Hailo NPU with your Raspberry Pi 5 so that camera applications can run real-time AI tasks like object detection on camera input. The following instructions are relevant for the AI Kit, AI HAT+, and AI HAT+ 2.
Step 1. Install camera dependencies
First, you must install the Raspberry Pi camera software stack. The rpicam-apps package provides the camera utilities used by Raspberry Pi OS and includes the Hailo post-processing software demo stages required for vision AI pipelines.
Use the following command to install the latest rpicam-apps software package:
$ sudo apt update && sudo apt install rpicam-apps
Then, run the following command to ensure that the camera is operating correctly:
$ rpicam-hello
This starts the camera and shows a preview window for five seconds. If the preview window appears, the camera is set up correctly.
Step 2. Run real-time visual AI demos
After you’ve verified that everything is correctly installed, you can run camera AI demos using the rpicam-apps camera software. This software implements the AI demos through a post-processing framework that uses pre-trained neural networks to run AI inference on camera frames using the NPU.
To highlight some of the capabilities of the NPU, this section outlines some demos that showcase different models and post-processing stages, such as drawing bounding boxes around objects or pose lines around people. Results are displayed either visually on the live preview window (default) or as text in the Raspberry Pi Terminal.
The following demos use rpicam-hello, but you can also use other rpicam-apps, such as rpicam-vid for video recordings and rpicam-still for still images. These applications might require you to add or modify some command line options to make them compatible.
Object detection
The following rpicam-apps demos perform object detection using rpicam-hello with different YOLO models. Each demo draws bounding boxes around detected objects and supports optional flags to modify the output, such as -n to turn off the viewfinder and -v 2 to display textual output only.
Different demos have different tradeoffs in speed and accuracy. Run the following commands to try each of the demos on your Raspberry Pi 5:
| Model | Command | Post-processing stage |
|---|---|---|
| YOLOv6 | | Object detection. |
| YOLOv8 | | Object detection. |
| YOLOX | | Lightweight and fast object detection. |
| YOLOv5 | | People and face detection. |
Image segmentation
The following rpicam-apps demo uses rpicam-hello to perform object detection and then segments the object by drawing a colour mask on the viewfinder image. Run the following command to try the demo on your Raspberry Pi 5:
$ rpicam-hello -t 0 --post-process-file /usr/share/rpi-camera-assets/hailo_yolov5_segmentation.json --framerate 20
Pose estimation
The following rpicam-apps demo uses rpicam-hello to perform 17-point human pose estimation, drawing lines connecting the detected points. Run the following command to try the demo on your Raspberry Pi 5:
$ rpicam-hello -t 0 --post-process-file /usr/share/rpi-camera-assets/hailo_yolov8_pose.json
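The segmentation and pose demos both load a JSON file from /usr/share/rpi-camera-assets. To see which Hailo post-processing files are present on your system, you could list them with a small helper such as the following sketch. The function name list_hailo_stages is hypothetical, and it assumes that the demo files share the hailo_ prefix used by the two files shown above.

```shell
# Hypothetical helper: prints the names of hailo_*.json files in a directory.
# Defaults to the asset directory used by the demos in this section.
list_hailo_stages() {
    local dir="${1:-/usr/share/rpi-camera-assets}"
    local f
    for f in "$dir"/hailo_*.json; do
        [ -e "$f" ] || continue   # skip the unexpanded pattern when nothing matches
        basename "$f"
    done
}
```

Any file that it lists can then be passed to rpicam-hello with --post-process-file, in the same way as the commands above.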
Package versions for AI Kit and AI HAT+
If you want to use the AI Kit or AI HAT+ to run models that were generated with a specific version of the Hailo toolchain, you must ensure that you’re using compatible versions of the Hailo software packages and device drivers. These components don’t function correctly if their versions don’t match.
First, if you’ve previously used apt-mark to hold any of the relevant packages, you might need to unhold them with the following command:
$ sudo apt-mark unhold hailo-tappas-core hailort hailo-dkms
You can then install the required version of the software packages:
To install version 4.19 of Hailo’s neural network tooling, run the following commands:
$ sudo apt install hailo-tappas-core=3.30.0-1 hailort=4.19.0-3 hailo-dkms=4.19.0-1 python3-hailort=4.19.0-2
$ sudo apt-mark hold hailo-tappas-core hailort hailo-dkms python3-hailort
To install version 4.18 of Hailo’s neural network tooling, run the following commands:
$ sudo apt install hailo-tappas-core=3.29.1 hailort=4.18.0 hailo-dkms=4.18.0-2
$ sudo apt-mark hold hailo-tappas-core hailort hailo-dkms
To install version 4.17 of Hailo’s neural network tooling, run the following commands:
$ sudo apt install hailo-tappas-core=3.28.2 hailort=4.17.0 hailo-dkms=4.17.0-1
$ sudo apt-mark hold hailo-tappas-core hailort hailo-dkms
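In each set of versions listed above, hailort and hailo-dkms share the same major.minor release (for example, 4.19), while hailo-tappas-core uses its own numbering. Under that assumption, which is inferred only from the versions listed here, the following sketch compares two installed package versions. The function names hailo_release and hailo_versions_match are hypothetical.

```shell
# Hypothetical helper: prints the "major.minor" part of a package version,
# for example 4.19.0-3 becomes 4.19.
hailo_release() {
    printf '%s\n' "$1" | cut -d. -f1,2
}

# Hypothetical helper: succeeds when two versions share the same major.minor.
hailo_versions_match() {
    [ "$(hailo_release "$1")" = "$(hailo_release "$2")" ]
}
```

On a Raspberry Pi, you could feed it the installed versions, for example hailo_versions_match "$(dpkg-query -W -f='${Version}' hailort)" "$(dpkg-query -W -f='${Version}' hailo-dkms)". To see which packages are currently held, run apt-mark showhold.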
Further resources
-
For Hailo’s own set of demos that you can run on a Raspberry Pi 5, see the hailo-rpi5-examples GitHub repository.
-
For Hailo’s model zoo, which contains a large number of neural networks, see the Hailo Model Explorer.
-
For discussion on Hailo’s hardware and tooling, see the Hailo Community forum and Developer Zone.
Run LLMs on Raspberry Pi 5 (AI HAT+ 2 only)
This section provides instructions for setting up the Hailo-10H NPU on your AI HAT+ 2 with a Raspberry Pi 5 so that you can locally run Large Language Models (LLMs). With the following setup, you can access LLMs through:
-
POST requests (API calls), where you directly send queries to the hailo-ollama server.
-
The Web UI frontend, where you use a browser-based chat-like interface.
Components for running local LLMs
Running local LLMs on a Raspberry Pi 5 involves several system layers that work together to enable hardware-accelerated inference.
Hardware layer
The physical compute required to run local LLMs comes from:
-
A Raspberry Pi 5, which provides the main CPU, memory, and general-purpose I/O required to run the OS, manage the software, and co-ordinate communication with the AI accelerator.
-
A Hailo AI accelerator chip (Hailo-10H NPU provided through the AI HAT+ 2), which provides the neural processing for AI inference, allowing you to run local LLMs.
For more information about these prerequisites, see Hardware prerequisites.
Software layer
To use the Hailo NPU for local LLMs on your Raspberry Pi 5, you must install a set of software dependencies, drivers, and runtime components.
Software dependencies are needed for running LLMs (as well as vision AI) on Raspberry Pi 5. For information about required dependencies and instructions for installing them, see Software prerequisites > Install required dependencies.
AI model layer
The Hailo Gen-AI Model Zoo contains the pre-trained LLMs suitable for running on Hailo-10H. These models are installed as part of Step 1. Install the Hailo Ollama server. These models are then loaded by the runtime and run by the AI accelerator.
Backend layer
LLMs are loaded and run by the Hailo Ollama server. This backend layer:
-
Loads LLMs from the Hailo Gen-AI Model Zoo.
-
Manages inference on the Hailo-10H NPU.
-
Exposes a REST API for sending requests (submitting prompts) to the NPU and returning AI inference results (receiving responses).
The Hailo Ollama server is installed in Step 1. Install the Hailo Ollama server, and started in Step 2. Start the Hailo Ollama server and run LLMs.
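As a rough illustration of this REST API, the following sketch wraps the /api/chat request from Step 2. Start the Hailo Ollama server and run LLMs. The function names (json_escape, chat_payload, ask_llm) and the HAILO_OLLAMA_URL variable are hypothetical; the URL and endpoint are the ones used in Step 2. The escaping is deliberately simple and covers only backslashes and double quotes, not newlines or control characters.

```shell
# Base URL of the local hailo-ollama server (value taken from Step 2).
HAILO_OLLAMA_URL="${HAILO_OLLAMA_URL:-http://localhost:8000}"

# Escapes backslashes and double quotes so text can sit inside a JSON string.
json_escape() {
    printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Builds the JSON body for a single user message.
# $1: model name (for example, qwen2:1.5b); $2: prompt text.
chat_payload() {
    printf '{"model": "%s", "messages": [{"role": "user", "content": "%s"}]}\n' \
        "$(json_escape "$1")" "$(json_escape "$2")"
}

# Sends the prompt to the /api/chat endpoint and prints the raw response.
ask_llm() {
    curl --silent "$HAILO_OLLAMA_URL/api/chat" \
        -H 'Content-Type: application/json' \
        -d "$(chat_payload "$1" "$2")"
}
```

After the server is running and a model has been downloaded, a call such as ask_llm "qwen2:1.5b" "Translate to French: The cat is on the table." sends the same kind of request as the curl command in Step 2.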
Frontend layer (optional)
The frontend layer provided by Open WebUI is an optional, browser-based chat interface for interacting with LLMs. Alternatively, you can keep interacting with LLMs through the terminal-based POST requests described in Step 2. Start the Hailo Ollama server and run LLMs.
While Open WebUI isn’t required to run LLMs, it provides a more user-friendly way to submit prompts and view responses instead of terminal-based POST requests. Open WebUI communicates with the Hailo Ollama server through its REST API and displays model outputs in a conversational UI.
Open WebUI needs to run in a Docker container. This is because Open WebUI is incompatible with Python 3.13 (as used on Raspberry Pi OS Trixie). Docker provides a containerised environment for stable operation.
Open WebUI is set up as a combination of Step 3. Install Docker (for Open WebUI) and Step 4. Install and use Open WebUI (optional).
Step 1. Install the Hailo Ollama server
The Hailo Model Zoo GenAI Debian package provides:
-
Pre-trained LLMs optimised for the Hailo-10H NPU.
-
The Hailo Ollama server that exposes a REST API for model inference.
To install the Hailo Ollama server, download version 5.1.1 of the Hailo Model Zoo GenAI Debian package for Raspberry Pi 5, and then install it using the following command in a Raspberry Pi Terminal:
$ sudo dpkg -i hailo_gen_ai_model_zoo_5.1.1_arm64.deb
Step 2. Start the Hailo Ollama server and run LLMs
After everything is installed, start the local hailo-ollama server to expose a REST API for LLM requests, and then download and run some LLMs.
-
In a Raspberry Pi Terminal, run the following command to start the local hailo-ollama server:
$ hailo-ollama
-
In a new Terminal window, run the following command to get a list of LLMs:
$ curl --silent http://localhost:8000/hailo/v1/list
-
Run the following command to download a model from the provided list, replacing "examplemodel:tag" with any listed model (for example, "qwen2:1.5b"):
$ curl --silent http://localhost:8000/api/pull \
  -H 'Content-Type: application/json' \
  -d '{ "model": "examplemodel:tag", "stream" : true }'
-
Run the following command to send a query to the LLM with a POST request, replacing "examplemodel:tag" with whichever model you’ve downloaded and want to run:
$ curl --silent http://localhost:8000/api/chat \
  -H 'Content-Type: application/json' \
  -d '{"model": "examplemodel:tag", "messages": [{"role": "user", "content": "Translate to French: The cat is on the table."}]}'
Step 3. Install Docker (for Open WebUI)
You can skip this step if you don’t plan to use the Open WebUI interface.
The following instructions install Docker on your Raspberry Pi 5, which is a prerequisite for deploying and running Open WebUI, described in Step 4. Install and use Open WebUI (optional). For more comprehensive installation instructions, see Docker’s page: Install Docker Engine on Debian.
To install Docker:
-
Remove any existing Docker packages:
$ sudo apt remove $(dpkg --get-selections docker.io docker-compose docker-doc podman-docker containerd runc | cut -f1)
-
Install the Docker apt repository:
# Add Docker's official GPG key:
$ sudo apt update
$ sudo apt install ca-certificates curl
$ sudo install -m 0755 -d /etc/apt/keyrings
$ sudo curl -fsSL https://download.docker.com/linux/debian/gpg -o /etc/apt/keyrings/docker.asc
$ sudo chmod a+r /etc/apt/keyrings/docker.asc
# Add the repository to Apt sources:
$ sudo tee /etc/apt/sources.list.d/docker.sources <<EOF
Types: deb
URIs: https://download.docker.com/linux/debian
Suites: $(. /etc/os-release && echo "$VERSION_CODENAME")
Components: stable
Signed-By: /etc/apt/keyrings/docker.asc
EOF
$ sudo apt update
-
Check that the docker.sources file has been created correctly:
$ cat /etc/apt/sources.list.d/docker.sources
The output is expected to list the following, where Suites is the VERSION_CODENAME of your operating system (trixie):
Types: deb
URIs: https://download.docker.com/linux/debian
Suites: trixie
Components: stable
Signed-By: /etc/apt/keyrings/docker.asc
-
Install and run the Docker service:
$ sudo apt install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
$ sudo systemctl start docker
-
Create a docker group:
$ sudo groupadd docker
-
Add your user to the docker group:
$ sudo usermod -aG docker $USER
-
Sign out and back in again so that your group membership is re-evaluated, or run the following command to activate changes to the group:
$ newgrp docker
-
Test Docker:
$ docker run hello-world
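If the test fails with a permissions error, the group change might not have taken effect in your current session. As a small sketch, the following helper checks a list of group names for docker. The function name in_docker_group is hypothetical.

```shell
# Hypothetical helper: succeeds if "docker" appears in a space-separated
# list of group names, such as the output of `id -nG`.
in_docker_group() {
    case " $1 " in
        *" docker "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, in_docker_group "$(id -nG)" && echo "docker group active" reports whether the current session already has the new membership.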
Step 4. Install and use Open WebUI (optional)
After installing Docker in the previous step, you can deploy and run an Open WebUI container with Docker.
|
Note
|
The following method is one way to deploy and use containers, but it’s not the only approach. For example, you could instead use docker-compose for container management.
|
-
To use Open WebUI, you first need to install it. Download the Open WebUI image required to run the frontend layer:
$ docker pull ghcr.io/open-webui/open-webui:main
-
Ensure that hailo-ollama is already running. Then, start the Open WebUI container and connect it to the hailo-ollama backend server:
$ docker run -d -e OLLAMA_BASE_URL=http://127.0.0.1:8000 -v open-webui:/app/backend/data --name open-webui --network=host --restart always ghcr.io/open-webui/open-webui:main
-
Monitor container startup. The container can take up to a minute to initialise. To view progress and logs, run the following command and then wait until the logs indicate that the server is running and ready to accept connections:
$ docker logs open-webui -f
-
To access Open WebUI, open a web browser and enter the following URL: http://127.0.0.1:8080. This opens a chat interface where you can select a model and begin interacting with the LLM.
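If you prefer Docker Compose for container management, the following sketch writes a Compose file that is intended to mirror the docker run command in this step: the same image, container name, environment variable, volume, host networking, and restart policy. It is an untested translation rather than an official configuration, and the compose.yaml file name is an assumption (it is the default name that Docker Compose looks for).

```shell
# Writes a Compose file to the current directory. The quoted 'EOF' stops the
# shell from expanding anything inside the file contents.
cat > compose.yaml <<'EOF'
services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    network_mode: host
    restart: always
    environment:
      - OLLAMA_BASE_URL=http://127.0.0.1:8000
    volumes:
      - open-webui:/app/backend/data

volumes:
  open-webui:
EOF
```

With the docker-compose-plugin package from Step 3 installed, you could then start the container with docker compose up -d from the directory that contains the file, instead of using docker run.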
For more detailed instructions, see the Quick Start guide in the Open WebUI documentation.