Search our documentation by meaning, not keywords

Raspberry Pi is always looking for new and better ways to do things. One of the most significant ways in which AI has affected the internet is the proliferation of chatbots and answer generators, and while we’ll never use AI to author our documentation, we’re curious to find out whether tools like these could help users find technical information more easily and quickly.

Something you may have heard about is retrieval-augmented generation (RAG for short), which uses a defined set of documentation to answer questions for you in an informed way. In this blog, we are trialling two different RAG-based documentation tools: InKeep and Kapa. Our website is serving two versions of today’s post to visitors at random; if you reload the page, you’ll get the other tool. Why not type a question about something you’d like to do with a Raspberry Pi into the chatbot below and see how it responds?

We’d love you to think about whether the tool did a decent job of answering your question — give the response a thumbs up or a thumbs down and get ready to tell us anything else you want us to bear in mind when you get to the comments section. We take our documentation very seriously, and your thoughts and opinions will help us augment and improve it.

So what's special about this?

The chatbots use Raspberry Pi documentation as their source material, including the HTML documentation, the white papers and books stored on pip.raspberrypi.com, and some specific GitHub repositories that have documentation built into them (for example, rpi-image-gen). Because of this, each tool can search within the documentation for the chunks of information most relevant to your question and feed them into the AI model. It then asks the model your question, and the model decides whether it actually has relevant information and generates an answer if it does.

How does it work? Isn't it just a fancy search engine?

When an LLM ingests text, the text is first converted into tokens. You can think of a token as something that represents a single word, although many tokens represent sub-words ('running' might be broken up into 'run-' and '-ning', for example). The next stage is to convert those tokens into 'embeddings'. Embeddings are vectors in a very high-dimensional space (typically hundreds or thousands of dimensions), and these vectors represent the "semantic meaning" of the token. Two vectors that sit close together have a similar meaning. Here's the canonical example:

The diagram above shows an example displayed in three dimensions (because we can't even visualise what four spatial dimensions would look like). You could think of the dimensions as terms like 'gender', 'royalty', and '80s music reviews'! If you take the embedding for 'woman' and subtract the embedding for 'man', you get a vector that roughly means "the shift from male to female". Add that same shift to 'king', and you land very close to 'queen'.

This isn't a party trick. It tells you something genuinely useful: nearby coordinates in this space mean similar things, and the same kind of relationship between pairs of words shows up as the same kind of geometric shift. The model has learnt, from billions of words of training data, that the relationship between 'king' and 'queen' is about the same as the one between 'man' and 'woman', and it has encoded that in where the points sit.
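To make the arithmetic concrete, here is a minimal sketch in Python. The three-dimensional vectors are invented by hand purely for illustration (real embeddings are learnt by a model and have far more dimensions), and the `analogy` helper is a hypothetical name, not part of any library.

```python
import math

# Hand-made toy embeddings: (gender, royalty, something unrelated).
# These numbers are assumptions for illustration; real embeddings are learnt.
EMBEDDINGS = {
    "man": (1.0, 0.0, 0.1),
    "woman": (-1.0, 0.0, 0.1),
    "king": (1.0, 1.0, 0.2),
    "queen": (-1.0, 1.0, 0.2),
    "apple": (0.0, -0.5, 0.9),
}


def cosine(a, b):
    """Cosine similarity: 1.0 means the two vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def analogy(a, b, c):
    """Find the word closest to vec(b) - vec(a) + vec(c), ignoring the inputs."""
    target = tuple(
        vb - va + vc
        for va, vb, vc in zip(EMBEDDINGS[a], EMBEDDINGS[b], EMBEDDINGS[c])
    )
    candidates = [word for word in EMBEDDINGS if word not in (a, b, c)]
    return max(candidates, key=lambda word: cosine(EMBEDDINGS[word], target))


# 'woman' minus 'man' plus 'king' lands on 'queen' in this toy space.
print(analogy("man", "woman", "king"))
```

With learnt embeddings the result is only approximately 'queen', which is why the lookup picks the nearest word rather than testing for an exact match.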

An extension of this concept is to develop embedding models that can take a sequence of tokens (like a paragraph of, say, 100 words) and create an embedding (vector) from it that gives the semantic meaning of the section of text. This is the type of embedding model used in RAG-based systems, as we want them to find chunks of text that have similar meanings.

Chunking

Once you notice this property, it's fairly clear what to do with it. Take all your documentation, chop it into reasonably sized chunks — a paragraph or two each — and generate an embedding for every chunk. You now have a database in which similar meanings are close together. 
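As a rough illustration of the chopping step, the sketch below groups paragraphs into chunks of a limited size. It assumes paragraphs are separated by blank lines; the function name `chunk_paragraphs` and the 120-word limit are choices made for this example, not a description of how InKeep or Kapa chunk our documentation.

```python
def chunk_paragraphs(text, max_words=120):
    """Group consecutive paragraphs into chunks of roughly max_words words.

    A single paragraph longer than max_words is kept whole as its own chunk.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks = []
    current = []
    count = 0
    for para in paragraphs:
        n_words = len(para.split())
        # Start a new chunk if adding this paragraph would exceed the limit.
        if current and count + n_words > max_words:
            chunks.append("\n\n".join(current))
            current = []
            count = 0
        current.append(para)
        count += n_words
    if current:
        chunks.append("\n\n".join(current))
    return chunks


sample = "one two three\n\nfour five six\n\nseven eight nine"
for chunk in chunk_paragraphs(sample, max_words=6):
    print(repr(chunk))
```

Each chunk would then be passed to an embedding model, and the resulting vector stored alongside the chunk text.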

When someone asks a question, the system generates an embedding for that question and goes looking through the database for the embeddings that sit closest to it. The closest ones are, by construction, the chunks of documentation whose meaning is most similar to the question. You haven't had to guess the author's vocabulary; the model has done that work for you.
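The lookup itself can be sketched in a few lines. In this example the chunk titles and their three-dimensional vectors are made up by hand, and `query_vec` stands in for the embedding of a networking question; a real system would get all of these from an embedding model and would usually use a dedicated vector database rather than a Python list.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, index, k=2):
    """Return the k chunk texts whose vectors are closest to query_vec.

    index is a list of (chunk_text, vector) pairs.
    """
    ranked = sorted(
        index,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


# Hypothetical chunks with hand-written toy vectors (assumptions, not real data).
INDEX = [
    ("How to set a static IP address", (0.9, 0.1, 0.0)),
    ("Connecting a camera module", (0.0, 0.9, 0.2)),
    ("Configuring Wi-Fi from the command line", (0.8, 0.0, 0.3)),
]

query_vec = (1.0, 0.0, 0.1)  # pretend embedding of a networking question
print(top_k(query_vec, INDEX, k=2))
```

Sorting every chunk is fine for a handful of entries; at documentation scale, an approximate nearest-neighbour index does the same job far faster.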

Finally, generation

At this point, the system has a list of the most relevant bits of documentation. It could just display them to the user as is — and plenty of search systems do stop there — but it can go one step further: it can hand those chunks, along with the original question, to a language model and ask it to write an answer using only those chunks.

That's the 'generation' half of 'retrieval-augmented generation'. The retrieval step finds the right documentation, and the generation step turns it into a tidy, plain-English answer. The model isn't drawing on whatever it happens to remember from the training data, which is where language models tend to get themselves into trouble; it's giving an answer from your actual documentation and nothing else. If the answer isn't there, the tool will state that, rather than making something up.
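Here is one way the hand-off to the language model might look, as a sketch only. The prompt wording, the `build_prompt` name, and the `min_score` similarity cut-off are all assumptions for this example; we are not describing how InKeep or Kapa actually decide that the documentation lacks an answer.

```python
def build_prompt(question, retrieved, min_score=0.5):
    """Assemble a prompt from retrieved chunks, or return None if none qualify.

    retrieved is a list of (chunk_text, similarity_score) pairs.
    """
    relevant = [text for text, score in retrieved if score >= min_score]
    if not relevant:
        # Nothing close enough: the caller should say so rather than guess.
        return None
    excerpts = "\n\n".join(
        f"[{i}] {text}" for i, text in enumerate(relevant, start=1)
    )
    return (
        "Answer the question using only the numbered excerpts below. "
        "If they do not contain the answer, say so.\n\n"
        f"Excerpts:\n{excerpts}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Hypothetical retrieval results with made-up scores.
retrieved = [
    ("To shut down safely, run sudo poweroff.", 0.91),
    ("The camera module connects to the CSI port.", 0.12),
]
prompt = build_prompt("How do I shut down a headless Raspberry Pi?", retrieved)
print(prompt if prompt is not None else "The documentation does not cover this.")
```

The resulting string would be sent to whichever language model the system uses; numbering the excerpts makes it easy for the answer to cite its sources.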

Opinions, please

Please have a go and give us feedback on what you found; convenient thumbs-up and -down buttons are included in the chat interface. That information, along with the interaction, will be stored in the system so that we can determine what works and what doesn't.

We will never use AI to create documentation in the first place. Instead, we are hoping to use these tools to help inform us where our documentation has gaps or errors; we put a lot of effort into creating it, and we want it to be as accurate, complete, and straightforward to use as possible. Our documentation team and I are excited to scrutinise the results to discover what they reveal about your needs and how effectively we serve the information you want.

35 comments

Grzesiek

I asked about connecting a single module with two sensors to a Raspberry Pi Pico W.
The first model started by saying that it had no knowledge of the module but that it should work over I2C, and that I would be best off checking the module's documentation.
The second model got straight to the point: how it works and how to connect it.
I don't know how there can be such a big discrepancy on a question about a simple matter.
I'll keep testing 😉
Regards

Steve Brook

I asked two questions. 1) can a raspberry pi run ai (note I tried no capitals). It understood the question well and gave a very comprehensive response, giving examples of hardware and software. Score: impressive. 2) I asked if a Raspberry PI could run engine management software. I was amazed that it completely understood my question. The AI came back with the fact it did not have specific info, but it gave pointers on the I/O capabilities of hardware etc. It then asked if I wanted it to look at broader knowledge than its manuals. It then posted another line of info, but it did remember that it asked me a question. I replied simply “yes” and it went on to check broader knowledge. It came back with limitations of software and hardware on the PI family and that EMSoftware requires Real Time responses. Overall I am very impressed with this AI bot. It has a scary understanding of how my mind works. Are you sure you don’t have a Nerd tied up in a box with a keyboard, at PI headquarters, pretending to be AI?

Gordon Hollingworth

We have a whole floor of nerds… But they’re not tied up!

ukscone

Have you checked on the ones locked in the basement ?

Gordon Hollingworth

They’re only ever let out on a leash…

James Hughes

Shhh. The reason that so much money is being pushed into AI, is actually to clone all the nerds and buy boxes and keyboards for them.

Otto Schäfer

Just asked how to install Nextcloud on the Pi. It gave me general guidance, which was quite good at the beginning but ended with the Apache configuration without details. My question was “How to install nextcloud on a Raspberry Pi5”

Michael

I asked for current camera controls and got back useful summaries from three different documents, all clearly referenced. It would be useful to have a “copy” button to save the answer easily for reference.

PeterF

I tried two questions, one to each machine to see how much help one could get when trying to use nginx with Nextcloud, instead of Apache. The questions were:

– how to install nextcloud on Raspberry Pi using nginx (Kapa)
– how to configure nginx for optimal use with nextcloud (InKeep)

The Kapa answer was quite sparse, and would indicate at first glance that there’s not much help available on configuring nginx, particularly for use with Nextcloud.

The InKeep answer finished with a question on whether I would like a general answer. Replying “yes” provided quite a long (cf. Kapa) list of helpful suggestions. I can’t check at present whether it’s comprehensive enough to get a working installation as I’m away, and can’t access my notes.

I do realise that there is some practical help for this combination outside Raspberry Pi documentation.
Thanks for letting me be a guinea pig!
PeterF

Paul Hutchinson

Yesterday I searched the doc, forum and web for some old information I knew existed about RaPi0W UARTS and bluetooth configuration. I never found what I was looking for, but the AI dug it out for me this morning.
Yay!, thanks for this useful tool.

Jack

I generally find AI chatbots rather frustrating, either because their training is at book level, or because they are directed toward some marketing solution (buy this, to fix that).

solar3000

hhhmmmm another human replaced.

Gordon Hollingworth

No, no human has been replaced. This is a glorified search engine which generates a bit of text; choosing not to try it is like saying you aren’t going to use Google because you’d rather read the internet yourself.
It is only trying to help people use the documentation we spend a great deal of time writing and fixing by (human) hands.

Testing

I had one question: “What command should I use to set automatic upgrades?”. I got the advice sudo rpi-eeprom-update -a, which is of course partly correct.
I asked a follow-up question: “What if I want to set automatic updates for all programs?” The answer given was:
“The knowledge sources do not explicitly cover how to configure automatic unattended upgrades for all packages (e.g., using unattended-upgrades). They only describe the manual APT update process and the bootloader’s automatic update service.

What the sources do recommend is:

Running sudo apt update && sudo apt full-upgrade regularly to keep your software, Linux kernel, and firmware up to date. [Manage software packages with APT]
For SSH specifically, setting up a daily cron job running apt install openssh-server to ensure the SSH server stays updated independently. [Secure your Raspberry Pi]

For a fully automated solution covering all packages, I’d recommend consulting the broader Raspberry Pi or Debian documentation, as this topic isn’t covered in the available knowledge sources.”

Looks like the AI can’t find the answer or there is no answer. If it’s in the broader Raspberry Pi documentation, the AI should be able to find it.

Gordon Hollingworth

This is exactly the kind of feedback we’ll be working through with the Kapa/InKeep systems. We will have your conversation in the database with a note next to it saying how certain the tool is about its answer. Your particular conversation has been flagged as “Uncertain”, and clearly there’s missing information there about the updater plugin and, for example, unattended-upgrades. We’ll have a look at adding something along those lines.

Simon

One benefit from this sort of open question searching is that it might highlight some of the “knowledge” people are looking for that isn’t in the documentation.

Existing keyword-based searches (and their lack of results) already allow this, but the intent and context in user prompts may make it easier to understand what people are actually trying to find out. Where the AI fails to be “certain” about what has been asked and can’t find an answer, those areas of knowledge can be added to the documentation (by humans who do know), making the documentation, and this AI using it, more useful and helpful.

Anders

I asked how to install the Zig language on Raspberry Pi OS.
The first response was that there is no specific documentation but that it could provide general guidance.

I accepted the offer and it gave me a perfect set of instructions for setting up from the downloaded .tar.

D

Q1. I want to make an application start up after the GUI launches
A1: I don’t have specific information about configuring applications to start automatically after the GUI launches in the available Raspberry Pi documentation sources.

Q2. How do I set a static IP address
A2. run nmcli
Q3. What version of OpenCV works best with LCCV
A3. It didn’t know.

To be honest, for my needs (we do robotics and OpenCV with Raspberry Pis) I just loaded all the docs and some other websites into NotebookLM and let it figure out most of these questions for me. It seems to work better… mostly

ZJ

I asked three questions:

I wanted to know whether the Raspberry Pi CM5 series compute module supports RTC and what the purpose of this RTC is.

When using a Pi 4, after repeatedly power cycling it, an issue occurs where the device’s /etc/fstab becomes empty. This causes the system to boot but fail to mount the /boot partition, leaving the filesystem in a read-only state.

After flashing the Debian 13 system, I once got stuck at the login screen. I confirmed that the username and password I configured were correct, but after entering them, the login screen would briefly flash and then return, repeatedly preventing me from proceeding. I eventually had to reflash the system.

I asked the AI for solutions. For the first question, it provided a detailed and accurate answer, citing relevant CM5 datasheets as evidence, which was precise in terms of document retrieval.

However, when I inquired about the latter two practical usage issues, its response was unsatisfactory, merely focusing on avoiding repeated power cycling and suggesting the use of a better SD card. I’m wondering if this AI chatbot is connected to Raspberry Pi forums, as some user issues might have related solutions available in the forums.

Paul Salmon

It lacks “Forum Knowledge”: Raspberry Pi has over a decade of community troubleshooting locked inside its forums. Because the RAG tool only scrapes official docs, it misses out on the collective human experience of fixing weird bugs.

Gordon Hollingworth

There are a number of problems with using the forums as a source of information:
1) The information provided in the forums can be wrong but there is often no feedback saying whether that’s the case.
2) The information is not checked by an engineer to make sure that the response is the best answer for the future as well as the present.
3) The information may be significantly out of date

Also the whole point of using the chatbot for us is to help understand what is missing from the documentation.

What I might look into is something of an agentic fall-back to a secondary documentation source. So if it cannot find the answer in the main source it then goes and searches in the forum posts.

unknownk

Proposal: A Vision for an AI-Powered “Bridge” OS and a New Way to Support the Foundation

This is a fantastic step toward making documentation more accessible! Building on this, I would love to propose a vision to further bridge the gap between “unfriendly” Linux systems and beginners:

1. AI-Powered “Explanation and Safe-Step” UI:
Integrate an AI layer into both the terminal and GUI error dialogs. Instead of just showing a cryptic error, provide an “Explain this” button. The AI would analyze the cause and present a “Draft Command” in an editable text box. Users can review, learn the parameters, and then execute it manually.

2. The “Human-Centric” Safety Net:
The AI should act as a “supporter,” not a “replacement.” For dangerous commands (like rm -rf), the AI should warn the user: “Are you sure? You are about to delete a system folder.”

3. Gamified Learning and “Chotto Dekiru” Rewards:
Track the user’s progress locally. As they learn, their “Linux Level” increases (from Beginner to the legendary “Chotto Dekiru” rank). Upon reaching a new rank, the system generates an encrypted code.

4. The “Right to Donate”:
Users can send this code to the Foundation to unlock the “Right to Donate.” After reaching a certain rank, you can choose an amount to donate (with a minimum set) to receive a rank-exclusive T-shirt (e.g., the “Linux Chotto Dekiru” shirt). It is not just a purchase; it is a badge of honor for supporting the community with your grown skills!

(Disclaimer: I know I am shamelessly dumping this massive, unasked-for idea here without any technical contribution on my part… yet! haha. But I believe this “Bridge” could turn Linux from a “cryptic wall” into a supportive RPG for everyone.)

Simon

The InKeep one gave better responses for my specific conversation.

I feel the formatting of the InKeep responses was also “nicer” visually, particularly the ‘.bg-inkeep-primary-medium’ colouring of some code-specific info in answers, whereas the Kapa responses are a light grey on white, which didn’t stand out as well for me. The more condensed (taking up less space) ‘sources’ section on the Kapa responses was better than the InKeep one, which seems to take up too much room, especially when it provides a lot of sources. However, I do like how InKeep gives the ‘breadcrumb’ navigation of where the source info is, such as “Docs > Microcontrollers > MicroPython”, as that makes it easier for me to find it myself if I were to browse the documentation, because I know the route to take to get to that area of the documentation.

As long as the Raspberry Pi Documentation AI keeps to facts, is accurate, and doesn’t have a “personality” like so many AIs appear to gain from their “learning”, I think it’ll be ok.

Anders

Have you set up a RAG configuration with a local model?

Anders

Ahh I didn’t need to ask here when the assistant could tell me. I have the answer thanks.

rossifr

Hi,
Could you please allow the AI assistant access to the Raspberry Pi forum posts?
Thank you for this new tool.

Carlos Luna

I tried using the chatbot but it kept saying “something went wrong” until I disabled my Ad Blocker and Disconnect.me browser extensions.
When I finally got it working, I asked it about recording high-speed video with the Pi Camera 3 and Python, but to push it a little further I wrote my question in Spanish. It did a good job of giving me the parameters to use for getting the best results at 120fps, and it answered in Spanish too. As an example, it gave me the following command:

$ rpicam-vid --level 4.2 --framerate 120 --width 1280 --height 720 --save-pts timestamp.pts -o video.264 -t 10000 --denoise cdn_off -n

The only thing I didn’t like about this is that at the end of the answer it added some tips to get better performance and recommended using the force_turbo=1 flag, but never mentioned anything about the possibility of voiding the warranty by doing so (unless this is no longer the case).
All in all, nice little tool. I have to say that some time ago I toyed around with this same idea and even cloned your documentation repo to build embeddings to get the results I needed. Nowadays we are using NotebookLM in a similar way to help our agents deal with clients’ questions.
Would it be possible to have something like this running on the AI HAT+? I guess I’ll have to find some time to test it…

Paul Duggan

I asked it whether I could run dotnet on a Pi and it confirmed that I could. I also asked whether I could control GPIO pins, and it gave a reasonable answer. If you add https://www.petecodes.co.uk/ to the training material it will give great answers on this topic!

Sam Downie

I asked “What Are You” and its response was: “I’m an AI Assistant knowledgeable about Raspberry Pi, built by the team at inkeep.com (a company that builds AI assistants for developers). I use documentation and other sources to help answer questions you might have. Let me know how I can be helpful.”
In your Raspberry Pi newsletter you state that you don’t use AI tools. So drop the use of this application. Gordon and Eben, you have a complaint email from me.

Helen Lynn

We said that we don’t use AI tools to create documentation. To create our documentation we use three human technical documentation experts, who draw on their own very substantial skills and experience as well as on the knowledge of all of our also-expert engineers, and I don’t mind telling you that it is not remotely easy to find this many people who are all this good, and we have no intention of replacing any of them with an AI tool. We didn’t, though, say we don’t use AI tools at all, ever, for anything – that would be daft.

Erle Williams

I asked the following question:
What is best way to shutdown a headless Pi-hole installation on a Raspberry Pi 4?
The reply was:
For a headless Pi-hole installation on a Raspberry Pi 4, the best way to shut down is using the command line since you won’t have desktop access.

Connect to your Raspberry Pi via SSH and run:
sudo poweroff
I like that the interface allows copying the commands to the clipboard and that the sources of the answers are also available.
Thanks for the opportunity to try out this new tool!

Marian

That was my question: Is it possible to make distributed computing on two or more raspberry pis. If so, how can I start to do it.
I got a nice, short but very basic answer about networking Raspberry Pis and how to share files between them on the network, along with the information that the documentation doesn’t include specific distributed computing software or cluster management tools.

Richard Whitaker

I used this query: “i am using a rpi2040 and need a program in c to write all the available outputs at the same time.” It wrote a C program and an explanation, both of which looked good. I then asked for the same but in Python. It pointed out that there is no equivalent to the function to write many outputs at the same time. Very useful.

HankB

I asked how to configure WiFi credentials on a Pi Zero W w/out a DE. I asked because I was trying to do this using `raspi-config`, `nmtui` and `nmcli` and kept getting errors. The LLM helpfully provided non-interactive `raspi-config` commands. I powered up the Pi to try and like magic, it associated with my AP. I guess that the desired changes were made during earlier attempts but did not work until the Pi was rebooted.

Anyway, I think it’s going to be very useful to train an LLM on the official documentation repository.

Thanks!

Epoch

I poked it on wireless networking. It’s not bad, but it’s a bit shallow in its knowledge.
Had it read RFCs it would have known how to tell me why I can’t bridge a wireless interface in client mode.
A good start and definitely the way to the future!
