Run a Local LLM on a Raspberry Pi 5 with Ollama

Turn a Raspberry Pi 5 into a private, offline AI chatbot. Install Ollama, pull a small model like Llama 3.2 or Phi-3, and chat with an LLM that runs entirely on your own hardware — no cloud, no fees.

What you’ll need

A Raspberry Pi 5 — get the 8GB or 16GB model. RAM is the hard limit on which models you can run, so more is better here.
Active cooling (the official Active Cooler or a case with a fan). Inference keeps the CPU under sustained heavy load.
A quality power supply and a 32GB+ microSD card or, better, an NVMe SSD (models are several gigabytes each).
Raspberry Pi OS (64-bit) already set up. New to the Pi? Start with our Raspberry Pi 5 getting-started guide.
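
Before going further, it may help to confirm the two requirements that matter most: a 64-bit OS and enough RAM. The sketch below is one way to check. The helper name kb_to_mib is made up for this example, and the script assumes a Linux system that exposes /proc/meminfo.

```shell
# Preflight sketch: check for a 64-bit userland and report total RAM.

# Hypothetical helper: convert a kB value from /proc/meminfo to MiB.
kb_to_mib() { echo $(( $1 / 1024 )); }

# getconf LONG_BIT reports the word size of the running userland (32 or 64).
bits="$(getconf LONG_BIT 2>/dev/null || echo unknown)"
if [ "$bits" = "64" ]; then
  echo "OK: 64-bit userland"
else
  echo "Warning: expected a 64-bit OS, got: $bits"
fi

# MemTotal is listed in kB.
if [ -r /proc/meminfo ]; then
  total_kb="$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)"
  echo "Total RAM: $(kb_to_mib "$total_kb") MiB"
fi
```

The reported total is typically a little below the nominal figure, since some memory is reserved by the system.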

Reality check: the Pi has no dedicated GPU memory, so Ollama runs small models (roughly 1-4B parameters) on the CPU. Expect a usable, conversational pace rather than the instant replies of a desktop GPU. For bigger models you’ll want more capable hardware; see our best hardware for local LLMs guide.

Step 1: Update the Pi

Open a terminal and make sure everything is current before installing:

sudo apt update && sudo apt upgrade -y

Reboot if the kernel updated:

sudo reboot

Step 2: Install Ollama

Ollama is the simplest way to download and run local models. It has a native ARM64 build, so the official one-line installer works directly on the Pi:

curl -fsSL https://ollama.com/install.sh | sh

The script installs Ollama and starts it as a background service. Confirm the install by printing the version:

ollama --version
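
The version command only shows that the binary is installed. To check that the background service is actually answering, something like the following sketch may help. OLLAMA_URL and api_up are names invented for this example; the /api/version endpoint and port 11434 are Ollama defaults.

```shell
# Sketch: check that the Ollama service and its local API are up.
# OLLAMA_URL is a variable made up for this example.
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

# Hypothetical helper: returns 0 if the API answers, non-zero otherwise.
api_up() {
  curl -fsS --max-time 5 "$1/api/version" >/dev/null 2>&1
}

if command -v systemctl >/dev/null 2>&1; then
  # Prints "active" when the systemd unit is running.
  systemctl is-active ollama || true
fi

if api_up "$OLLAMA_URL"; then
  echo "Ollama API is reachable at $OLLAMA_URL"
else
  echo "Ollama API is not reachable at $OLLAMA_URL"
fi
```

If the service is not active, sudo systemctl status ollama usually shows why.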

Step 3: Pull and run a small model

Model size must fit in the Pi’s RAM. Good choices for an 8GB Pi:

llama3.2:3b — a capable 3B all-rounder from Meta.
phi3:mini — Microsoft’s 3.8B model, strong at reasoning for its size.
gemma2:2b — Google’s small, fast 2B model.
llama3.2:1b — the fastest option when you want snappy replies over smarts.

Download and chat in one command:

ollama run llama3.2:3b

The first run downloads the model (a few gigabytes), then drops you into an interactive prompt. Type a question and it answers entirely offline. Type /bye or press Ctrl+D to exit.
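
You don’t have to use the interactive prompt. Passing the prompt as an argument to ollama run prints a single answer and returns to the shell, which is handy in scripts. The ask wrapper below is a hypothetical convenience function, not part of Ollama.

```shell
# Sketch: send a single prompt without entering the interactive session.
# "ask" is a hypothetical wrapper around "ollama run MODEL PROMPT".
ask() {
  model="$1"
  shift
  # Remaining arguments are joined into one prompt string.
  ollama run "$model" "$*"
}

# Housekeeping commands that ship with the Ollama CLI:
#   ollama list        # models downloaded to this Pi
#   ollama ps          # models currently loaded in memory
#   ollama rm MODEL    # delete a model to free disk space
```

For example: ask llama3.2:1b "Explain what a GPIO pin is in one sentence."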

Tip: start with a 1-3B model. If replies are too slow, drop to llama3.2:1b; if you have a 16GB Pi and want more capability, try a quantized 7-8B model and accept slower output.
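
As a rough way to judge whether a model will fit, you can compare its size against the memory that is currently free. The sketch below is only a rule of thumb, not something Ollama guarantees: fits_in_ram is a hypothetical helper, and the 1024 MiB default headroom for the context cache and the OS is an arbitrary assumption.

```shell
# Rule-of-thumb sketch: a model is likely to fit if its size plus some
# headroom is below the memory currently available.
# The 1024 MiB default headroom is an arbitrary assumption.
fits_in_ram() {
  model_mib="$1"
  avail_mib="$2"
  headroom_mib="${3:-1024}"
  [ $(( model_mib + headroom_mib )) -le "$avail_mib" ]
}

# MemAvailable in /proc/meminfo is in kB.
if [ -r /proc/meminfo ]; then
  avail="$(awk '/^MemAvailable:/ {print int($2 / 1024)}' /proc/meminfo)"
  avail="${avail:-0}"
  # 2048 is a placeholder; substitute the SIZE column that "ollama list" shows.
  if fits_in_ram 2048 "$avail"; then
    echo "A 2048 MiB model should fit in ${avail} MiB"
  else
    echo "A 2048 MiB model may force swapping with ${avail} MiB available"
  fi
fi
```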

Step 4: Use it from your network (optional)

Ollama exposes an HTTP API on port 11434, but by default it listens only on localhost (127.0.0.1). To reach it from other machines, bind it to all interfaces. Edit the service override:

sudo systemctl edit ollama.service

Add these lines, then save:

[Service]
Environment="OLLAMA_HOST=0.0.0.0"

Restart the service, then check that the API responds (this endpoint lists your installed models):

sudo systemctl restart ollama
curl http://localhost:11434/api/tags
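
The curl check above talks to localhost, so it would also succeed without the override. To see which address the service is really bound to, you could inspect the listening socket. This is a minimal sketch: classify_bind is a hypothetical helper, and it assumes the ss tool from iproute2 is installed.

```shell
# Hypothetical helper: classify a listening address as printed in the
# "Local Address:Port" column of "ss -ltn".
classify_bind() {
  case "$1" in
    127.0.0.1:*|"[::1]":*) echo "loopback-only" ;;
    *) echo "network-reachable" ;;
  esac
}

if command -v ss >/dev/null 2>&1; then
  # Column 4 of "ss -ltn" holds the local address and port.
  ss -ltn | awk '$4 ~ /:11434$/ {print $4}' | while read -r addr; do
    echo "$addr -> $(classify_bind "$addr")"
  done
fi
```

A loopback-only result after the restart suggests the override was not saved or not picked up.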

Now any device on your LAN can use the Pi as a private AI endpoint: point a chat front-end like Open WebUI at http://<your-pi-ip>:11434. Ollama’s API has no built-in authentication, so only do this on a trusted home network and don’t expose port 11434 to the internet.
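
You can also call the API directly from another machine. The sketch below posts to Ollama’s /api/generate endpoint. PI_HOST, build_payload and generate are placeholder names for this example, and the IP address shown is an assumption you should replace with your Pi’s address.

```shell
# Sketch: call the Pi's Ollama API from another machine on the LAN.
# PI_HOST is a placeholder; replace it with your Pi's IP address or hostname.
PI_HOST="${PI_HOST:-192.168.1.50}"

# Build the JSON body for /api/generate. "stream": false asks for one JSON
# reply instead of a token stream. Plain printf does no JSON escaping, so keep
# double quotes and backslashes out of the prompt.
build_payload() {
  printf '{"model":"%s","prompt":"%s","stream":false}' "$1" "$2"
}

# Send one prompt; a generous timeout allows for slow CPU inference.
generate() {
  curl -fsS --max-time 300 "http://${PI_HOST}:11434/api/generate" \
    -d "$(build_payload "$1" "$2")"
}
```

For example, generate llama3.2:3b "Why is the sky blue?" returns a JSON object whose response field holds the answer.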

Getting the best performance

Use an NVMe SSD, not a microSD card — model load times and stability improve noticeably.
Keep it cool. Sustained inference pins the CPU; without active cooling the Pi will thermal-throttle. Our overclocking guide covers cooling and safe clock tuning.
Match the model to the RAM. If the Pi swaps to disk, responses crawl. Smaller model, faster replies.
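Lower the context length if you only need short answers; a smaller context window needs less memory. In the interactive prompt, for example, /set parameter num_ctx 2048 sets it to 2,048 tokens.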
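
To see whether heat or swapping is holding the Pi back, you could run a quick health check while a model is answering. This is a sketch under a few assumptions: decode_throttled is a hypothetical helper that reads only three of the flag bits reported by vcgencmd get_throttled, and the vcgencmd tool is assumed to be present, as it is on Raspberry Pi OS.

```shell
# Hypothetical helper: decode a few bits of the value printed by
# "vcgencmd get_throttled" (for example 0x0 or 0x50005).
decode_throttled() {
  flags=$(( $1 ))
  [ $(( flags & 0x1 )) -ne 0 ] && echo "under-voltage now"
  [ $(( flags & 0x4 )) -ne 0 ] && echo "throttled now"
  [ $(( flags & 0x40000 )) -ne 0 ] && echo "throttling has occurred since boot"
  [ "$flags" -eq 0 ] && echo "no throttling flags set"
  return 0
}

if command -v vcgencmd >/dev/null 2>&1; then
  vcgencmd measure_temp                  # SoC temperature
  raw="$(vcgencmd get_throttled)"        # prints throttled=0x...
  decode_throttled "${raw#throttled=}"
fi

# Swap in use, from /proc/meminfo (values are in kB).
if [ -r /proc/meminfo ]; then
  awk '/^SwapTotal:/ {t=$2} /^SwapFree:/ {f=$2}
       END {printf "Swap in use: %d MiB\n", (t - f) / 1024}' /proc/meminfo
fi
```

Swap use that keeps growing during a reply is a hint to pick a smaller model or a shorter context.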

What’s next?

A Pi running an LLM pairs naturally with the rest of a home lab:

Install Pi-hole ad blocker — network-wide ad blocking on the same box.
Build a Raspberry Pi NAS — store your model files and data.
Best hardware to run LLMs locally — when you outgrow the Pi and want to run bigger models on a GPU or Mac.

Running an LLM on an $80 computer won’t replace a cloud model, but it’s a genuinely private, offline assistant you fully own, and a great way to learn how local AI works before investing in bigger hardware.
