How To Deploy a Local AI via Docker
If you’re tired of worrying about your AI queries, or the data you share within them, being used either to train large language models (LLMs) or to build a profile of you, there are always local AI options. I’ve actually reached the point where the only AI I use is local. For me, it’s not just about the privacy and security, but also the toll AI takes on energy grids and the environment. If I can do my part to prevent an all-out collapse, you bet I’m going to do it.
Most often, I deploy local AI directly on my machine. There are, however, instances where I want to quickly deploy a local AI to a remote server, either within my LAN or beyond it. When that need arises, I have two choices:
- Install a local AI service in the same way I install it on my desktop.
- Containerize it.
The benefit of containerizing is that the local AI is sandboxed from the rest of the system, giving me even more privacy. And if I ever want to stop it, I can do so with a single Docker command.
I would go so far as to say that containerizing your local AI is the fastest and easiest way to get it up and running.
Thanks to Docker.
That’s right, we’re going to deploy a local AI service, Ollama, as a Docker container.
Let me show you how this is done.
What You Need
First off, you need an operating system that supports Docker, which can be Linux, macOS or Windows. You’ll also need enough disk space to pull whatever LLM you want to use. Finally, you’ll need a user with admin privileges and a network connection. I’m going to demonstrate this on Ubuntu Server 24.04.
Install Docker
The first thing we have to do is install Docker. Here’s how.
First, you’ll need to add the official Docker GPG key with the commands:
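On Ubuntu, Docker’s documented steps look something like this (the keyring path may differ on other distributions):

```bash
# Install prerequisites and create the keyring directory
sudo apt-get update
sudo apt-get install -y ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings

# Download Docker's official GPG key and make it world-readable
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc
```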
Next, add the required Docker repository with the command:
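Something like this (again, per Docker’s Ubuntu instructions) adds the stable repository for your release:

```bash
# Add the Docker repository, pinned to this machine's architecture and release codename
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

# Refresh the package index so apt sees the new repository
sudo apt-get update
```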
Install the required software with the following command:
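```bash
# Docker engine, CLI and the commonly bundled plugins
sudo apt-get install -y docker-ce docker-ce-cli containerd.io \
  docker-buildx-plugin docker-compose-plugin
```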
To run the docker command as a standard user (that is, without sudo privileges), you’ll need to add that user to the docker group. Do that with:
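```bash
# $USER expands to the currently logged-in user
sudo usermod -aG docker $USER
```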
Log out and log back in so the changes take effect.
Deploying a Local AI With Docker
There are three ways to deploy the local AI with Docker, depending on your hardware: CPU only, with an NVIDIA GPU or with an AMD GPU.
Without a GPU
The first method of deployment is for a machine without a supported GPU, which means the local AI will run entirely on the CPU. For that, the command (following the published usage of the official ollama/ollama image) is:
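```bash
# Run Ollama detached, persisting models in the "ollama" volume
# and exposing the API on port 11434
docker run -d \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama \
  ollama/ollama
```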
That’s the easy method.
With an NVIDIA GPU
If you have an NVIDIA GPU on your machine, there are several steps you must take.
The first thing you must do is add the necessary repository for the NVIDIA Container Toolkit with the following commands:
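Per NVIDIA’s documented setup, that should look something like this:

```bash
# Add NVIDIA's GPG key
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
  sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg

# Add the Container Toolkit repository, signed with that key
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

# Refresh the package index
sudo apt-get update
```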
You can now install the NVIDIA Container Toolkit with:
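```bash
sudo apt-get install -y nvidia-container-toolkit
```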
You’ll then have to configure Docker to use the NVIDIA runtime, which requires the following two commands:
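```bash
# Register the NVIDIA runtime in Docker's daemon config, then restart Docker
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
```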
You can now deploy the Ollama container with the command:
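This is the same as the CPU-only run, but with all GPUs passed through:

```bash
docker run -d --gpus=all \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama \
  ollama/ollama
```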
With an AMD GPU
If you have an AMD GPU, the command is:
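Ollama publishes a separate ROCm-enabled image for AMD hardware, which needs the GPU device nodes passed through:

```bash
docker run -d \
  --device /dev/kfd --device /dev/dri \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama \
  ollama/ollama:rocm
```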
Accessing the Local AI
With everything up and running, we now have to access the AI prompt. Let’s say you want to pull the Llama 3.2 LLM. You can pull it and access the prompt with the following command:
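```bash
# Open an interactive session in the running container; "ollama run"
# downloads the model on first use, then drops you at the prompt
docker exec -it ollama ollama run llama3.2
```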
The above command will land you at the Ollama prompt, where you can run your first query.
And that’s all there is to deploying a local AI via a Docker container.