Install Ollama AI on Ubuntu Linux to Use LLMs on Your Own Machine

You might think getting an LLM up and running on your own machine would be an insurmountable task, but it's actually been made easy thanks to Ollama.

Jun 15th, 2024 6:00am by Jack Wallen


Feature image by Albert-Paul from Pixabay/Ollama logo.

Artificial intelligence is everywhere, and just about every company on the planet is looking to leverage it in some fashion. For some, just hopping onto the likes of ChatGPT is the solution for such needs. For others, however, there are certain issues at play.

One issue is the privacy of documents. In order for AI to function, it must learn from documents. You or your company might not want to feed those beasts your information because it means that the information could be used for other purposes. For example, I stopped using Google Docs for writing my novels. Why? Because I couldn’t be certain whether Google was using those manuscripts to train their neural networks. Simply put, my work is my work.

There are other issues at play, such as customization, cost control and performance testing. You might think the free version of ChatGPT is enough, but it’s only going to get you so far. So, since Linux has become a viable option for your company, why not bring a version of ChatGPT into the fold? That version is called PrivateGPT, and you can install it on an Ubuntu machine and work with it like you would with the proprietary option.

PrivateGPT is an AI project that allows you to ask questions about your own documents using large language models.

You might think getting this up and running would be an insurmountable task, but it’s actually been made very easy thanks to Ollama, which is an open source project for running LLMs on a local machine. For those who don’t know, an LLM is a large language model used for AI interactions. Ollama has a number of LLMs available, such as:

Llama3: An openly available LLM from Meta
Qwen2: A new series of LLMs from Alibaba
Phi3: Lightweight LLMs from Microsoft
Aya: Multilingual models in 23 languages
Mistral: 7B model from Mistral AI
Gemma: A lightweight LLM from Google DeepMind

You can see the entire library of available LLMs on this page, but know that there are quite a large number of them. What I’m going to do is walk you through the process of installing and using Ollama.

One thing to keep in mind is that this setup does require some hefty hardware. You’re going to need some GPU power; otherwise, Ollama will run in CPU mode, which is incredibly slow. The best hardware to run this on would consist of a modern CPU and an NVIDIA GPU. Make sure to check for NVIDIA drivers with this command:

nvidia-smi -a

If you receive an error and you know you have an NVIDIA GPU, make sure to install the required drivers. On an Ubuntu machine, you can check for available drivers with one of the following two commands, depending on your configuration:

Desktop:

sudo ubuntu-drivers list

Server:

sudo ubuntu-drivers list --gpgpu

You can then install the driver that’s the best match for your system with:

sudo ubuntu-drivers install
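The driver check described above can be wrapped in a small script. This is only a minimal sketch: the function name gpu_status is made up for illustration, and it assumes that a working driver means nvidia-smi is on the PATH and exits successfully.

```shell
#!/bin/sh
# gpu_status is a hypothetical helper, not part of Ollama or the NVIDIA tools.
# It prints "gpu" when nvidia-smi is present and runs cleanly, "cpu" otherwise.
gpu_status() {
    if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi >/dev/null 2>&1; then
        echo "gpu"
    else
        echo "cpu"
    fi
}

case "$(gpu_status)" in
    gpu) echo "NVIDIA driver detected; Ollama should be able to use the GPU." ;;
    cpu) echo "No working NVIDIA driver found; expect Ollama to run in CPU mode." ;;
esac
```

If the script reports CPU mode on a machine that does have an NVIDIA card, the ubuntu-drivers commands above are the next step.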

Again, if you don’t have an NVIDIA GPU, Ollama will still install and run, so you can follow along to see how it works, but know that it’s going to be hair-pullingly slow. Point made. Let’s install.

What You’ll Need

Here are the suggested minimum system requirements:

GPU: NVIDIA Quadro RTX A4000
Microarchitecture: Ampere
Max GPUs: 2
CUDA Cores: 6,144
Tensor Cores: 192
GPU Memory: 16GB GDDR6
FP32 Performance: 19.2 TFLOPS

You’ll also need a user with sudo privileges. Ready? Let’s do it.
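To compare your own hardware against the list above, you can ask nvidia-smi for just the GPU name and total memory. The sketch below assumes the NVIDIA driver is installed; the helper name gpu_summary is illustrative, and it prints a short note instead of failing when no GPU is visible.

```shell
#!/bin/sh
# gpu_summary is a hypothetical helper name.
# Prints one line per GPU, e.g. its name and total memory, or a note if none is found.
gpu_summary() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi --query-gpu=name,memory.total --format=csv,noheader 2>/dev/null \
            || echo "nvidia-smi failed; check the driver"
    else
        echo "nvidia-smi not found"
    fi
}

gpu_summary
```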

Installing Ollama

Installing Ollama is actually quite simple. Open your terminal app and run the following command:

curl -fsSL https://ollama.com/install.sh | sh

If the command reports that curl isn’t installed, you can add it with:

sudo apt-get install curl -y

Or, if you prefer wget, you can download the installer with:

wget https://ollama.com/install.sh

If you use the curl command, the installer starts automatically because the script is piped straight to sh. If you download it with wget, you’ll then have to give the file executable permissions with:

chmod u+x install.sh

You can then run the script with:

./install.sh

You will be prompted to type your sudo password. Once that’s taken care of, Ollama will install.
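Once the installer finishes, a quick sanity check is to confirm that the ollama binary is on your PATH. The following is a small sketch, with check_ollama as a made-up helper name; it relies only on the CLI's --version flag.

```shell
#!/bin/sh
# check_ollama is a hypothetical helper: prints "installed" or "missing".
check_ollama() {
    if command -v ollama >/dev/null 2>&1; then
        echo "installed"
    else
        echo "missing"
    fi
}

if [ "$(check_ollama)" = "installed" ]; then
    # Show which version the installer put in place.
    ollama --version
else
    echo "ollama not found on PATH; try re-running the installer."
fi
```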

Downloading a Model

Next, you’ll need to locate the LLM you want to use. Go to the Ollama library site and select the one you prefer. Let’s say you want to use the Llama3 LLM. Go back to the terminal app and issue this command:

ollama run llama3

Since this is the first time you’ve run the command, it will have to download the model. When the download is complete, you’ll see a new prompt that looks like this:

Send a message (/? for help)

This is your prompt. Type a query such as:

What is Linux?

If you have met the system requirements, you should receive a response fairly quickly. Otherwise, give it time to answer. When you’re done using Ollama, exit out of the prompt with Ctrl+D. You can use any of the listed LLMs that you want. Just remember to run this command:

ollama run LLM_NAME

Replace LLM_NAME with the name of the model you want to use.

You could give anyone access to this machine and show them how to run their own queries via Ollama. Allow them to SSH into the system and they can conveniently access their new, locally installed AI from their desktop.

And that’s all there is to installing and using a private AI on a local Linux machine. If you’re worried about privacy or any other issue that comes along with using public AI, this is a great option.
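If you would rather script a query than sit at the interactive prompt, the Ollama CLI also accepts the question as an argument to ollama run. The sketch below wraps that in a helper; the name ask is hypothetical, and it assumes the model has already been downloaded (for example with ollama pull llama3, which you can confirm with ollama list).

```shell
#!/bin/sh
# ask is a hypothetical helper, not an Ollama command.
# Usage: ask MODEL QUESTION...
ask() {
    if [ "$#" -lt 2 ]; then
        echo "usage: ask MODEL QUESTION" >&2
        return 2
    fi
    if ! command -v ollama >/dev/null 2>&1; then
        echo "ollama is not installed" >&2
        return 1
    fi
    model="$1"
    shift
    # Passing the prompt as an argument prints the answer and returns to the shell.
    ollama run "$model" "$*"
}

# Example (uncomment to run; the first call may take a while on CPU-only machines):
# ask llama3 "What is Linux?"
```

The same one-shot form is what makes SSH access convenient: a remote user could run something like ssh USER@HOST 'ollama run llama3 "What is Linux?"', where USER and HOST are placeholders for your own account and machine.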

Jack Wallen is what happens when a Gen Xer mind-melds with present-day snark. Jack is a seeker of truth and a writer of words with a quantum mechanical pencil and a disjointed beat of sound and soul. Although he resides...