Local LLM: Llama 3 and OpenWebUI

Running Llama 3 with OpenWebUI Locally

Introduction

This guide will help you set up and run Llama 3 with OpenWebUI using Docker. Running large language models (LLMs) locally keeps your prompts and data on your own machine, which improves privacy and security, and with a capable GPU it can deliver excellent performance.

Why Use a GPU?

Using a GPU significantly speeds up LLM inference, which is dominated by large matrix multiplications that GPUs execute far faster than CPUs. A model like Llama 3 70B also needs tens of gigabytes of memory for its weights, so a powerful GPU (or several) is effectively a requirement for usable response speeds.
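
Before you start, you may want to confirm which GPU and how much VRAM you have available; nvidia-smi ships with the NVIDIA driver:

nvidia-smi --query-gpu=name,memory.total --format=csv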

Prerequisites

  • Powerful NVIDIA GPU: Required for running Llama 3 70B efficiently.
  • Linux OS or Windows with WSL: This guide covers both platforms.
  • Docker: Necessary for containerized application deployment.
  • NVIDIA Drivers and CUDA Toolkit: Needed for GPU support in Docker containers.

Installation Guide

1. Install Docker

For Linux:

  1. Update Packages:
     sudo apt-get update
    
  2. Install Docker:
     sudo apt-get install -y docker.io
     sudo systemctl start docker
     sudo systemctl enable docker
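  3. Verify the Installation:
     To confirm Docker is working, you can run the standard hello-world test image:
     sudo docker run hello-world  # Prints "Hello from Docker!" on success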
    

For Windows (WSL):

  1. Set Up WSL 2:
    • Open PowerShell as Administrator:
      wsl --install
      wsl --set-default-version 2
      
    • Install a Linux distribution from the Microsoft Store (e.g., Ubuntu).
  2. Install Docker in WSL:
    • Open your Linux distribution and install Docker:
      sudo apt-get update
      sudo apt-get install -y docker.io
      sudo service docker start
      sudo usermod -aG docker $USER
      newgrp docker
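    • Verify the installation; after the usermod and newgrp steps above, sudo should no longer be required:
      docker run hello-world  # Prints "Hello from Docker!" on success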
      

2. Install NVIDIA Drivers and CUDA Toolkit

For Linux:

  1. Install NVIDIA Drivers:
     sudo apt-get install -y nvidia-driver-530  # Replace 530 with the correct version for your GPU
     sudo reboot  # Restart to apply changes
    
  2. Install CUDA Toolkit:
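     On Ubuntu, one option is the distribution-packaged toolkit (it may lag behind NVIDIA's latest release; for a newer version, use the installer from NVIDIA's CUDA downloads page):
     sudo apt-get install -y nvidia-cuda-toolkit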

For Windows:

  1. Install NVIDIA Drivers: Download and install the latest driver for your GPU from NVIDIA's website. If you are using WSL 2, this Windows driver is all you need; do not install a separate Linux display driver inside WSL.
  2. Install CUDA Toolkit: Only needed if you want the CUDA toolchain itself; download the Windows installer from NVIDIA's CUDA downloads page. For running GPU containers through Docker, the driver alone is sufficient, because the CUDA runtime ships inside the container image.
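  3. Verify the Driver:
     nvidia-smi is installed with the Windows driver and works from both PowerShell and inside WSL:
     nvidia-smi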

3. Install NVIDIA Docker Toolkit

For Linux:

  1. Add NVIDIA Docker Repository (this is the legacy nvidia-docker2 method; NVIDIA's current documentation uses the nvidia-container-toolkit repository instead):
     distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
     curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
     curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
     sudo apt-get update
    
  2. Install NVIDIA Docker Toolkit:
     sudo apt-get install -y nvidia-docker2
     sudo systemctl restart docker
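  3. Verify GPU Support:
     To confirm containers can see the GPU, run a CUDA base image (the tag below is an example; use any tag currently published on Docker Hub):
     docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi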
    

For Windows (WSL):

  1. Set Up NVIDIA Container Toolkit:
    • Open your Linux distribution and run (this assumes NVIDIA's apt repository is already configured, as in the Linux steps above):
      sudo apt-get update
      sudo apt-get install -y nvidia-container-toolkit
      sudo service docker restart
      
  2. Verify GPU Support:
     docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi  # Example tag; the old 11.0-base tag has been removed from Docker Hub
    
    • This command should display your GPU details if configured correctly.

4. Pull and Run OpenWebUI Docker Image

  1. Create a Directory for OpenWebUI:
     mkdir -p ~/open-webui
     cd ~/open-webui
    
  2. Pull the Docker Image:
     docker pull ghcr.io/open-webui/open-webui:main
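  3. Verify the Pull:
     You can confirm the image is now available locally:
     docker images ghcr.io/open-webui/open-webui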
    

Running OpenWebUI with GPU Support

For enhanced performance using an NVIDIA GPU, run:

docker run -d -p 3000:8080 --gpus all --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:cuda
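
Once the container starts, OpenWebUI is available at http://localhost:3000 (host port 3000 maps to the container's port 8080). You can follow startup progress with:

docker logs -f open-webui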

Running Alongside Ollama

If you already have Ollama running on your system, run:

docker run -d --network=host -v open-webui:/app/backend/data -e OLLAMA_BASE_URL=http://127.0.0.1:11434 --name open-webui --restart always ghcr.io/open-webui/open-webui:main
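
With host networking, OpenWebUI listens on the container's own port, so the interface is typically at http://localhost:8080. If you have not yet downloaded a model, you can pull one with the Ollama CLI first (the tag below is an example; check the Ollama model library for available tags):

ollama pull llama3:70b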

Installation for OpenAI API Usage Only

If you only need OpenWebUI for the OpenAI API, use the command below, providing your API key through the OPENAI_API_KEY environment variable:

Note: Set your OpenAI key in an environment variable rather than typing it inline on the command line, where it could be saved to your shell history.
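
For example, you might export the key in your shell first (the value below is a placeholder):

export OPENAI_API_KEY="sk-..."  # Placeholder; consider your shell profile or a secrets manager instead of typing it each time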

docker run -d -p 3000:8080 -e OPENAI_API_KEY=$OPENAI_API_KEY -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main

Installing OpenWebUI with Bundled Ollama Support

For a streamlined setup that includes both OpenWebUI and Ollama, choose the appropriate command based on your hardware setup:

With GPU Support:

Utilize GPU resources by running the following command:

docker run -d -p 3000:8080 --gpus=all -v ollama:/root/.ollama -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:ollama

For CPU Only:

If you don’t have a GPU, use this command instead:

docker run -d -p 3000:8080 -v ollama:/root/.ollama -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:ollama
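
With the bundled image, you can download a model either from OpenWebUI's model management settings or, assuming the ollama binary is on the container's PATH, by exec-ing into the container (the model tag is an example):

docker exec -it open-webui ollama pull llama3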

Conclusion

By following this guide, you can efficiently set up and run Llama 3 with OpenWebUI locally, whether on Linux or on Windows via WSL. Leveraging an NVIDIA GPU gives you the best performance while keeping your data, and therefore your privacy and security, under your own control.