Running Llama 3 with OpenWebUI Locally
Introduction
This guide will help you set up and run Llama 3 with OpenWebUI using Docker. Running large language models (LLMs) locally provides enhanced privacy, security, and performance. It is particularly beneficial for users with powerful GPUs.
Why Use a GPU?
Using a GPU significantly speeds up LLM inference. GPUs handle the large weight matrices and highly parallel computations required by models like Llama 3 70B far more efficiently than CPUs, making them essential for responsive local use of a model this size.
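For a rough sense of scale: a 70-billion-parameter model at 16-bit precision needs about 70 × 10⁹ × 2 bytes ≈ 140 GB of memory for the weights alone, and even 4-bit quantization still needs roughly 35 GB, which is why the 70B variant typically calls for a high-VRAM card or multiple GPUs.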
Prerequisites
- Powerful NVIDIA GPU: Required for running Llama 3 70B efficiently.
- Linux OS or Windows with WSL: This guide covers both platforms.
- Docker: Necessary for containerized application deployment.
- NVIDIA Drivers and CUDA Toolkit: Needed for GPU support in Docker containers.
Installation Guide
1. Install Docker
For Linux:
- Update Packages:
```bash
sudo apt-get update
```
- Install Docker:
```bash
sudo apt-get install -y docker.io
sudo systemctl start docker
sudo systemctl enable docker
```
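To confirm the installation, run a quick sanity check (standard Docker commands):
```bash
docker --version             # Prints the installed Docker version
sudo docker run hello-world  # Pulls and runs Docker's minimal test image
```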
For Windows (WSL):
- Set Up WSL 2. Open PowerShell as Administrator and run:
```powershell
wsl --install
wsl --set-default-version 2
```
- Install a Linux distribution from the Microsoft Store (e.g., Ubuntu).
- Install Docker in WSL. Open your Linux distribution and install Docker:
```bash
sudo apt-get update
sudo apt-get install -y docker.io
sudo service docker start
sudo usermod -aG docker $USER
newgrp docker
```
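After the group change, Docker should work without sudo; a quick test using Docker's standard test image:
```bash
docker run hello-world  # Should succeed without sudo and print a welcome message
```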
2. Install NVIDIA Drivers and CUDA Toolkit
For Linux:
- Install NVIDIA Drivers:
```bash
sudo apt-get install -y nvidia-driver-530  # Replace 530 with the correct version for your GPU
sudo reboot                                # Restart to apply changes
```
- Install CUDA Toolkit: Download and install the CUDA toolkit from the NVIDIA website.
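After the reboot, confirm the driver loaded correctly with the utility that ships with it:
```bash
nvidia-smi  # Should list your GPU, driver version, and supported CUDA version
```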
For Windows:
- Install NVIDIA Drivers: Download and install the latest drivers from the NVIDIA website.
- Install CUDA Toolkit: Download and install the CUDA toolkit from the NVIDIA website.
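Note that the NVIDIA driver is installed on the Windows side only; WSL 2 exposes it to Linux automatically, so you can check it from inside your distribution:
```bash
nvidia-smi  # Run inside WSL; reports the GPU through the Windows driver
```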
3. Install NVIDIA Docker Toolkit
For Linux:
- Add NVIDIA Docker Repository:
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list sudo apt-get update - Install NVIDIA Docker Toolkit:
sudo apt-get install -y nvidia-docker2 sudo systemctl restart docker
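To confirm that Docker can now see the GPU, run a CUDA test container (the same check used in the WSL section below):
```bash
docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi  # Should print GPU details from inside the container
```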
For Windows (WSL):
- Set Up NVIDIA Container Toolkit. Open your Linux distribution and run:
```bash
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
sudo service docker restart
```
- Verify GPU Support:
```bash
docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
```
This command should display your GPU details if configured correctly.
4. Pull and Run OpenWebUI Docker Image
- Create a Directory for OpenWebUI:
```bash
mkdir -p ~/open-webui
cd ~/open-webui
```
- Pull the Docker Image:
```bash
docker pull ghcr.io/open-webui/open-webui:main
```
Running OpenWebUI with GPU Support
For enhanced performance using an NVIDIA GPU, run:
```bash
docker run -d -p 3000:8080 --gpus all \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui --restart always \
  ghcr.io/open-webui/open-webui:cuda
```
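Once the container is up, OpenWebUI should be reachable at http://localhost:3000 (host port 3000 is mapped to the container's port 8080). If it is not, the container logs are the first place to look:
```bash
docker logs -f open-webui  # Follow the OpenWebUI container logs
```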
Running Alongside Ollama
If you already have Ollama running on the system, connect OpenWebUI to it with:
```bash
docker run -d --network=host \
  -v open-webui:/app/backend/data \
  -e OLLAMA_BASE_URL=http://127.0.0.1:11434 \
  --name open-webui --restart always \
  ghcr.io/open-webui/open-webui:main
```
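Note that with --network=host there is no port mapping, so the UI is served on port 8080 rather than 3000. You will also need the model itself pulled into Ollama, using the standard ollama CLI (the bare llama3 tag is the 8B model, so specify the 70B variant explicitly):
```bash
ollama pull llama3:70b  # Downloads the Llama 3 70B weights into your local Ollama
```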
Installation for OpenAI API Usage Only
If you only need OpenWebUI for the OpenAI API, use the command below with your API key exported as the OPENAI_API_KEY environment variable.
Note: Set your OpenAI key as an environment variable rather than typing it directly into the command line, which can save the secret to your shell history.
```bash
docker run -d -p 3000:8080 \
  -e OPENAI_API_KEY=$OPENAI_API_KEY \
  -v open-webui:/app/backend/data \
  --name open-webui --restart always \
  ghcr.io/open-webui/open-webui:main
```
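One way to set the variable without leaving the key in your shell history (standard shell built-ins):
```bash
read -rs OPENAI_API_KEY  # Paste the key; input is not echoed to the terminal
export OPENAI_API_KEY    # Make it available to the docker run command above
```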
Installing OpenWebUI with Bundled Ollama Support
For a streamlined setup that includes both OpenWebUI and Ollama, choose the appropriate command based on your hardware setup:
With GPU Support:
Utilize GPU resources by running the following command:
```bash
docker run -d -p 3000:8080 --gpus=all \
  -v ollama:/root/.ollama \
  -v open-webui:/app/backend/data \
  --name open-webui --restart always \
  ghcr.io/open-webui/open-webui:ollama
```
For CPU Only:
If you don’t have a GPU, use this command instead:
```bash
docker run -d -p 3000:8080 \
  -v ollama:/root/.ollama \
  -v open-webui:/app/backend/data \
  --name open-webui --restart always \
  ghcr.io/open-webui/open-webui:ollama
```
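With either variant, the bundled Ollama runs inside the container, so you can download Llama 3 through it; a minimal sketch, assuming the ollama binary is on the container's PATH:
```bash
docker exec -it open-webui ollama pull llama3  # Use llama3:70b for the 70B model
```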
Conclusion
By following this guide, you can efficiently set up and run Llama 3 with OpenWebUI locally on both Linux and Windows using WSL. Leveraging your NVIDIA GPU will provide optimal performance and ensure you maintain control over your data privacy and security.