Deployment Overview of gpt-oss on Server
Prerequisites and Basic Requirements
The deployment requires a Linux server running Ubuntu with root privileges. The system must have Docker installed and configured to support GPU acceleration for the AI model. The following components are required:
- Operating System: Ubuntu
- Privileges: Root access (sudo)
- Network: Internet access for downloading models and certificates
- Hardware: GPU support for CUDA acceleration
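The requirements above can be checked before deployment with a short preflight script. This is a minimal sketch, not the provider's own tooling: the helper names `check_cmd` and `os_id` are hypothetical, and using `nvidia-smi` as the GPU indicator assumes an NVIDIA driver is installed.

```shell
#!/bin/sh
# Preflight sketch: report whether each prerequisite appears to be met.
# The specific checks are assumptions, not an official checklist.

# Print "ok: <name>" if the command is on PATH, otherwise "missing: <name>".
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok: $1"
    else
        echo "missing: $1"
    fi
}

# Print the distribution ID from an os-release style file
# (default: /etc/os-release), or "unknown" if it cannot be read.
os_id() {
    file="${1:-/etc/os-release}"
    [ -r "$file" ] || { echo "unknown"; return; }
    # Source the file in a subshell so its variables do not leak out.
    ( . "$file" 2>/dev/null; echo "${ID:-unknown}" )
}

echo "os: $(os_id)"                 # "ubuntu" is expected on a matching server
[ "$(id -u)" -eq 0 ] && echo "ok: root" || echo "missing: root"
check_cmd docker
check_cmd nvidia-smi                # assumed indicator of a usable GPU driver
```

The script only reports; it does not install anything or exit with an error, so it is safe to run on any host.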
FQDN of the Final Panel
The application is accessible via the following Fully Qualified Domain Name (FQDN) format:
gpt-oss<Server ID>.hostkey.in:443
Replace <Server ID> with the specific identifier assigned to the server instance.
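As an illustration of the substitution, the address can be assembled from a server ID in a script. The function names and the ID `12345` are made up for this sketch, and the `https://` scheme is inferred from port 443.

```shell
#!/bin/sh
# Build the panel host name from a server ID supplied by the caller.
panel_fqdn() {
    printf 'gpt-oss%s.hostkey.in\n' "$1"
}

# Build the full HTTPS address on port 443.
panel_url() {
    printf 'https://%s:443\n' "$(panel_fqdn "$1")"
}

panel_url 12345    # prints https://gpt-oss12345.hostkey.in:443
```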
File and Directory Structure
The deployment utilizes the following directory structure for configuration, data, and certificates:
- /root/nginx/: Contains the Docker Compose configuration for the proxy and SSL management.
- /data/nginx/user_conf.d/: Stores custom Nginx configuration files for the specific domain.
- /data/nginx/nginx-certbot.env: Environment file for the Nginx-Certbot service.
- /etc/systemd/system/ollama.service: Systemd service file for the Ollama backend.
- /usr/share/ollama/.ollama/models/: Storage location for the downloaded AI models.
- /var/lib/docker/volumes/open-webui/: Persistent storage volume for the Open WebUI application data.
Application Installation Process
The application consists of a backend AI engine (Ollama) and a frontend interface (Open WebUI). The installation involves the following steps:
- Install Ollama: The Ollama package is installed using the official installation script.
- Configure Ollama Service: The ollama.service file is modified to set the following environment variables:
    - OLLAMA_HOST=0.0.0.0
    - OLLAMA_ORIGINS=*
    - LLAMA_FLASH_ATTENTION=1
- Download Model: The gpt-oss:20b model is pulled and stored locally.
- Deploy Open WebUI: The frontend is deployed as a Docker container using the ghcr.io/open-webui/open-webui:cuda image.
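The steps above can be sketched as a shell script. Two points are assumptions rather than the documented procedure: the variables are written to a systemd drop-in (`ollama.service.d/override.conf`) instead of editing `ollama.service` in place, and the `run` and `install_stack` helpers are hypothetical names. The variable names are copied as listed. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Installation sketch. DRY_RUN=1 (the default here) only prints each command;
# set DRY_RUN=0 on a real server to execute them as root.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Emit a systemd drop-in carrying the three variables listed above.
ollama_dropin() {
    cat <<'EOF'
[Service]
Environment="OLLAMA_HOST=0.0.0.0"
Environment="OLLAMA_ORIGINS=*"
Environment="LLAMA_FLASH_ATTENTION=1"
EOF
}

install_stack() {
    # 1. Official Ollama installation script.
    run sh -c 'curl -fsSL https://ollama.com/install.sh | sh'

    # 2. Service environment (the drop-in location is an assumption).
    run mkdir -p /etc/systemd/system/ollama.service.d
    if [ "$DRY_RUN" = 1 ]; then
        ollama_dropin
    else
        ollama_dropin > /etc/systemd/system/ollama.service.d/override.conf
    fi
    run systemctl daemon-reload
    run systemctl restart ollama

    # 3. Model download.
    run ollama pull gpt-oss:20b

    # 4. Open WebUI image.
    run docker pull ghcr.io/open-webui/open-webui:cuda
}

install_stack
```

A drop-in keeps local settings separate from the unit file that the installer writes, so a later reinstall is less likely to overwrite them.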
Docker Containers and Their Deployment
Two primary Docker containers are deployed to run the application stack:
- Open WebUI Container:
    - Image: ghcr.io/open-webui/open-webui:cuda
    - Name: open-webui
    - Ports: Maps host port 8080 to container port 8080.
    - Environment Variables:
        - ENV=dev
        - OLLAMA_BASE_URLS=http://host.docker.internal:11434
    - Volumes: Mounts the open-webui volume to /app/backend/data.
    - GPU: Configured with --gpus all for CUDA support.
    - Restart Policy: Set to always.
- Nginx-Certbot Container:
    - Image: jonasal/nginx-certbot:latest
    - Network Mode: Host
    - Volumes:
        - nginx_secrets mounted to /etc/letsencrypt.
        - /data/nginx/user_conf.d mounted to /etc/nginx/user_conf.d.
    - Environment: Sets a contact email address for Certbot and loads additional variables from /data/nginx/nginx-certbot.env.
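The two container definitions above roughly correspond to the following sketch. Several details are assumptions: the `--add-host` flag (commonly needed on Linux so that `host.docker.internal` resolves), the Compose service name `nginx`, the `CERTBOT_EMAIL` variable name, and the placeholder address `admin@example.com`. The actual files on the server may differ.

```shell
#!/bin/sh
# DRY_RUN=1 (the default here) prints the docker command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Open WebUI container with the settings listed above.
start_open_webui() {
    run docker run -d \
        --name open-webui \
        -p 8080:8080 \
        -e ENV=dev \
        -e OLLAMA_BASE_URLS=http://host.docker.internal:11434 \
        --add-host=host.docker.internal:host-gateway \
        -v open-webui:/app/backend/data \
        --gpus all \
        --restart always \
        ghcr.io/open-webui/open-webui:cuda
}

# Compose file for the proxy, as it might appear under /root/nginx/.
nginx_compose() {
    cat <<'EOF'
services:
  nginx:                                   # service name is an assumption
    image: jonasal/nginx-certbot:latest
    network_mode: host
    environment:
      - CERTBOT_EMAIL=admin@example.com    # placeholder address
    env_file:
      - /data/nginx/nginx-certbot.env
    volumes:
      - nginx_secrets:/etc/letsencrypt
      - /data/nginx/user_conf.d:/etc/nginx/user_conf.d

volumes:
  nginx_secrets:
EOF
}

start_open_webui
nginx_compose
```

With host networking the proxy reaches Open WebUI directly on 127.0.0.1:8080, so no port mapping is declared for the Nginx container.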
Proxy Servers
The application uses Nginx as a reverse proxy with SSL termination managed by Certbot.
- Proxy Configuration: The Nginx configuration file located at /data/nginx/user_conf.d/gpt-oss<Server ID>.hostkey.in.conf directs traffic to the internal application.
- Proxy Pass: Traffic is forwarded from the external port to the internal service using the rule proxy_pass http://127.0.0.1:8080;
- SSL: Managed automatically by the nginx-certbot container to ensure HTTPS connectivity on port 443.
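For illustration, a server block of the kind described above could be generated as follows. Only the `proxy_pass` rule, the host name pattern, and port 443 come from this document. The certificate paths follow the usual Let's Encrypt layout, and the forwarded headers and WebSocket settings are assumptions that a typical chat UI behind Nginx tends to need. `render_server_block` is a hypothetical helper name and `12345` is a made-up server ID.

```shell
#!/bin/sh
# Print an Nginx server block for the given server ID.
render_server_block() {
    fqdn="gpt-oss$1.hostkey.in"
    cat <<EOF
server {
    listen 443 ssl;
    server_name ${fqdn};

    # Certificate paths assume the standard layout under /etc/letsencrypt.
    ssl_certificate     /etc/letsencrypt/live/${fqdn}/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/${fqdn}/privkey.pem;

    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host \$host;
        proxy_set_header X-Forwarded-Proto \$scheme;

        # WebSocket upgrade headers (assumed, not taken from the real file).
        proxy_http_version 1.1;
        proxy_set_header Upgrade \$http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
EOF
}

render_server_block 12345
```

Redirecting the output to a file in /data/nginx/user_conf.d/ would make it visible to the container at /etc/nginx/user_conf.d/.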
Available Ports for Connection
The following ports are configured for the application:
- Port 443: External HTTPS access via the Nginx proxy.
- Port 8080: Internal HTTP access for the Open WebUI container.
- Port 11434: Internal access for the Ollama service (accessible via host.docker.internal from within the container).
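A quick way to confirm that each port answers is to probe it with curl from the server itself. This is a sketch under assumptions: `probe` and `check_ports` are hypothetical helper names, and `/api/tags` is used as a lightweight Ollama endpoint that lists the local models.

```shell
#!/bin/sh
# Print "up" if the URL answers with a successful HTTP status within
# five seconds, otherwise "down".
probe() {
    if curl -fsS -o /dev/null --max-time 5 "$1" 2>/dev/null; then
        echo "up"
    else
        echo "down"
    fi
}

# Probe all three ports; the argument is the server ID used in the FQDN.
check_ports() {
    echo "ollama (11434):    $(probe http://127.0.0.1:11434/api/tags)"
    echo "open-webui (8080): $(probe http://127.0.0.1:8080/)"
    echo "proxy (443):       $(probe "https://gpt-oss$1.hostkey.in/")"
}
```

Calling `check_ports` with the server ID prints one line per port. A "down" result on 443 while 8080 is "up" would point at the proxy or the certificate rather than at Open WebUI.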
Starting, Stopping, and Updating
Service management is handled through Docker and Systemd commands:
- Open WebUI Container:
    - Start: docker start open-webui
    - Stop: docker stop open-webui
    - Restart: docker restart open-webui
    - Update: Pull the latest image and recreate the container.
- Ollama Service:
    - Start: systemctl start ollama
    - Stop: systemctl stop ollama
    - Restart: systemctl restart ollama
    - Enable on Boot: systemctl enable ollama
- Nginx Proxy:
    - Start/Update: docker compose up -d, executed from the /root/nginx directory.
    - Stop: docker compose down, executed from the /root/nginx directory.
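The Open WebUI update step ("pull the latest image and recreate the container") can be spelled out as below. This is a sketch that reuses the container settings listed under Docker Containers. The `run` and `update_open_webui` helpers are hypothetical, and the `--add-host` flag is an assumption. By default the commands are only printed.

```shell
#!/bin/sh
# DRY_RUN=1 (the default here) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"
IMAGE="ghcr.io/open-webui/open-webui:cuda"

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

update_open_webui() {
    run docker pull "$IMAGE"          # fetch the newer image first
    run docker stop open-webui        # stop the running container
    run docker rm open-webui          # remove it; the named volume is kept
    run docker run -d \
        --name open-webui \
        -p 8080:8080 \
        -e ENV=dev \
        -e OLLAMA_BASE_URLS=http://host.docker.internal:11434 \
        --add-host=host.docker.internal:host-gateway \
        -v open-webui:/app/backend/data \
        --gpus all \
        --restart always \
        "$IMAGE"
}

update_open_webui
```

Because application data lives in the open-webui named volume, removing and recreating the container should not discard it.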