Deployment Overview of gpt-oss-20b on Server

Prerequisites and Basic Requirements

The deployment requires a Linux server running Ubuntu. The following conditions must be met before initiating the installation:

  • Operating System: Ubuntu. The configuration does not pin a specific release; packages are installed from the standard Ubuntu repositories.

  • Privileges: Root access or sudo privileges are required to install system packages, manage systemd services, and run Docker containers.

  • Domain Configuration: The server's hostname under the hostkey.in domain must resolve in DNS to this server.

  • Ports:

    • Port 8080: Internal communication between the proxy and the application.

    • Port 443: External HTTPS access for the web interface.

    • Port 11434: Internal Ollama API service.

FQDN of the Final Panel

The application is accessible via the following Fully Qualified Domain Name (FQDN) format:

gpt-oss<Server ID>.hostkey.in:443

Replace <Server ID> with the specific identifier assigned to the server instance. The application is served over HTTPS on port 443.
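
The naming pattern above can be expressed as a small helper. This is a minimal sketch: the function names panel_fqdn and panel_url are hypothetical, and the ID 12345 in the usage line is a placeholder, not a real server.

```shell
#!/bin/sh
# Build the panel FQDN from a server ID, following the pattern
# gpt-oss<Server ID>.hostkey.in described above.
panel_fqdn() {
    server_id=$1
    if [ -z "$server_id" ]; then
        echo "usage: panel_fqdn <server-id>" >&2
        return 1
    fi
    printf 'gpt-oss%s.hostkey.in\n' "$server_id"
}

# Build the HTTPS URL of the web interface.
# Port 443 is the HTTPS default, so it can be left out of the URL.
panel_url() {
    fqdn=$(panel_fqdn "$1") || return 1
    printf 'https://%s/\n' "$fqdn"
}
```

For example, `panel_url 12345` prints the address to open in a browser for a server whose ID is 12345.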

File and Directory Structure

The deployment utilizes the following directory structure for configuration, data, and certificates:

  • /root/nginx/: Contains the Docker Compose configuration for the proxy server.

    • /root/nginx/compose.yml: Docker Compose file for the Nginx and Certbot setup.

  • /data/nginx/: Stores persistent data for the proxy.

    • /data/nginx/user_conf.d/: Mount point for custom Nginx server configurations, specifically gpt-oss<Server ID>.hostkey.in.conf.

    • /data/nginx/nginx-certbot.env: Environment variables for the Certbot service.

  • /etc/systemd/system/: Contains the service unit file for Ollama.

    • ollama.service: The active service configuration.

    • ollama.service.bak: Backup of the original service file.

  • /usr/share/ollama/.ollama/models/: Storage location for the gpt-oss:20b model.

  • /var/lib/docker/volumes/open-webui/_data: Persistent storage volume for the Open WebUI application.

Application Installation Process

The application stack consists of three main components: Ollama, Open WebUI, and an Nginx proxy with Certbot.

Ollama Installation

  1. The Ollama package is installed using the official installation script.

  2. A system user named ollama is created.

  3. The ollama.service systemd unit is modified to include the following environment variables:

    • OLLAMA_HOST=0.0.0.0

    • OLLAMA_ORIGINS=*

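    • LLAMA_FLASH_ATTENTION=1 (name reproduced as configured; Ollama's documentation calls this setting OLLAMA_FLASH_ATTENTION, so verify that the variable takes effect)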

  4. The gpt-oss:20b model is pulled and stored locally.
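
One way to apply the environment variables from step 3 is a systemd drop-in file rather than an edit of ollama.service itself. The sketch below uses that swapped-in approach, so it is not necessarily how this deployment modifies the unit. The function names and the drop-in path are assumptions; the variable names are copied from the list above.

```shell
#!/bin/sh
# Print the Environment= lines for the three variables listed above.
# LLAMA_FLASH_ATTENTION is reproduced exactly as named in the text.
ollama_env_lines() {
    for kv in "OLLAMA_HOST=0.0.0.0" "OLLAMA_ORIGINS=*" "LLAMA_FLASH_ATTENTION=1"; do
        printf 'Environment="%s"\n' "$kv"
    done
}

# Write a drop-in named override.conf into the given directory.
# The default path is the conventional drop-in location for a unit
# called ollama.service (an assumption; the deployment edits the unit directly).
write_ollama_dropin() {
    dropin_dir=${1:-/etc/systemd/system/ollama.service.d}
    mkdir -p "$dropin_dir" || return 1
    {
        echo '[Service]'
        ollama_env_lines
    } > "$dropin_dir/override.conf"
}
```

After `write_ollama_dropin` has run as root, `systemctl daemon-reload` followed by `systemctl restart ollama` would load the new settings, and `ollama pull gpt-oss:20b` corresponds to step 4.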

Open WebUI Deployment

The Open WebUI interface is deployed as a Docker container with the following specifications:

  • Image: ghcr.io/open-webui/open-webui:cuda

  • Container Name: open-webui

  • Port Mapping: Host port 8080 maps to container port 8080.

  • GPU Support: The container is launched with --gpus all to utilize available GPU resources.

  • Environment Variables:

    • ENV=dev

    • OLLAMA_BASE_URLS=http://host.docker.internal:11434

  • Volume: A named volume open-webui is mounted to /app/backend/data for data persistence.

  • Restart Policy: Set to always.

Proxy Deployment

The Nginx proxy is deployed using Docker Compose to handle SSL termination and routing.

Docker Containers and Their Deployment

Two primary Docker components are managed:

  1. Open WebUI Container:

    • Launched via a direct docker run command.

    • Command structure:

      docker run -d -p 8080:8080 --gpus all \
        --add-host=host.docker.internal:host-gateway \
        -v open-webui:/app/backend/data \
        --name open-webui \
        -e ENV='dev' \
        -e OLLAMA_BASE_URLS='http://host.docker.internal:11434' \
        --restart always ghcr.io/open-webui/open-webui:cuda
      

  2. Nginx and Certbot Containers:

    • Managed via Docker Compose located at /root/nginx/compose.yml.

    • Services defined:

      • nginx:

        • Image: jonasal/nginx-certbot:latest.

        • Network mode: host.

        • Volumes mounted:

          • nginx_secrets (external) to /etc/letsencrypt.

          • /data/nginx/user_conf.d to /etc/nginx/user_conf.d.

    • Deployment command (run from the /root/nginx directory):

      docker compose up -d
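
The Compose settings listed for the nginx service could look roughly like the file generated below. This is a sketch reconstructed from the description, not the deployed compose.yml: the function name write_nginx_compose is hypothetical, and wiring /data/nginx/nginx-certbot.env in through env_file is an assumption.

```shell
#!/bin/sh
# Write a compose.yml matching the service description above.
# The target directory defaults to /root/nginx.
write_nginx_compose() {
    compose_dir=${1:-/root/nginx}
    mkdir -p "$compose_dir" || return 1
    cat > "$compose_dir/compose.yml" <<'EOF'
services:
  nginx:
    image: jonasal/nginx-certbot:latest
    network_mode: host
    # Assumption: the Certbot settings file is loaded via env_file.
    env_file:
      - /data/nginx/nginx-certbot.env
    volumes:
      - nginx_secrets:/etc/letsencrypt
      - /data/nginx/user_conf.d:/etc/nginx/user_conf.d

volumes:
  nginx_secrets:
    external: true
EOF
}
```

Because nginx_secrets is declared as external, Compose does not create it; it has to exist beforehand, for example via `docker volume create nginx_secrets`.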

Proxy Servers

The Nginx proxy server handles external traffic and SSL certificate management.

  • Image: jonasal/nginx-certbot:latest

  • Configuration:

    • Custom server blocks are stored in /data/nginx/user_conf.d/.

    • The specific configuration file for this application is named gpt-oss<Server ID>.hostkey.in.conf.

    • The proxy_pass directive is configured to forward requests to http://127.0.0.1:8080.

  • SSL/Certbot:

    • Certbot is integrated to manage Let's Encrypt certificates.

    • Email for notifications: [email protected].

    • Certificates are stored in the nginx_secrets volume mounted at /etc/letsencrypt.

  • Routing:

    • External requests to gpt-oss<Server ID>.hostkey.in on port 443 are routed to the internal Open WebUI service on port 8080.
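
A server block that implements this routing might resemble the output of the helper below. It is a sketch under assumptions: render_proxy_conf is a hypothetical name, the certificate name is assumed to equal the FQDN, and the forwarding and WebSocket headers are common choices for a proxied web application rather than settings confirmed by this deployment.

```shell
#!/bin/sh
# Print an Nginx server block that terminates TLS on port 443 and
# forwards requests to Open WebUI on 127.0.0.1:8080.
render_proxy_conf() {
    fqdn=$1
    if [ -z "$fqdn" ]; then
        echo "usage: render_proxy_conf <fqdn>" >&2
        return 1
    fi
    # Unquoted heredoc: ${fqdn} expands, \$name stays a literal Nginx variable.
    cat <<EOF
server {
    listen 443 ssl;
    server_name ${fqdn};

    # Assumption: the certificate is stored under the FQDN as its name.
    ssl_certificate     /etc/letsencrypt/live/${fqdn}/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/${fqdn}/privkey.pem;

    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host \$host;
        proxy_set_header X-Forwarded-For \$proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto \$scheme;

        # Assumption: WebSocket upgrade headers for the chat interface.
        proxy_http_version 1.1;
        proxy_set_header Upgrade \$http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
EOF
}
```

Redirecting the output to /data/nginx/user_conf.d/ under the file name given above, with the real FQDN as the argument, would place the block where the proxy container reads it.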

Databases

The Open WebUI application utilizes a local storage volume for its data persistence.

  • Storage Type: Docker Volume.

  • Volume Name: open-webui.

  • Mount Point: /app/backend/data inside the container.

  • Host Location: /var/lib/docker/volumes/open-webui/_data.

  • Connection Method: The application accesses the data directly via the mounted volume; no external database connection string is required.

Available Ports for Connection

The following ports are utilized by the deployment:

Port     Protocol   Description
8080     TCP        Internal Open WebUI service (reached through the proxy).
11434    TCP        Internal Ollama API service.
443      TCP        External HTTPS access for the web interface.
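
To confirm that all three ports have a listener, the output of `ss -ltnH` can be checked against the list above. The helper below is a sketch; report_ports is a hypothetical name, and it assumes the usual `ss` column layout in which the fourth field is the local address and port.

```shell
#!/bin/sh
# Read `ss -ltnH` output on stdin and print one status line per expected port.
# Returns non-zero if any of the ports 443, 8080, 11434 has no listener.
report_ports() {
    awk -v ports="443 8080 11434" '
        {
            # Field 4 is "address:port"; the port is the last ":"-separated part,
            # which also works for IPv6 forms such as [::]:443.
            n = split($4, parts, ":")
            seen[parts[n]] = 1
        }
        END {
            count = split(ports, want, " ")
            missing = 0
            for (i = 1; i <= count; i++) {
                if (want[i] in seen) {
                    print want[i] " listening"
                } else {
                    print want[i] " NOT listening"
                    missing = 1
                }
            }
            exit missing
        }'
}
```

A typical call is `ss -ltnH | report_ports`. Note that OLLAMA_HOST=0.0.0.0 and the 8080:8080 port mapping bind on all interfaces, so whether ports 11434 and 8080 are reachable from outside depends on firewall rules that this document does not describe.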

Starting, Stopping, and Updating

Ollama Service

Managed via systemd:

  • Start: systemctl start ollama

  • Stop: systemctl stop ollama

  • Restart: systemctl restart ollama

  • Enable at boot: systemctl enable ollama

  • Reload Daemon: systemctl daemon-reload (required after modifying ollama.service, before restarting the service).
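
After a restart, it can be useful to confirm that the API answers and that the model is present. The sketch below checks the JSON returned by Ollama's /api/tags endpoint, which lists local models; has_model is a hypothetical helper name, and the plain-text match is a simplification that avoids a JSON parser.

```shell
#!/bin/sh
# Read an /api/tags JSON response on stdin and succeed if it lists the model.
# Whitespace is stripped first so compact and pretty-printed JSON both match.
has_model() {
    model=$1
    tr -d ' \n\t' | grep -qF "\"name\":\"$model\""
}
```

On the server, `curl -fsS http://127.0.0.1:11434/api/tags | has_model gpt-oss:20b` returns success when the service is up and the model has been pulled.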

Open WebUI Container

Managed via Docker:

  • Start: docker start open-webui

  • Stop: docker stop open-webui

  • Restart: docker restart open-webui

  • Update: Pull the current image for the cuda tag and recreate the container (application data persists in the open-webui volume):

    docker pull ghcr.io/open-webui/open-webui:cuda
    docker rm -f open-webui
    docker run -d -p 8080:8080 --gpus all \
      --add-host=host.docker.internal:host-gateway \
      -v open-webui:/app/backend/data \
      --name open-webui \
      -e ENV='dev' \
      -e OLLAMA_BASE_URLS='http://host.docker.internal:11434' \
      --restart always ghcr.io/open-webui/open-webui:cuda
    

Nginx Proxy

Managed via Docker Compose in /root/nginx:

  • Start/Restart: docker compose up -d

  • Stop: docker compose down

  • Update: Modify /root/nginx/compose.yml or environment files, then run docker compose up -d.
