Deployment Overview of Gemma-3-27B on Server

Prerequisites and Basic Requirements

  • Operating System: Ubuntu Linux

  • Privileges: Root access or sudo privileges are required for system modifications, Docker installation, and service management.

  • Hardware: System must be equipped with NVIDIA GPUs to support the nvidia-container-runtime.

  • Network: Internet connectivity is required for pulling Docker images and downloading the Gemma3 model.

FQDN of the Final Panel

The application is accessible via the following Fully Qualified Domain Name (FQDN) on the hostkey.in domain:

  • Address: gemma<Server ID>.hostkey.in:443

  • Replace <Server ID> with the specific identifier assigned to your instance.

File and Directory Structure

The following directories and files contain critical configurations and data:

  • /root/nginx/: Directory containing the Nginx proxy and Certbot configuration files.

  • /root/nginx/compose.yml: Docker Compose file for the Nginx/Certbot stack.

  • /etc/systemd/system/ollama.service: Systemd service unit file for the Ollama backend.

  • /etc/systemd/system/ollama.service.bak: Backup of the original Ollama service file.

  • /etc/docker/daemon.json: Docker daemon configuration file defining the NVIDIA runtime.

  • /data/nginx/: Directory for Nginx data, including SSL certificates and user configurations.

  • /data/nginx/nginx-certbot.env: Environment variables for the Certbot service.

  • /data/nginx/user_conf.d/: Directory for custom Nginx configuration snippets.

  • /etc/letsencrypt/: External volume mount for SSL secrets managed by Nginx-Certbot.

  • /app/backend/data/: Internal volume mount for Open-WebUI persistent data.

Application Installation Process

The installation consists of three main components: the Ollama backend, the NVIDIA GPU runtime, and the Open-WebUI frontend.

Ollama Backend Setup

  1. Install Ollama using the official installation script.

  2. Create a system user named ollama.

  3. Modify the ollama.service file to expose the service on all network interfaces and set environment variables for network accessibility and performance:

    • OLLAMA_HOST=0.0.0.0

    • OLLAMA_ORIGINS=*

    • LLAMA_FLASH_ATTENTION=1

  4. Reload the systemd daemon and restart the Ollama service.

  5. Pull the specific model gemma3:27b using the command ollama pull gemma3:27b.
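Based on the variables listed above, the edited [Service] section of /etc/systemd/system/ollama.service might look like the following sketch. The ExecStart path, User/Group, and Restart lines are assumptions based on the unit file produced by the official install script; only the three Environment lines are confirmed by this document.

```ini
# /etc/systemd/system/ollama.service (excerpt, sketch)
[Service]
ExecStart=/usr/local/bin/ollama serve
User=ollama
Group=ollama
Restart=always
Environment="OLLAMA_HOST=0.0.0.0"
Environment="OLLAMA_ORIGINS=*"
Environment="LLAMA_FLASH_ATTENTION=1"
```

After editing, apply the change with systemctl daemon-reload followed by systemctl restart ollama (step 4).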

NVIDIA Container Toolkit Setup

  1. Install the nvidia-container-toolkit package.

  2. Configure the NVIDIA container runtime for Docker using nvidia-ctk runtime configure --runtime=docker.

  3. Update the /etc/docker/daemon.json file to set nvidia as the default runtime.

  4. Restart the Docker service to apply runtime changes.
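After step 3, /etc/docker/daemon.json typically resembles the sketch below: the runtimes block is what nvidia-ctk runtime configure writes, and default-runtime is the manual addition described above.

```json
{
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "args": [],
      "path": "nvidia-container-runtime"
    }
  }
}
```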

Open-WebUI Frontend Setup

  1. Remove any existing container named open-webui.

  2. Deploy the Open-WebUI container using the ghcr.io/open-webui/open-webui:cuda image.

  3. Configure the container to expose port 8080 and utilize all available GPUs.

  4. Set the environment variable OLLAMA_BASE_URLS to http://host.docker.internal:11434.

  5. Set the ENV variable to dev.
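Steps 1-5 map onto a single docker run invocation. The sketch below assembles that invocation as an argument array and prints it for review rather than executing it; the --restart always and --add-host flags are assumptions inferred from the container description later in this document, so verify the full command against your environment before running it.

```shell
# Sketch: the docker run command implied by steps 1-5, printed for review.
set -eu
args=(docker run -d
  --name open-webui
  --restart always
  --gpus all
  -p 8080:8080
  --add-host=host.docker.internal:host-gateway
  -v open-webui:/app/backend/data
  -e ENV=dev
  -e OLLAMA_BASE_URLS=http://host.docker.internal:11434
  ghcr.io/open-webui/open-webui:cuda)
printf '%s ' "${args[@]}"; echo
```

Remove any existing container first (docker rm -f open-webui, step 1) so the name is free.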

Access Rights and Security

  • Docker User: The ollama user is created as a system user to manage the backend service.

  • Ollama Origins: The OLLAMA_ORIGINS environment variable is set to *, allowing requests from any origin.

  • Nginx/Certbot: The Nginx container runs in host network mode to handle HTTPS traffic directly on the server's network interface.

  • Service Persistence: Both the Ollama service and the Open-WebUI container are configured to restart automatically on failure or system reboot.

Databases

  • Storage: The Open-WebUI application stores its data in a Docker volume named open-webui, which is mapped to /app/backend/data inside the container.

  • Connection: No external database connection is configured; the application uses its internal storage mechanism within the mounted volume.

Docker Containers and Their Deployment

Two distinct Docker deployments are utilized in this architecture:

Open-WebUI Container

  • Image: ghcr.io/open-webui/open-webui:cuda

  • Name: open-webui

  • Ports: Exposes internal port 8080 mapped to the host port 8080.

  • GPU: Configured with --gpus all to utilize NVIDIA hardware acceleration.

  • Volumes: Mounts the open-webui volume to /app/backend/data.

  • Hosts: Adds the DNS entry host.docker.internal pointing to the host gateway.

  • Environment Variables:

    • ENV=dev

    • OLLAMA_BASE_URLS=http://host.docker.internal:11434

Nginx-Certbot Container

  • Image: jonasal/nginx-certbot:latest

  • Deployment Method: Docker Compose located at /root/nginx/compose.yml.

  • Volumes:

    • nginx_secrets (external volume) mapped to /etc/letsencrypt.

    • Host path /data/nginx/user_conf.d mapped to /etc/nginx/user_conf.d.

  • Environment:

    • [email protected]

    • Loads additional variables from /data/nginx/nginx-certbot.env.

  • Network Mode: host
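Put together, a /root/nginx/compose.yml matching this description might look like the following sketch. The service name and restart policy are assumptions; the image, volumes, environment, and network mode follow the list above.

```yaml
# Sketch of /root/nginx/compose.yml
services:
  nginx:
    image: jonasal/nginx-certbot:latest
    restart: unless-stopped
    network_mode: host
    environment:
      - [email protected]
    env_file:
      - /data/nginx/nginx-certbot.env
    volumes:
      - nginx_secrets:/etc/letsencrypt
      - /data/nginx/user_conf.d:/etc/nginx/user_conf.d

volumes:
  nginx_secrets:
    external: true
```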

Proxy Servers

  • Software: Nginx with Certbot for SSL certificate management.

  • Image: jonasal/nginx-certbot:latest

  • Configuration:

  • Manages SSL certificates for the domain gemma<Server ID>.hostkey.in.

  • Routes traffic from the external port 443 (HTTPS) to the internal port 8080.

  • Uses the host network mode, binding directly to the server's network stack.

  • Email: SSL renewal notifications are configured for [email protected].
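A server block implementing this routing would live in /data/nginx/user_conf.d/. The snippet below is a hypothetical sketch: the certificate directory name (live/gemma) and the WebSocket upgrade headers are assumptions based on the jonasal/nginx-certbot conventions and typical Open-WebUI proxying, not values confirmed by this document.

```nginx
# Hypothetical /data/nginx/user_conf.d/gemma.conf
server {
    listen 443 ssl;
    server_name gemma<Server ID>.hostkey.in;

    # Paths managed by nginx-certbot; the "gemma" cert name is an assumption.
    ssl_certificate     /etc/letsencrypt/live/gemma/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/gemma/privkey.pem;

    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
        # WebSocket support for the Open-WebUI chat interface.
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
```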

Permission Settings

  • Nginx Directory: The /root/nginx directory is owned by root:root with 0755 permissions.

  • Compose File: The /root/nginx/compose.yml file is owned by root:root with 0644 permissions.

  • Docker Daemon: The /etc/docker/daemon.json file is owned by root with 0644 permissions.
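These settings can be audited with stat. As a minimal self-contained illustration, the same check applied to a temporary file:

```shell
# Apply a 0644 mode and read it back, as one would audit the files above.
set -eu
f="$(mktemp)"
chmod 0644 "$f"
stat -c '%a' "$f"   # prints: 644
rm -f "$f"
```

On the server itself, stat -c '%U:%G %a' /etc/docker/daemon.json should report root:root 644.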

Location of Configuration Files and Data

  • Nginx Proxy: /root/nginx/compose.yml (Docker Compose definition for Nginx and Certbot).

  • Ollama Service: /etc/systemd/system/ollama.service (systemd unit file for the Ollama backend).

  • Docker Daemon: /etc/docker/daemon.json (runtime configuration for NVIDIA support).

  • Nginx Data: /data/nginx/ (root directory for Nginx logs, configs, and certs).

  • Nginx Snippets: /data/nginx/user_conf.d/ (custom Nginx configuration directory).

  • SSL Secrets: nginx_secrets Docker volume, mounted at /etc/letsencrypt (Let's Encrypt certificates).

  • Open-WebUI Data: Docker volume open-webui (persistent storage for the web interface).

Available Ports for Connection

  • 443 (HTTPS): External access to the Open-WebUI frontend via the Nginx proxy.

  • 8080 (HTTP): Internal access to the Open-WebUI container (used by Nginx proxy).

  • 11434 (HTTP): Internal access to the Ollama API endpoint.
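Once the stack is running, each port can be checked from the server itself. The /api/version and /health paths below are assumptions based on the respective projects' standard endpoints, not values stated in this document.

```shell
# Run on the server; all three should return successfully once deployed.
curl -fsS http://127.0.0.1:11434/api/version   # Ollama API
curl -fsS http://127.0.0.1:8080/health         # Open-WebUI (internal)
curl -fsS https://gemma<Server ID>.hostkey.in/ # external HTTPS via Nginx
```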

Starting, Stopping, and Updating

Ollama Service

  • Start: systemctl start ollama

  • Stop: systemctl stop ollama

  • Restart: systemctl restart ollama

  • Enable: systemctl enable ollama

Open-WebUI Container

  • Start: docker start open-webui

  • Stop: docker stop open-webui

  • Remove: docker rm -f open-webui

  • Restart: docker restart open-webui

Nginx-Certbot Container

  • Start/Update: Navigate to /root/nginx and execute docker compose up -d.

  • Stop: Navigate to /root/nginx and execute docker compose down.
