Deployment Overview of Gemma-3-27B on Server

Prerequisites and Basic Requirements

  • Operating System: Ubuntu Linux

  • Privileges: Root access or sudo privileges are required for system modifications, Docker installation, and service management.

  • Hardware: System must be equipped with NVIDIA GPUs to support the nvidia-container-runtime.

  • Network: Internet connectivity is required for pulling Docker images and downloading the Gemma3 model.

FQDN of the Final Panel

The application is accessible via the following Fully Qualified Domain Name (FQDN) on the hostkey.in domain:

  • Address: gemma<Server ID>.hostkey.in:443

  • Replace <Server ID> with the specific identifier assigned to your instance.

File and Directory Structure

The following directories and files contain critical configurations and data:

  • /root/nginx/: Directory containing the Nginx proxy and Certbot configuration files.

  • /root/nginx/compose.yml: Docker Compose file for the Nginx/Certbot stack.

  • /etc/systemd/system/ollama.service: Systemd service unit file for the Ollama backend.

  • /etc/systemd/system/ollama.service.bak: Backup of the original Ollama service file.

  • /etc/docker/daemon.json: Docker daemon configuration file defining the NVIDIA runtime.

  • /data/nginx/: Directory for Nginx data, including SSL certificates and user configurations.

  • /data/nginx/nginx-certbot.env: Environment variables for the Certbot service.

  • /data/nginx/user_conf.d/: Directory for custom Nginx configuration snippets.

  • /etc/letsencrypt/: External volume mount for SSL secrets managed by Nginx-Certbot.

  • /app/backend/data/: Internal volume mount for Open-WebUI persistent data.

Application Installation Process

The installation consists of three main components: the Ollama backend, the NVIDIA GPU runtime, and the Open-WebUI frontend.

Ollama Backend Setup

  1. Install Ollama using the official installation script.

  2. Create a system user named ollama.

  3. Modify the ollama.service file to expose the service on all network interfaces and set environment variables for network accessibility and performance:

    • OLLAMA_HOST=0.0.0.0

    • OLLAMA_ORIGINS=*

    • LLAMA_FLASH_ATTENTION=1

  4. Reload the systemd daemon and restart the Ollama service.

  5. Pull the specific model gemma3:27b using the command ollama pull gemma3:27b.
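Based on the variables listed above, the edited [Service] section of /etc/systemd/system/ollama.service might look like the following sketch. The ExecStart path, User/Group, and Restart lines are assumptions based on the unit file produced by the official install script; only the three Environment lines are confirmed by this document.

```ini
# /etc/systemd/system/ollama.service (excerpt, sketch)
[Service]
ExecStart=/usr/local/bin/ollama serve
User=ollama
Group=ollama
Restart=always
Environment="OLLAMA_HOST=0.0.0.0"
Environment="OLLAMA_ORIGINS=*"
Environment="LLAMA_FLASH_ATTENTION=1"
```

After editing, apply the change with systemctl daemon-reload followed by systemctl restart ollama (step 4).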

NVIDIA Container Toolkit Setup

  1. Install the nvidia-container-toolkit package.

  2. Configure the NVIDIA container runtime for Docker using nvidia-ctk runtime configure --runtime=docker.

  3. Update the /etc/docker/daemon.json file to set nvidia as the default runtime.

  4. Restart the Docker service to apply runtime changes.
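After step 3, /etc/docker/daemon.json typically resembles the sketch below: the runtimes block is what nvidia-ctk runtime configure writes, and default-runtime is the manual addition described above.

```json
{
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "args": [],
      "path": "nvidia-container-runtime"
    }
  }
}
```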

Open-WebUI Frontend Setup

  1. Remove any existing container named open-webui.

  2. Deploy the Open-WebUI container using the ghcr.io/open-webui/open-webui:cuda image.

  3. Configure the container to expose port 8080 and utilize all available GPUs.

  4. Set the environment variable OLLAMA_BASE_URLS to http://host.docker.internal:11434.

  5. Set the ENV variable to dev.
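Steps 1-5 map onto a single docker run invocation. The sketch below assembles that invocation as an argument array and prints it for review rather than executing it; the --restart always and --add-host flags are assumptions inferred from the container description later in this document, so verify the full command against your environment before running it.

```shell
# Sketch: the docker run command implied by steps 1-5, printed for review.
set -eu
args=(docker run -d
  --name open-webui
  --restart always
  --gpus all
  -p 8080:8080
  --add-host=host.docker.internal:host-gateway
  -v open-webui:/app/backend/data
  -e ENV=dev
  -e OLLAMA_BASE_URLS=http://host.docker.internal:11434
  ghcr.io/open-webui/open-webui:cuda)
printf '%s ' "${args[@]}"; echo
```

Remove any existing container first (docker rm -f open-webui, step 1) so the name is free.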

Access Rights and Security

  • Docker User: The ollama user is created as a system user to manage the backend service.

  • Ollama Origins: The OLLAMA_ORIGINS environment variable is set to *, allowing requests from any origin.

  • Nginx/Certbot: The Nginx container runs in host network mode to handle HTTPS traffic directly on the server's network interface.

  • Service Persistence: Both the Ollama service and the Open-WebUI container are configured to restart automatically on failure or system reboot.

Databases

  • Storage: The Open-WebUI application stores its data in a Docker volume named open-webui, which is mapped to /app/backend/data inside the container.

  • Connection: No external database connection is configured; the application uses its internal storage mechanism within the mounted volume.

Docker Containers and Their Deployment

Two distinct Docker deployments are utilized in this architecture:

Open-WebUI Container

  • Image: ghcr.io/open-webui/open-webui:cuda

  • Name: open-webui

  • Ports: Exposes internal port 8080 mapped to the host port 8080.

  • GPU: Configured with --gpus all to utilize NVIDIA hardware acceleration.

  • Volumes: Mounts the open-webui volume to /app/backend/data.

  • Hosts: Adds the DNS entry host.docker.internal pointing to the host gateway.

  • Environment Variables:

    • ENV=dev

    • OLLAMA_BASE_URLS=http://host.docker.internal:11434

Nginx-Certbot Container

  • Image: jonasal/nginx-certbot:latest

  • Deployment Method: Docker Compose located at /root/nginx/compose.yml.

  • Volumes:

    • nginx_secrets (external volume) mapped to /etc/letsencrypt.

    • Host path /data/nginx/user_conf.d mapped to /etc/nginx/user_conf.d.

  • Environment:

    • [email protected]

    • Loads additional variables from /data/nginx/nginx-certbot.env.

  • Network Mode: host
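Put together, a /root/nginx/compose.yml matching this description might look like the following sketch. The service name and restart policy are assumptions; the image, volumes, environment, and network mode follow the list above.

```yaml
# Sketch of /root/nginx/compose.yml
services:
  nginx:
    image: jonasal/nginx-certbot:latest
    restart: unless-stopped
    network_mode: host
    environment:
      - [email protected]
    env_file:
      - /data/nginx/nginx-certbot.env
    volumes:
      - nginx_secrets:/etc/letsencrypt
      - /data/nginx/user_conf.d:/etc/nginx/user_conf.d

volumes:
  nginx_secrets:
    external: true
```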

Proxy Servers

  • Software: Nginx with Certbot for SSL certificate management.

  • Image: jonasal/nginx-certbot:latest

  • Configuration:

  • Manages SSL certificates for the domain gemma<Server ID>.hostkey.in.

  • Routes traffic from the external port 443 (HTTPS) to the internal port 8080.

  • Uses the host network mode, binding directly to the server's network stack.

  • Email: SSL renewal notifications are configured for [email protected].
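A server block implementing this routing would live in /data/nginx/user_conf.d/. The snippet below is a hypothetical sketch: the certificate directory name (live/gemma) and the WebSocket upgrade headers are assumptions based on the jonasal/nginx-certbot conventions and typical Open-WebUI proxying, not values confirmed by this document.

```nginx
# Hypothetical /data/nginx/user_conf.d/gemma.conf
server {
    listen 443 ssl;
    server_name gemma<Server ID>.hostkey.in;

    # Paths managed by nginx-certbot; the "gemma" cert name is an assumption.
    ssl_certificate     /etc/letsencrypt/live/gemma/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/gemma/privkey.pem;

    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
        # WebSocket support for the Open-WebUI chat interface.
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
```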

Permission Settings

  • Nginx Directory: The /root/nginx directory is owned by root:root with 0755 permissions.

  • Compose File: The /root/nginx/compose.yml file is owned by root:root with 0644 permissions.

  • Docker Daemon: The /etc/docker/daemon.json file is owned by root with 0644 permissions.
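These settings can be audited with stat. As a minimal self-contained illustration, the same check applied to a temporary file:

```shell
# Apply a 0644 mode and read it back, as one would audit the files above.
set -eu
f="$(mktemp)"
chmod 0644 "$f"
stat -c '%a' "$f"   # prints: 644
rm -f "$f"
```

On the server itself, stat -c '%U:%G %a' /etc/docker/daemon.json should report root:root 644.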

Location of Configuration Files and Data

  • Nginx Proxy: /root/nginx/compose.yml (Docker Compose definition for Nginx and Certbot).

  • Ollama Service: /etc/systemd/system/ollama.service (systemd unit file for the Ollama backend).

  • Docker Daemon: /etc/docker/daemon.json (runtime configuration for NVIDIA support).

  • Nginx Data: /data/nginx/ (root directory for Nginx logs, configs, and certs).

  • Nginx Snippets: /data/nginx/user_conf.d/ (custom Nginx configuration directory).

  • SSL Secrets: nginx_secrets Docker volume, mounted at /etc/letsencrypt (Let's Encrypt certificates).

  • Open-WebUI Data: Docker volume open-webui (persistent storage for the web interface).

Available Ports for Connection

  • 443 (HTTPS): External access to the Open-WebUI frontend via the Nginx proxy.

  • 8080 (HTTP): Internal access to the Open-WebUI container (used by Nginx proxy).

  • 11434 (HTTP): Internal access to the Ollama API endpoint.
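Once the stack is running, each port can be checked from the server itself. The /api/version and /health paths below are assumptions based on the respective projects' standard endpoints, not values stated in this document.

```shell
# Run on the server; all three should return successfully once deployed.
curl -fsS http://127.0.0.1:11434/api/version   # Ollama API
curl -fsS http://127.0.0.1:8080/health         # Open-WebUI (internal)
curl -fsS https://gemma<Server ID>.hostkey.in/ # external HTTPS via Nginx
```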

Starting, Stopping, and Updating

Ollama Service

  • Start: systemctl start ollama

  • Stop: systemctl stop ollama

  • Restart: systemctl restart ollama

  • Enable: systemctl enable ollama

Open-WebUI Container

  • Start: docker start open-webui

  • Stop: docker stop open-webui

  • Remove: docker rm -f open-webui

  • Restart: docker restart open-webui

Nginx-Certbot Container

  • Start/Update: Navigate to /root/nginx and execute docker compose up -d.

  • Stop: Navigate to /root/nginx and execute docker compose down.
