# Deployment Overview of Gemma-3-27B on a Server

## Prerequisites and Basic Requirements

- Operating System: Ubuntu Linux
- Privileges: Root access or `sudo` privileges are required for system modifications, Docker installation, and service management.
- Hardware: The system must be equipped with NVIDIA GPUs to support the `nvidia-container-runtime`.
- Network: Internet connectivity is required for pulling Docker images and downloading the Gemma3 model.
## FQDN of the Final Panel

The application is accessible via the following Fully Qualified Domain Name (FQDN) on the hostkey.in domain:

- Address: `gemma<Server ID>.hostkey.in:443`
- Replace `<Server ID>` with the specific identifier assigned to your instance.
## File and Directory Structure

The following directories and files contain critical configurations and data:

- `/root/nginx/`: Directory containing the Nginx proxy and Certbot configuration files.
- `/root/nginx/compose.yml`: Docker Compose file for the Nginx/Certbot stack.
- `/etc/systemd/system/ollama.service`: Systemd service unit file for the Ollama backend.
- `/etc/systemd/system/ollama.service.bak`: Backup of the original Ollama service file.
- `/etc/docker/daemon.json`: Docker daemon configuration file defining the NVIDIA runtime.
- `/data/nginx/`: Directory for Nginx data, including SSL certificates and user configurations.
- `/data/nginx/nginx-certbot.env`: Environment variables for the Certbot service.
- `/data/nginx/user_conf.d/`: Directory for custom Nginx configuration snippets.
- `/etc/letsencrypt/`: External volume mount for SSL secrets managed by Nginx-Certbot.
- `/app/backend/data/`: Internal volume mount for Open-WebUI persistent data.
## Application Installation Process

The installation consists of three main components: the Ollama backend, the NVIDIA GPU runtime, and the Open-WebUI frontend.

### Ollama Backend Setup

- Install Ollama using the official installation script.
- Create a system user named `ollama`.
- Modify the `ollama.service` file to expose the service on all network interfaces and enable specific environment variables for performance and security:
    - `OLLAMA_HOST=0.0.0.0`
    - `OLLAMA_ORIGINS=*`
    - `LLAMA_FLASH_ATTENTION=1`
- Reload the systemd daemon and restart the Ollama service.
- Pull the model `gemma3:27b` using the command `ollama pull gemma3:27b`.
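The service-file changes above can also be expressed as a systemd drop-in. A minimal sketch (the drop-in path is an assumption; this deployment edits `ollama.service` directly, but the effect is equivalent):

```ini
# /etc/systemd/system/ollama.service.d/override.conf (hypothetical drop-in path)
[Service]
Environment="OLLAMA_HOST=0.0.0.0"
Environment="OLLAMA_ORIGINS=*"
Environment="LLAMA_FLASH_ATTENTION=1"
```

After editing, apply the changes with `systemctl daemon-reload` followed by `systemctl restart ollama`.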
## NVIDIA Container Toolkit Setup

- Install the `nvidia-container-toolkit` package.
- Configure the NVIDIA container runtime for Docker using `nvidia-ctk runtime configure --runtime=docker`.
- Update the `/etc/docker/daemon.json` file to set `nvidia` as the default runtime.
- Restart the Docker service to apply runtime changes.
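After running `nvidia-ctk runtime configure --runtime=docker` and adding the default-runtime key, `/etc/docker/daemon.json` typically looks like the following (a sketch; the file on a given server may contain additional settings):

```json
{
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "args": [],
      "path": "nvidia-container-runtime"
    }
  }
}
```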
## Open-WebUI Frontend Setup

- Remove any existing container named `open-webui`.
- Deploy the Open-WebUI container using the `ghcr.io/open-webui/open-webui:cuda` image.
- Configure the container to expose port `8080` and utilize all available GPUs.
- Set the environment variable `OLLAMA_BASE_URLS` to `http://host.docker.internal:11434`.
- Set the `ENV` variable to `dev`.
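Taken together, the steps above correspond to a `docker run` invocation along these lines (a sketch under the stated settings; the exact flags used during installation may differ):

```shell
# Illustrative deployment command; the restart policy is assumed from the
# "Service Persistence" behavior described in this document
docker rm -f open-webui 2>/dev/null
docker run -d \
  --name open-webui \
  --restart always \
  --gpus all \
  -p 8080:8080 \
  --add-host host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  -e ENV=dev \
  -e OLLAMA_BASE_URLS=http://host.docker.internal:11434 \
  ghcr.io/open-webui/open-webui:cuda
```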
## Access Rights and Security

- Docker User: The `ollama` user is created as a system user to manage the backend service.
- Ollama Origins: The `OLLAMA_ORIGINS` environment variable is set to `*`, allowing requests from any origin.
- Nginx/Certbot: The Nginx container runs in `host` network mode to handle HTTPS traffic directly on the server's network interface.
- Service Persistence: Both the Ollama service and the Open-WebUI container are configured to restart automatically on failure or system reboot.
## Databases

- Storage: The Open-WebUI application stores its data in a Docker volume named `open-webui`, which is mapped to `/app/backend/data` inside the container.
- Connection: No external database connection is configured; the application uses its internal storage mechanism within the mounted volume.
## Docker Containers and Their Deployment

Two distinct Docker deployments are used in this architecture:

### Open-WebUI Container

- Image: `ghcr.io/open-webui/open-webui:cuda`
- Name: `open-webui`
- Ports: Exposes internal port `8080`, mapped to host port `8080`.
- GPU: Configured with `--gpus all` to utilize NVIDIA hardware acceleration.
- Volumes: Mounts the `open-webui` volume at `/app/backend/data`.
- Hosts: Adds the DNS entry `host.docker.internal` pointing to the host gateway.
- Environment Variables:
    - `ENV=dev`
    - `OLLAMA_BASE_URLS=http://host.docker.internal:11434`
### Nginx-Certbot Container

- Image: `jonasal/nginx-certbot:latest`
- Deployment Method: Docker Compose file located at `/root/nginx/compose.yml`.
- Volumes:
    - `nginx_secrets` (external volume) mapped to `/etc/letsencrypt`.
    - Host path `/data/nginx/user_conf.d` mapped to `/etc/nginx/user_conf.d`.
- Environment:
    - Loads additional variables from `/data/nginx/nginx-certbot.env`.
- Network Mode: `host`
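A compose file consistent with the settings above might look like this (a sketch, not the verbatim contents of `/root/nginx/compose.yml`; the restart policy is an assumption):

```yaml
services:
  nginx:
    image: jonasal/nginx-certbot:latest
    restart: unless-stopped        # assumed restart policy
    network_mode: host
    env_file:
      - /data/nginx/nginx-certbot.env
    volumes:
      - nginx_secrets:/etc/letsencrypt
      - /data/nginx/user_conf.d:/etc/nginx/user_conf.d

volumes:
  nginx_secrets:
    external: true
```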
## Proxy Servers

- Software: Nginx with Certbot for SSL certificate management.
- Image: `jonasal/nginx-certbot:latest`
- Configuration:
    - Manages SSL certificates for the domain `gemma<Server ID>.hostkey.in`.
    - Routes traffic from external port `443` (HTTPS) to internal port `8080`.
    - Uses `host` network mode, binding directly to the server's network stack.
- Email: SSL renewal notifications are configured for [email protected].
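The proxy behavior described above would typically be defined in a snippet under `/data/nginx/user_conf.d/`. A minimal sketch (directive values and certificate paths are assumptions based on the Nginx-Certbot image conventions):

```nginx
# Hypothetical snippet in /data/nginx/user_conf.d/
server {
    listen 443 ssl;
    server_name gemma<Server ID>.hostkey.in;

    ssl_certificate     /etc/letsencrypt/live/gemma<Server ID>.hostkey.in/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/gemma<Server ID>.hostkey.in/privkey.pem;

    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
        # WebSocket upgrade headers, which the Open-WebUI interface relies on
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
```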
## Permission Settings

- Nginx Directory: The `/root/nginx` directory is owned by `root:root` with `0755` permissions.
- Compose File: The `/root/nginx/compose.yml` file is owned by `root:root` with `0644` permissions.
- Docker Daemon: The `/etc/docker/daemon.json` file is owned by `root` with `0644` permissions.
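The ownership and permission settings listed above can be reproduced with standard commands:

```shell
chown root:root /root/nginx /root/nginx/compose.yml /etc/docker/daemon.json
chmod 0755 /root/nginx
chmod 0644 /root/nginx/compose.yml /etc/docker/daemon.json
```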
## Location of Configuration Files and Data

| Component | Configuration File/Path | Description |
|---|---|---|
| Nginx Proxy | `/root/nginx/compose.yml` | Docker Compose definition for Nginx and Certbot. |
| Ollama Service | `/etc/systemd/system/ollama.service` | Systemd unit file for the Ollama backend. |
| Docker Daemon | `/etc/docker/daemon.json` | Runtime configuration for NVIDIA support. |
| Nginx Data | `/data/nginx/` | Root directory for Nginx logs, configs, and certs. |
| Nginx User Configs | `/data/nginx/user_conf.d/` | Custom Nginx configuration snippets. |
| SSL Secrets | Docker volume `nginx_secrets` | SSL certificates, mounted at `/etc/letsencrypt`. |
| Open-WebUI Data | Docker volume `open-webui` | Persistent storage for the web interface. |
## Available Ports for Connection

- `443` (HTTPS): External access to the Open-WebUI frontend via the Nginx proxy.
- `8080` (HTTP): Internal access to the Open-WebUI container (used by the Nginx proxy).
- `11434` (HTTP): Internal access to the Ollama API endpoint.
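The services on these ports can be checked locally from the server (the Ollama `/api/version` endpoint is part of its public API; the exact hostname depends on your `<Server ID>`):

```shell
curl -s  http://localhost:11434/api/version    # Ollama backend
curl -sI http://localhost:8080                 # Open-WebUI (internal)
curl -sI https://gemma<Server ID>.hostkey.in   # HTTPS via the Nginx proxy
```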
## Starting, Stopping, and Updating

### Ollama Service

- Start: `systemctl start ollama`
- Stop: `systemctl stop ollama`
- Restart: `systemctl restart ollama`
- Enable: `systemctl enable ollama`

### Open-WebUI Container

- Start: `docker start open-webui`
- Stop: `docker stop open-webui`
- Remove: `docker rm -f open-webui`
- Restart: `docker restart open-webui`

### Nginx-Certbot Container

- Start/Update: Navigate to `/root/nginx` and run `docker compose up -d`.
- Stop: Navigate to `/root/nginx` and run `docker compose down`.
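Updating the Open-WebUI frontend to a newer image follows the usual remove-and-recreate pattern (a sketch; recreate the container with the same flags used during installation):

```shell
docker pull ghcr.io/open-webui/open-webui:cuda   # fetch the latest image
docker rm -f open-webui                          # remove the old container
# Re-run the original `docker run` deployment command to recreate the
# container; the `open-webui` volume preserves all user data.
```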