Deployment Overview of Qwen3-32B on Server

Prerequisites and Basic Requirements

The deployment requires a Linux server running Ubuntu with root privileges. The following components must be available on the host system:

  • Docker Engine installed and running.

  • Systemd service manager for managing background services.

  • Network access to download the Ollama installer, the qwen3:32b model, and Docker images.

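  • An NVIDIA GPU with CUDA-capable drivers installed. The Open WebUI container is started with --gpus all, which also requires the NVIDIA Container Toolkit for Docker.

  • A DNS record for the domain that points to the host server, which is required for SSL certificate issuance.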
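
The prerequisites above can be checked before deployment with a short script. The following is a minimal sketch: the helper name missing_tools is hypothetical, curl is assumed as the download tool, and the presence of nvidia-smi is assumed here to indicate an installed GPU driver. None of these checks is prescribed by the deployment itself.

```shell
# Print the name of every command from the argument list that is not on PATH.
# missing_tools is a hypothetical helper, not part of the deployment.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
  return 0
}

# docker and systemctl are required by the deployment; curl and nvidia-smi
# are assumptions (download tool and GPU driver indicator, respectively).
missing="$(missing_tools docker systemctl curl nvidia-smi)"
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose so the names print on one line.
  echo "Missing prerequisites:" $missing
else
  echo "All prerequisite commands found."
fi
```

An empty result only shows that the commands exist; it does not verify that the Docker daemon is running or that the GPU is usable from inside a container.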

FQDN of the Final Panel

The application is accessible via the following Fully Qualified Domain Name (FQDN) format:

  • qwen3-32b<Server ID>.hostkey.in:443

Replace <Server ID> with the specific identifier assigned to the server instance. The service operates over HTTPS on port 443.
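
As an illustration of the naming pattern, the sketch below builds the FQDN from a Server ID. The helper name panel_fqdn and the Server ID 12345 are placeholders chosen for this example only.

```shell
# Build the panel FQDN from a Server ID, following the documented pattern
# qwen3-32b<Server ID>.hostkey.in. panel_fqdn is a hypothetical helper name.
panel_fqdn() {
  printf 'qwen3-32b%s.hostkey.in\n' "$1"
}

# 12345 is a placeholder Server ID used only for illustration.
panel_fqdn 12345

# On a deployed server, the HTTPS endpoint could then be probed with:
#   curl -sSI "https://$(panel_fqdn 12345)/"
```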

File and Directory Structure

The deployment utilizes the following directory structure for configuration, data, and certificates:

  • /root/nginx/: Contains the Docker Compose file for the proxy and SSL management.

  • /root/nginx/compose.yml: The Docker Compose configuration for the Nginx and Certbot services.

  • /data/nginx/user_conf.d/: Stores custom Nginx configuration files for the specific domain.

  • /data/nginx/nginx-certbot.env: Environment file containing settings for the Nginx-Certbot container.

  • /etc/systemd/system/ollama.service: Systemd unit file for the Ollama service.

  • /usr/share/ollama/.ollama/models/: Storage location for the downloaded qwen3:32b model.

  • /var/lib/docker/volumes/open-webui/: Docker volume storing the backend data for the Open WebUI application.

Application Installation Process

The application consists of three main components: the Ollama inference engine, the Open WebUI interface, and the Nginx proxy with SSL.

  1. Ollama Installation:

    • The Ollama package is installed via the official shell script.

    • The ollama system user is created to manage the service.

    • The ollama.service is configured with specific environment variables to allow external connections and enable flash attention.

    • The qwen3:32b model is pulled and stored locally.

  2. Open WebUI Deployment:

    • The Open WebUI container is deployed using the ghcr.io/open-webui/open-webui:cuda image.

    • The container is configured to connect to the local Ollama instance running on the host.

  3. Proxy and SSL Setup:

    • An Nginx container with Certbot is deployed to handle SSL termination and reverse proxying.

    • The configuration is generated to route traffic from the external domain to the internal Open WebUI service.
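
The Ollama part of this process could look roughly like the sketch below. It rests on several assumptions: the environment variables are written to a systemd drop-in rather than into ollama.service itself, the variables OLLAMA_HOST and OLLAMA_FLASH_ATTENTION are taken to be the ones behind "allow external connections" and "enable flash attention", and the function names write_ollama_env and setup_ollama are hypothetical. The actual deployment may set these values differently.

```shell
# Write a systemd drop-in with the assumed Ollama environment settings.
# $1 = drop-in directory (defaults to the standard location for ollama.service).
write_ollama_env() {
  dropin_dir="${1:-/etc/systemd/system/ollama.service.d}"
  mkdir -p "$dropin_dir"
  cat > "$dropin_dir/override.conf" <<'EOF'
[Service]
# Listen on all interfaces instead of only 127.0.0.1
Environment="OLLAMA_HOST=0.0.0.0"
# Enable flash attention
Environment="OLLAMA_FLASH_ATTENTION=1"
EOF
}

# Full sequence, meant to be called as root on the target server.
# The function is only defined here, not executed.
setup_ollama() {
  curl -fsSL https://ollama.com/install.sh | sh   # official install script
  write_ollama_env                                # assumed drop-in approach
  systemctl daemon-reload
  systemctl enable ollama
  systemctl restart ollama
  ollama pull qwen3:32b                           # download the model
}
```

Calling setup_ollama as root would run the whole sequence. write_ollama_env can also be pointed at a scratch directory first to inspect the generated file.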

Access Rights and Security

  • Firewall: The server must allow incoming traffic on port 443 (HTTPS) and port 80 (HTTP for SSL validation).

  • Users: The ollama system user is created to run the inference engine. The Docker containers run with specific privileges required for GPU access.

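  • Restrictions: The Ollama service listens on 0.0.0.0 (all interfaces), and the Open WebUI container publishes port 8080 on the host. The Nginx proxy on port 443 is the only intended external entry point. Direct access to the Ollama port (11434) or the Open WebUI port (8080) is not intended for external users, so the firewall should not allow these ports from outside.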
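
The firewall rules described above can be sketched as follows, assuming ufw is the firewall front end (the deployment does not name one). The function only prints the commands so they can be reviewed first; print_ufw_rules is a hypothetical name.

```shell
# Print (do not run) ufw commands that match the documented port policy:
# 80 and 443 open to the outside, 8080 and 11434 closed.
# ufw is an assumption; the deployment does not specify a firewall tool.
print_ufw_rules() {
  for port in 80 443; do
    echo "ufw allow ${port}/tcp"
  done
  for port in 8080 11434; do
    echo "ufw deny ${port}/tcp"
  done
}

print_ufw_rules
```

After review, the output could be applied as root with print_ufw_rules | sh. Ports published by Docker, such as 8080 here, may bypass ufw rules because Docker manages its own iptables chains, so it is worth verifying reachability from an external host afterwards.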

Databases

The Open WebUI application utilizes a local Docker volume for data persistence.

  • Storage Location: /var/lib/docker/volumes/open-webui/

  • Connection Method: The application stores data directly within the mounted volume; no external database server is required.

  • Settings: The environment variable ENV is set to dev within the container configuration.

Docker Containers and Their Deployment

Two primary Docker containers are deployed as part of this solution:

  1. Open WebUI:

    • Image: ghcr.io/open-webui/open-webui:cuda

    • Command:

        docker run -d -p 8080:8080 --gpus all \
          --add-host=host.docker.internal:host-gateway \
          -v open-webui:/app/backend/data \
          --name open-webui \
          -e ENV='dev' \
          -e OLLAMA_BASE_URLS='http://host.docker.internal:11434' \
          --restart always ghcr.io/open-webui/open-webui:cuda

    • Ports: Publishes container port 8080 on host port 8080 (-p 8080:8080); the Nginx proxy connects to it at 127.0.0.1:8080.

    • Volumes: Mounts the open-webui volume to /app/backend/data.

  2. Nginx-Certbot:

    • Image: jonasal/nginx-certbot:latest

    • Deployment Method: Managed via the Docker Compose file located at /root/nginx/compose.yml.

    • Network Mode: Host.

    • Volumes:

      • nginx_secrets (external) mounted to /etc/letsencrypt.

      • /data/nginx/user_conf.d mounted to /etc/nginx/user_conf.d.
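
A Compose file consistent with the Nginx-Certbot settings listed above might look like the one generated by the sketch below. The image, network mode, and volume mounts come from this document; the service name nginx, the restart policy, and the use of env_file to load /data/nginx/nginx-certbot.env are assumptions, as is the helper name write_proxy_compose.

```shell
# Write a compose.yml for the Nginx-Certbot proxy.
# $1 = target directory (defaults to /root/nginx as used by the deployment).
write_proxy_compose() {
  compose_dir="${1:-/root/nginx}"
  mkdir -p "$compose_dir"
  cat > "$compose_dir/compose.yml" <<'EOF'
services:
  nginx:                      # service name is an assumption
    image: jonasal/nginx-certbot:latest
    restart: unless-stopped   # restart policy is an assumption
    network_mode: host
    env_file:                 # assumed way of loading the environment file
      - /data/nginx/nginx-certbot.env
    volumes:
      - nginx_secrets:/etc/letsencrypt
      - /data/nginx/user_conf.d:/etc/nginx/user_conf.d

volumes:
  nginx_secrets:
    external: true
EOF
}
```

Because nginx_secrets is declared as external, Compose expects the volume to exist already; it can be created beforehand with docker volume create nginx_secrets.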

Proxy Servers

The Nginx proxy handles SSL termination and routing for the application.

  • Software: Nginx with Certbot (via Docker container).

  • SSL: Managed automatically by Certbot using the jonasal/nginx-certbot image.

  • Custom Domain: Configured for the hostkey.in zone with the prefix qwen3-32b.

  • Configuration:

    • The proxy passes requests from the external domain to http://127.0.0.1:8080.

    • The configuration file is located at /data/nginx/user_conf.d/qwen3-32b<Server ID>.hostkey.in.conf.

    • The proxy_pass directive is dynamically updated to point to the internal Open WebUI service.
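
To make the proxy configuration more concrete, the sketch below renders a server block for a given FQDN. Only the proxy_pass target and the file location come from this document. The certificate paths under /etc/letsencrypt/live/, the WebSocket upgrade headers, and the helper name render_nginx_conf are assumptions, so the generated file on a real server may differ.

```shell
# Print an Nginx server block that forwards HTTPS traffic for one FQDN to
# the Open WebUI container. render_nginx_conf is a hypothetical helper.
# $1 = fully qualified domain name of the panel
render_nginx_conf() {
  fqdn="$1"
  cat <<EOF
server {
    listen 443 ssl;
    server_name ${fqdn};

    # Assumed certificate layout inside the nginx_secrets volume
    ssl_certificate     /etc/letsencrypt/live/${fqdn}/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/${fqdn}/privkey.pem;

    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_http_version 1.1;
        proxy_set_header Host \$host;
        proxy_set_header X-Forwarded-Proto \$scheme;
        # WebSocket upgrade headers (assumed to be needed by the web UI)
        proxy_set_header Upgrade \$http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
EOF
}

# Placeholder domain used only for illustration.
render_nginx_conf qwen3-32b12345.hostkey.in
```

The output could be redirected into /data/nginx/user_conf.d/ under a file name matching the domain, followed by a restart of the proxy container.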

Permission Settings

  • Nginx Directory: /root/nginx is owned by root:root with permissions 0755.

  • Compose File: /root/nginx/compose.yml is owned by root:root with permissions 0644.

  • Ollama Service: The ollama service runs under the ollama system user.

  • Docker Volumes: Docker manages permissions for the open-webui and nginx_secrets volumes internally.
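
If the ownership or modes of the proxy directory ever drift, they could be restored with something like the following sketch. The helper name apply_proxy_permissions is hypothetical, and the ownership step is skipped when the script is not run as root.

```shell
# Reapply the documented permissions to the proxy directory and Compose file.
# $1 = proxy directory (defaults to /root/nginx).
apply_proxy_permissions() {
  dir="${1:-/root/nginx}"
  chmod 0755 "$dir"
  chmod 0644 "$dir/compose.yml"
  # Changing ownership needs root; skip it otherwise so the sketch still runs.
  if [ "$(id -u)" -eq 0 ]; then
    chown root:root "$dir" "$dir/compose.yml"
  fi
}

# Usage on the server, as root:
#   apply_proxy_permissions
```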

Location of Configuration Files and Data

  • Nginx Proxy Config: /root/nginx/compose.yml

  • Nginx User Config: /data/nginx/user_conf.d/qwen3-32b<Server ID>.hostkey.in.conf

  • Nginx Environment: /data/nginx/nginx-certbot.env

  • Ollama Service File: /etc/systemd/system/ollama.service

  • Model Data: /usr/share/ollama/.ollama/models/

  • WebUI Data: /var/lib/docker/volumes/open-webui/

Available Ports for Connection

  • Port 443: HTTPS traffic for the Open WebUI interface (External).

  • Port 80: HTTP traffic for SSL certificate validation (External).

  • Port 8080: Open WebUI, reached by the Nginx proxy at 127.0.0.1:8080 (Internal; not intended for direct access from the internet).

  • Port 11434: Ollama API, reached by the Open WebUI container (Internal; not intended for direct access from the internet).

Starting, Stopping, and Updating

The services are managed using systemctl for Ollama, docker compose for the Nginx proxy, and docker for the Open WebUI container.

  • Ollama Service:

    • Start: systemctl start ollama

    • Stop: systemctl stop ollama

    • Restart: systemctl restart ollama

    • Enable on boot: systemctl enable ollama

  • Nginx Proxy:

    • Start/Update: docker compose up -d (executed from /root/nginx)

    • Stop: docker compose down (executed from /root/nginx)

  • Open WebUI Container:

    • Start: docker start open-webui

    • Stop: docker stop open-webui

    • Restart: docker restart open-webui

    • Update: Pull the latest image, remove the existing container, and recreate it using the docker run command provided in the Docker Containers and Their Deployment section.
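
The Open WebUI update step can be sketched as a small function that reuses the documented docker run command. The function name update_open_webui and the DOCKER override (which allows a dry run with DOCKER=echo) are hypothetical conveniences, not part of the deployment.

```shell
# DOCKER can be overridden (for example DOCKER=echo) to print the commands
# instead of running them; this dry-run switch is a hypothetical convenience.
DOCKER="${DOCKER:-docker}"
IMAGE="ghcr.io/open-webui/open-webui:cuda"

# Pull the current image, remove the old container, and recreate it with the
# same options as the original deployment. Defined only, not executed here.
update_open_webui() {
  $DOCKER pull "$IMAGE"
  $DOCKER stop open-webui || true   # ignore errors if it is not running
  $DOCKER rm open-webui || true     # ignore errors if it does not exist
  $DOCKER run -d -p 8080:8080 --gpus all \
    --add-host=host.docker.internal:host-gateway \
    -v open-webui:/app/backend/data \
    --name open-webui \
    -e ENV='dev' \
    -e OLLAMA_BASE_URLS='http://host.docker.internal:11434' \
    --restart always "$IMAGE"
}

# Usage on the server:
#   update_open_webui
```

Because the application data lives in the named open-webui volume, removing and recreating the container in this way should leave existing data in place.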
