Deployment Overview of Qwen3-32B on Server

Prerequisites and Basic Requirements

The deployment requires a server running the Ubuntu operating system. The following conditions must be met before proceeding:

  • Root privileges or sudo access are required to install system packages and manage services.
  • Docker must be installed and running on the server to host the web interface and proxy components.
  • The server must have access to the internet to download the Ollama installer, the Qwen3-32B model, and Docker images.
  • Port 8080 must be available for the Open WebUI application.
  • Ports 80 and 443 are required for the Nginx proxy and SSL certificate management.
  • The system must have a GPU with CUDA support to run the qwen3:32b model efficiently.
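A quick pre-flight check along these lines can confirm the requirements above before proceeding (the commands are standard Ubuntu tooling, but exact output varies by system):

```shell
# Verify the GPU and CUDA driver are visible
nvidia-smi

# Verify Docker is installed and the daemon is running
docker info --format '{{.ServerVersion}}'

# Confirm ports 80, 443, and 8080 are not already in use
ss -tlnp | grep -E ':(80|443|8080)\b' || echo "ports 80/443/8080 are free"

# Confirm outbound internet access (needed for the installer, model, and images)
curl -fsI https://ollama.com >/dev/null && echo "internet reachable"
```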

File and Directory Structure

The application utilizes specific directories for configuration, data storage, and certificates:

  • /root/nginx/: Contains the Docker Compose configuration for the Nginx proxy and Certbot.
  • /root/nginx/compose.yml: The Docker Compose file defining the Nginx service.
  • /data/nginx/user_conf.d/: Stores custom Nginx configuration files, including the host-specific configuration for the application.
  • /data/nginx/nginx-certbot.env: Environment file containing settings for the Nginx-Certbot container.
  • /etc/systemd/system/ollama.service: Systemd unit file for the Ollama service.
  • /usr/share/ollama/.ollama/models/: Default storage location for the downloaded Qwen3-32B model.
  • /var/lib/docker/volumes/open-webui/: Docker volume storing the persistent data for the Open WebUI application.

Application Installation Process

The deployment involves installing the Ollama backend, pulling the specific AI model, and launching the Open WebUI frontend via Docker.

  1. Install Ollama: The Ollama package is installed using the official installation script.
    curl -fsSL https://ollama.com/install.sh | sh
    
  2. Configure Ollama Service: The ollama.service file is updated to expose the service on all network interfaces and enable flash attention. The following environment variables are set:
    • OLLAMA_HOST=0.0.0.0
    • OLLAMA_ORIGINS=*
    • OLLAMA_FLASH_ATTENTION=1
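One way to apply these settings is a systemd drop-in (created with systemctl edit ollama), which overrides the installed unit file without editing it directly. The sketch below mirrors the variable list above; the file path is the standard drop-in location:

```ini
# /etc/systemd/system/ollama.service.d/override.conf
[Service]
Environment="OLLAMA_HOST=0.0.0.0"
Environment="OLLAMA_ORIGINS=*"
Environment="OLLAMA_FLASH_ATTENTION=1"
```

After saving the drop-in, the service must be reloaded and restarted for the variables to take effect (see the service management section below).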
  3. Download the Model: The qwen3:32b model is pulled into the local Ollama repository.
    ollama pull qwen3:32b
    
  4. Launch Open WebUI: The Open WebUI container is started with CUDA support, mapping port 8080 and connecting to the local Ollama instance.

Docker Containers and Their Deployment

Two primary Docker components are deployed: the Open WebUI application and the Nginx proxy with Certbot.

Open WebUI Container

The Open WebUI container is deployed using the following docker run command parameters:

  • Image: ghcr.io/open-webui/open-webui:cuda
  • Container Name: open-webui
  • Ports: Maps host port 8080 to container port 8080.
  • GPU Access: The --gpus all flag enables GPU acceleration.
  • Host Resolution: The --add-host=host.docker.internal:host-gateway flag allows the container to reach the host machine.
  • Volumes: A named volume open-webui is mounted to /app/backend/data for data persistence.
  • Environment Variables:
    • ENV=dev
    • OLLAMA_BASE_URLS=http://host.docker.internal:11434
  • Restart Policy: Set to always to ensure the container restarts automatically.
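Assembled from the parameters listed above, the full docker run invocation looks like this:

```shell
docker run -d \
  --name open-webui \
  --gpus all \
  -p 8080:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  -e ENV=dev \
  -e OLLAMA_BASE_URLS=http://host.docker.internal:11434 \
  --restart always \
  ghcr.io/open-webui/open-webui:cuda
```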

Nginx and Certbot Container

The Nginx proxy is managed via Docker Compose located in /root/nginx/compose.yml:

  • Image: jonasal/nginx-certbot:latest
  • Network Mode: host
  • Volumes:
    • nginx_secrets (external) mounted to /etc/letsencrypt.
    • /data/nginx/user_conf.d mounted to /etc/nginx/user_conf.d.
  • Environment:
    • [email protected]
    • Loads additional settings from /data/nginx/nginx-certbot.env.
  • Restart Policy: unless-stopped
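Put together from the settings above, a compose.yml of roughly this shape would be expected in /root/nginx (the service name is illustrative):

```yaml
services:
  nginx-certbot:
    image: jonasal/nginx-certbot:latest
    network_mode: host
    restart: unless-stopped
    environment:
      - CERTBOT_EMAIL=[email protected]
    env_file:
      - /data/nginx/nginx-certbot.env
    volumes:
      - nginx_secrets:/etc/letsencrypt
      - /data/nginx/user_conf.d:/etc/nginx/user_conf.d

volumes:
  nginx_secrets:
    external: true
```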

Proxy Servers

The Nginx proxy handles incoming traffic and SSL termination for the application.

  • Configuration Location: Custom configurations are stored in /data/nginx/user_conf.d/.
  • Proxy Pass: The Nginx configuration for the specific host includes a location / block that forwards traffic to the Open WebUI container running on the host.
    location / {
        proxy_pass http://127.0.0.1:8080;
    }
    
  • SSL Management: The nginx-certbot container automatically manages SSL certificates using Let's Encrypt.
  • Deployment: The proxy stack is started using the command:
    docker compose up -d
    
    executed from the /root/nginx directory.
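A host configuration in /data/nginx/user_conf.d/ typically wraps the location block shown above in a full server block. The sketch below uses example.com as a placeholder hostname and assumes the certificate layout produced by the nginx-certbot image (certificates under /etc/letsencrypt/live/); the Upgrade/Connection headers are an assumption to support WebSocket traffic from the chat UI:

```nginx
# /data/nginx/user_conf.d/example.com.conf (hostname is a placeholder)
server {
    listen 443 ssl;
    server_name example.com;

    ssl_certificate         /etc/letsencrypt/live/example.com/fullchain.pem;
    ssl_certificate_key     /etc/letsencrypt/live/example.com/privkey.pem;
    ssl_trusted_certificate /etc/letsencrypt/live/example.com/chain.pem;

    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        # WebSocket support (assumed to be needed by Open WebUI)
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
```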

Access Rights and Security

Security and access controls are implemented through system users, service configurations, and firewall considerations.

  • Ollama User: A dedicated system user named ollama is created to run the Ollama service.
  • Service Exposure: The Ollama service is configured to listen on 0.0.0.0, allowing connections from the Docker network and the host.
  • CORS: The OLLAMA_ORIGINS=* environment variable allows requests from any origin, which is necessary for the Open WebUI frontend to communicate with the backend.
  • File Permissions:
    • The /root/nginx directory is owned by root with 0755 permissions.
    • The compose.yml file is owned by root with 0644 permissions.
  • Firewall: Ensure that the server firewall allows inbound traffic on ports 80, 443, and 8080.
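With ufw, the default firewall frontend on Ubuntu, opening these ports would look like the following; skip this step if the server uses a different firewall:

```shell
ufw allow 80/tcp    # HTTP (Let's Encrypt challenges and redirects)
ufw allow 443/tcp   # HTTPS termination at the Nginx proxy
ufw allow 8080/tcp  # Direct access to Open WebUI; omit to force traffic through the proxy
ufw status          # Verify the resulting rule set
```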

Starting, Stopping, and Updating

The services are managed using systemctl for Ollama and docker compose for the proxy stack.

Ollama Service

  • Restart Service:
    systemctl daemon-reload
    systemctl restart ollama
    
  • Enable on Boot: The service is enabled to start automatically on system boot.
    systemctl enable ollama
    

Open WebUI Container

  • Start/Restart: The container is managed via Docker commands. To restart the container:
    docker restart open-webui
    
  • Stop:
    docker stop open-webui
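  • Update: Since the container was created with docker run, a common update approach (a sketch, reusing the parameters from the deployment section above) is to pull the new image and recreate the container; the open-webui named volume preserves the application data across recreation:

```shell
# Fetch the latest image
docker pull ghcr.io/open-webui/open-webui:cuda

# Remove the old container; persistent data lives in the named volume
docker stop open-webui && docker rm open-webui

# Recreate with the original parameters
docker run -d --name open-webui --gpus all -p 8080:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  -e ENV=dev -e OLLAMA_BASE_URLS=http://host.docker.internal:11434 \
  --restart always ghcr.io/open-webui/open-webui:cuda
```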
    

Nginx Proxy Stack

  • Start/Update:
    cd /root/nginx
    docker compose up -d
    
  • Stop:
    cd /root/nginx
    docker compose down
    