Deployment Overview of Qwen3-32B on a Server¶
Prerequisites and Basic Requirements¶
The deployment requires a server running the Ubuntu operating system. The following conditions must be met before proceeding:

- Root privileges or `sudo` access are required to install system packages and manage services.
- Docker must be installed and running on the server to host the web interface and proxy components.
- The server must have internet access to download the Ollama installer, the Qwen3-32B model, and Docker images.
- Port `8080` must be available for the Open WebUI application.
- Ports `80` and `443` are required for the Nginx proxy and SSL certificate management.
- The system must have a GPU with CUDA support to run the `qwen3:32b` model efficiently.
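The prerequisites above can be verified before starting. A hedged sketch of quick checks (the exact port-check command is one option among several):

```shell
# Verify Docker is installed and the daemon is reachable
docker info >/dev/null && echo "docker OK"

# Verify an NVIDIA GPU with CUDA support is visible to the driver
nvidia-smi

# Check that ports 80, 443, and 8080 are not already in use
sudo ss -ltnp | grep -E ':(80|443|8080)\b' || echo "ports free"
```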
File and Directory Structure¶
The application utilizes specific directories for configuration, data storage, and certificates:
- `/root/nginx/`: Contains the Docker Compose configuration for the Nginx proxy and Certbot.
- `/root/nginx/compose.yml`: The Docker Compose file defining the Nginx service.
- `/data/nginx/user_conf.d/`: Stores custom Nginx configuration files, including the host-specific configuration for the application.
- `/data/nginx/nginx-certbot.env`: Environment file containing settings for the Nginx-Certbot container.
- `/etc/systemd/system/ollama.service`: Systemd unit file for the Ollama service.
- `/usr/share/ollama/.ollama/models/`: Default storage location for the downloaded Qwen3-32B model.
- `/var/lib/docker/volumes/open-webui/`: Docker volume storing the persistent data for the Open WebUI application.
Application Installation Process¶
The deployment involves installing the Ollama backend, pulling the specific AI model, and launching the Open WebUI frontend via Docker.
- Install Ollama: The Ollama package is installed using the official installation script.
- Configure Ollama Service: The `ollama.service` file is updated to expose the service on all network interfaces and enable flash attention. The following environment variables are set:
  - `OLLAMA_HOST=0.0.0.0`
  - `OLLAMA_ORIGINS=*`
  - `OLLAMA_FLASH_ATTENTION=1`
- Download the Model: The `qwen3:32b` model is pulled into the local Ollama repository.
- Launch Open WebUI: The Open WebUI container is started with CUDA support, mapping port `8080` and connecting to the local Ollama instance.
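The backend steps can be sketched as follows. This is one way to do it: the document updates `ollama.service` directly, whereas the sketch below uses a systemd drop-in override, which survives package upgrades; the install-script URL is Ollama's published one.

```shell
# 1. Install Ollama with the official installation script
curl -fsSL https://ollama.com/install.sh | sh

# 2. Apply the environment variables via a systemd drop-in
#    (alternative to editing /etc/systemd/system/ollama.service in place)
sudo mkdir -p /etc/systemd/system/ollama.service.d
sudo tee /etc/systemd/system/ollama.service.d/override.conf <<'EOF'
[Service]
Environment="OLLAMA_HOST=0.0.0.0"
Environment="OLLAMA_ORIGINS=*"
Environment="OLLAMA_FLASH_ATTENTION=1"
EOF
sudo systemctl daemon-reload
sudo systemctl restart ollama

# 3. Pull the model into the local repository (large download)
ollama pull qwen3:32b
```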
Docker Containers and Their Deployment¶
Two primary Docker components are deployed: the Open WebUI application and the Nginx proxy with Certbot.
Open WebUI Container¶
The Open WebUI container is deployed using the following `docker run` parameters:

- Image: `ghcr.io/open-webui/open-webui:cuda`
- Container Name: `open-webui`
- Ports: Maps host port `8080` to container port `8080`.
- GPU Access: The `--gpus all` flag enables GPU acceleration.
- Host Resolution: The `--add-host=host.docker.internal:host-gateway` flag allows the container to reach the host machine.
- Volumes: A named volume `open-webui` is mounted to `/app/backend/data` for data persistence.
- Environment Variables:
  - `ENV=dev`
  - `OLLAMA_BASE_URLS=http://host.docker.internal:11434`
- Restart Policy: Set to `always` so the container restarts automatically.
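Assembled from these parameters, the invocation would look roughly like this (flag order is arbitrary; `-d` for detached mode is an assumption):

```shell
docker run -d \
  --name open-webui \
  -p 8080:8080 \
  --gpus all \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  -e ENV=dev \
  -e OLLAMA_BASE_URLS=http://host.docker.internal:11434 \
  --restart always \
  ghcr.io/open-webui/open-webui:cuda
```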
Nginx and Certbot Container¶
The Nginx proxy is managed via Docker Compose located in `/root/nginx/compose.yml`:

- Image: `jonasal/nginx-certbot:latest`
- Network Mode: `host`
- Volumes:
  - `nginx_secrets` (external) mounted to `/etc/letsencrypt`.
  - `/data/nginx/user_conf.d` mounted to `/etc/nginx/user_conf.d`.
- Environment:
  - `[email protected]`
  - Additional settings are loaded from `/data/nginx/nginx-certbot.env`.
- Restart Policy: `unless-stopped`
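Reconstructed from these settings, the compose file might look roughly as follows. This is a sketch, not the actual file: the service name is an assumption, and `CERTBOT_EMAIL` (the variable the `jonasal/nginx-certbot` image documents for the Let's Encrypt contact address) stands in for the redacted e-mail setting.

```yaml
# Sketch of /root/nginx/compose.yml
services:
  nginx-certbot:                  # service name is an assumption
    image: jonasal/nginx-certbot:latest
    network_mode: host
    restart: unless-stopped
    environment:
      # Contact address for Let's Encrypt; placeholder value
      - CERTBOT_EMAIL=admin@example.com
    env_file:
      - /data/nginx/nginx-certbot.env
    volumes:
      - nginx_secrets:/etc/letsencrypt
      - /data/nginx/user_conf.d:/etc/nginx/user_conf.d

volumes:
  nginx_secrets:
    external: true
```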
Proxy Servers¶
The Nginx proxy handles incoming traffic and SSL termination for the application.
- Configuration Location: Custom configurations are stored in `/data/nginx/user_conf.d/`.
- Proxy Pass: The Nginx configuration for the specific host includes a `location /` block that forwards traffic to the Open WebUI container running on the host.
- SSL Management: The `nginx-certbot` container automatically manages SSL certificates using Let's Encrypt.
- Deployment: The proxy stack is started by running Docker Compose from the `/root/nginx` directory.
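A host configuration matching this description might look roughly like the sketch below. The `server_name` is a placeholder, and the WebSocket headers are an assumption based on Open WebUI's browser interface; the certificate paths follow the `/etc/letsencrypt/live/` convention the `nginx-certbot` image expects.

```nginx
# Sketch of a file in /data/nginx/user_conf.d/ (e.g. openwebui.conf)
server {
    listen 443 ssl;
    server_name chat.example.com;   # placeholder hostname

    # Certificates are issued and renewed by the nginx-certbot container
    ssl_certificate     /etc/letsencrypt/live/chat.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/chat.example.com/privkey.pem;

    location / {
        # Open WebUI listens on host port 8080; host network mode
        # lets nginx reach it via localhost
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;

        # WebSocket upgrade headers (assumed requirement)
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
```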
Access Rights and Security¶
Security and access controls are implemented through system users, service configurations, and firewall considerations.
- Ollama User: A dedicated system user named `ollama` is created to run the Ollama service.
- Service Exposure: The Ollama service is configured to listen on `0.0.0.0`, allowing connections from the Docker network and the host.
- CORS: The `OLLAMA_ORIGINS=*` environment variable allows requests from any origin, which is necessary for the Open WebUI frontend to communicate with the backend.
- File Permissions:
  - The `/root/nginx` directory is owned by `root` with `0755` permissions.
  - The `compose.yml` file is owned by `root` with `0644` permissions.
- Firewall: Ensure that the server firewall allows inbound traffic on ports `80`, `443`, and `8080`.
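On Ubuntu, UFW is a common firewall front end; assuming UFW is in use, the rules could be opened like this:

```shell
# Allow the ports used by the proxy and the web interface
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp
sudo ufw allow 8080/tcp

# Review the resulting rule set
sudo ufw status
```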
Starting, Stopping, and Updating¶
The services are managed using `systemctl` for Ollama and `docker compose` for the proxy stack.
Ollama Service¶
- Restart Service: The service is restarted via `systemctl` after configuration changes.
- Enable on Boot: The service is enabled to start automatically on system boot.
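Assuming the default `ollama.service` unit name, the corresponding commands would be:

```shell
# Pick up any edits to the unit file, then restart the service
sudo systemctl daemon-reload
sudo systemctl restart ollama

# Start automatically on boot
sudo systemctl enable ollama

# Confirm the service is running
systemctl status ollama --no-pager
```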
Open WebUI Container¶
- Start/Restart: The container is managed via standard Docker commands.
- Stop: The container is stopped with the Docker CLI before updates or troubleshooting.
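Assuming the container name `open-webui` used during installation, the commands would be:

```shell
# Restart the running container
docker restart open-webui

# Stop the container
docker stop open-webui

# Update: pull the newer image, remove the old container, then repeat
# the `docker run` invocation from the installation section
docker pull ghcr.io/open-webui/open-webui:cuda
docker rm -f open-webui
```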
Nginx Proxy Stack¶
- Start/Update: The stack is started, or updated in place, with Docker Compose from the `/root/nginx` directory.
- Stop: The stack is brought down with Docker Compose from the same directory.
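A sketch of the Compose commands, run from the directory containing `compose.yml`:

```shell
cd /root/nginx

# Start the stack, or apply a changed compose.yml / newer image
docker compose pull
docker compose up -d

# Stop and remove the stack's containers
docker compose down
```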