Deployment Overview of Apache Airflow on Server

Prerequisites and Basic Requirements

The server must meet the following operating system and privilege requirements to successfully host the application:

  • Operating System: Debian 12 (Bookworm), Debian 11 (Bullseye), or Ubuntu.

  • Privileges: Root access or sudo privileges are required for package installation and service management.

  • Domain: The server must be configured to resolve the hostkey.in zone.

  • Ports:

      • Port 8080 for the internal Airflow webserver.

      • Port 443 for the external HTTPS connection via the proxy.
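
The operating system requirement can be checked before installation. The following is a minimal sketch, assuming the supported targets are Debian 11 (Bullseye), Debian 12 (Bookworm), and Ubuntu, and that the host exposes the standard /etc/os-release file. The function names parse_os_release, is_supported, and check_host are illustrative and not part of the deployment.

```python
def parse_os_release(text: str) -> dict:
    """Parse the KEY=value format used by /etc/os-release into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key] = value.strip().strip('"').strip("'")
    return info


def is_supported(info: dict) -> bool:
    """Return True for Debian bullseye/bookworm or any Ubuntu release.

    Assumption: this is the intended reading of the OS requirement.
    """
    distro = info.get("ID", "")
    codename = info.get("VERSION_CODENAME", "")
    if distro == "debian":
        return codename in {"bullseye", "bookworm"}
    return distro == "ubuntu"


def check_host(path: str = "/etc/os-release") -> bool:
    """Read the os-release file of the current host and evaluate it."""
    with open(path, encoding="utf-8") as fh:
        return is_supported(parse_os_release(fh.read()))
```

Calling check_host() on the target server returns True when the distribution matches the list above.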

FQDN of the Final Panel

The application is accessible via the following Fully Qualified Domain Name (FQDN) format: airflow<Server ID>.hostkey.in:443
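
As an illustration of the naming scheme, the panel address can be derived from the server ID. This is a small sketch; the function names panel_fqdn and panel_url and the sample ID 12345 are hypothetical.

```python
def panel_fqdn(server_id: str, zone: str = "hostkey.in") -> str:
    """Build the host name in the form airflow<Server ID>.hostkey.in."""
    return f"airflow{server_id}.{zone}"


def panel_url(server_id: str, port: int = 443) -> str:
    """Build the HTTPS URL of the panel, including the explicit port."""
    return f"https://{panel_fqdn(server_id)}:{port}"


# Example with a made-up server ID.
print(panel_url("12345"))
```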

File and Directory Structure

The deployment utilizes the following directories for configuration, data, and certificates:

  • /root/nginx: Contains the Docker Compose configuration for the Nginx proxy and Certbot.

  • /root/nginx/compose.yml: The Docker Compose file defining the Nginx service.

  • /data/nginx/nginx-certbot.env: Environment file containing proxy configuration variables.

  • /data/nginx/user_conf.d: Directory for custom Nginx configuration files.

  • /etc/letsencrypt: Volume mount for SSL certificates managed by Certbot.

  • /etc/systemd/system/airflow-webserver.service: Systemd unit file for the Airflow webserver.

  • /etc/systemd/system/airflow-scheduler.service: Systemd unit file for the Airflow scheduler.

  • /opt/pipx: Home directory for the pipx package manager (Debian Bookworm).

  • /usr/local/bin: Location for the airflow binary and pipx binaries.
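
The layout above can be verified after deployment with a short script. This is a sketch only: the helper missing_paths and its root parameter (used to test against a different filesystem root) are assumptions, and /opt/pipx is expected to exist only on Debian 12 (Bookworm).

```python
from pathlib import Path

# Paths taken from the directory list above.
EXPECTED_PATHS = [
    "/root/nginx",
    "/root/nginx/compose.yml",
    "/data/nginx/nginx-certbot.env",
    "/data/nginx/user_conf.d",
    "/etc/systemd/system/airflow-webserver.service",
    "/etc/systemd/system/airflow-scheduler.service",
    "/opt/pipx",  # present only when pipx is used (Debian Bookworm)
    "/usr/local/bin",
]


def missing_paths(paths=EXPECTED_PATHS, root="/"):
    """Return the subset of paths that do not exist under the given root."""
    base = Path(root)
    return [p for p in paths if not (base / p.lstrip("/")).exists()]
```

Running missing_paths() as root on the server returns an empty list when every expected file and directory is present.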

Application Installation Process

The application is installed using Python package managers tailored to the specific distribution version:

  • Apache Airflow Version: 2.10.1

  • Python Version: 3.10

  • Installation Method:

      • On Debian 12 (Bookworm), the application is installed via pipx with the celery extra.

      • On Ubuntu or Debian 11 (Bullseye), the application is installed via pip with the celery extra, using constraints for Python 3.10.

  • Database Initialization: The Airflow database is initialized using the airflow db migrate command.

  • User Creation: An administrative user is created with the following attributes:

      • Username: admin

      • Role: Admin

      • Email: [email protected]
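
The installation steps can be expressed as the commands they are likely to correspond to. This is a sketch under assumptions, not the actual provisioning script: the constraints URL follows the pattern published by the Airflow project, and the first name, last name, email, and password passed to airflow users create are placeholders that this document does not specify.

```python
AIRFLOW_VERSION = "2.10.1"
PYTHON_VERSION = "3.10"


def constraints_url(airflow=AIRFLOW_VERSION, python=PYTHON_VERSION):
    """Constraints file published by the Airflow project for a given release."""
    return (
        "https://raw.githubusercontent.com/apache/airflow/"
        f"constraints-{airflow}/constraints-{python}.txt"
    )


def install_command(codename: str) -> list:
    """pipx on Debian Bookworm, pip with constraints on other releases."""
    package = f"apache-airflow[celery]=={AIRFLOW_VERSION}"
    if codename == "bookworm":
        return ["pipx", "install", package]
    return ["pip", "install", package, "--constraint", constraints_url()]


def bootstrap_commands(email: str, password: str) -> list:
    """Database initialization followed by creation of the admin user."""
    return [
        ["airflow", "db", "migrate"],
        [
            "airflow", "users", "create",
            "--username", "admin",
            "--role", "Admin",
            "--email", email,          # placeholder, supplied by the caller
            "--firstname", "Admin",    # placeholder value
            "--lastname", "User",      # placeholder value
            "--password", password,    # placeholder, supplied by the caller
        ],
    ]
```

Each returned list can be passed to subprocess.run(cmd, check=True) or printed for review before execution.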

Docker Containers and Their Deployment

The reverse proxy and SSL termination are handled by Docker containers managed via Docker Compose:

  • Container Image: jonasal/nginx-certbot:latest

  • Deployment Script: The container is started using docker compose up -d from the /root/nginx directory.

  • Network Mode: The container runs in host network mode.

  • Volumes:

      • nginx_secrets: External volume mounted at /etc/letsencrypt for SSL certificates.

      • /data/nginx/user_conf.d: Mounted at /etc/nginx/user_conf.d for custom configurations.

  • Environment:

      • CERTBOT_EMAIL: Set to [email protected].
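
The settings above suggest a Compose definition of roughly the following shape. This is a reconstruction under assumptions, not the contents of /root/nginx/compose.yml: the service name nginx and the use of env_file for /data/nginx/nginx-certbot.env are guesses, and the email address is supplied by the caller. The definition is built as a Python dictionary and serialized as JSON, which YAML parsers generally accept.

```python
import json


def compose_definition(certbot_email: str) -> dict:
    """Assemble a Compose-style definition from the documented settings."""
    return {
        "services": {
            "nginx": {  # service name is an assumption
                "image": "jonasal/nginx-certbot:latest",
                "network_mode": "host",
                "env_file": ["/data/nginx/nginx-certbot.env"],  # assumption
                "environment": {"CERTBOT_EMAIL": certbot_email},
                "volumes": [
                    "nginx_secrets:/etc/letsencrypt",
                    "/data/nginx/user_conf.d:/etc/nginx/user_conf.d",
                ],
            }
        },
        # The certificate volume is declared as external, as described above.
        "volumes": {"nginx_secrets": {"external": True}},
    }


def render(certbot_email: str) -> str:
    """Serialize the definition as indented JSON for review."""
    return json.dumps(compose_definition(certbot_email), indent=2)
```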

Proxy Servers

Nginx is configured as the reverse proxy to handle external traffic and SSL termination:

  • External Port: 443 (HTTPS).

  • Internal Port: 8080 (Airflow Webserver).

  • Path Mapping: The root path / on the external interface maps to the internal Airflow service.

  • SSL Management: Certbot is integrated within the Nginx container to manage SSL certificates automatically.

  • Configuration Location: The proxy configuration is defined in /root/nginx/compose.yml and utilizes environment variables from /data/nginx/nginx-certbot.env.
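
A server block placed in /data/nginx/user_conf.d could implement this mapping. The sketch below renders one possible configuration as text; it is an assumption about the shape of the file, not the deployed configuration. The certificate paths follow the /etc/letsencrypt/live/<name>/ layout used by Certbot, the upstream address 127.0.0.1 assumes the webserver listens locally, and the function name render_server_block is illustrative.

```python
def render_server_block(fqdn: str, upstream_port: int = 8080) -> str:
    """Render an Nginx server block that proxies / to the Airflow webserver."""
    return f"""server {{
    listen 443 ssl;
    server_name {fqdn};

    # Certificate files are expected under the Certbot live directory.
    ssl_certificate     /etc/letsencrypt/live/{fqdn}/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/{fqdn}/privkey.pem;

    location / {{
        proxy_pass http://127.0.0.1:{upstream_port};
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-Proto $scheme;
    }}
}}
"""
```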

Starting, Stopping, and Updating

The Airflow services are managed as native systemd services, while the proxy is managed via Docker Compose.

Airflow Services:

  • Start: systemctl start airflow-webserver and systemctl start airflow-scheduler.

  • Stop: systemctl stop airflow-webserver and systemctl stop airflow-scheduler.

  • Enable on Boot: systemctl enable airflow-webserver and systemctl enable airflow-scheduler.

  • Reload Daemon: systemctl daemon-reload (required after modifying service files).

  • Status Check: systemctl status airflow-webserver and systemctl status airflow-scheduler.

Proxy Service:

  • Start: docker compose up -d (executed from /root/nginx).

  • Stop: docker compose down (executed from /root/nginx).

  • Restart: docker compose restart (executed from /root/nginx).
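
The commands above can be grouped in a small helper for repeated use. This is a sketch; the function names and the dry_run default are assumptions, and actual execution requires root privileges on the server.

```python
import subprocess

AIRFLOW_UNITS = ("airflow-webserver", "airflow-scheduler")
PROXY_DIR = "/root/nginx"


def systemctl_commands(action: str, units=AIRFLOW_UNITS) -> list:
    """One systemctl invocation per Airflow unit, e.g. start, stop, status."""
    return [["systemctl", action, unit] for unit in units]


def compose_command(action: str) -> list:
    """Map start/stop/restart to the matching docker compose invocation."""
    args = {"start": ["up", "-d"], "stop": ["down"], "restart": ["restart"]}[action]
    return ["docker", "compose", *args]


def run_all(commands, cwd=None, dry_run=True):
    """Print the commands, or execute them when dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, cwd=cwd, check=True)
```

For example, run_all(systemctl_commands("start")) prints the two start commands, and run_all([compose_command("restart")], cwd=PROXY_DIR, dry_run=False) restarts the proxy from /root/nginx.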

Available Ports for Connection

The following ports are configured for application access:

  • Port 443: HTTPS traffic routed through the Nginx proxy to the Airflow webserver.

  • Port 8080: Internal Airflow webserver port (accessible locally or via proxy routing).
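
Reachability of both ports can be tested with a plain TCP connection attempt. This is a minimal sketch; the function name port_open and the host name in the usage note are hypothetical.

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

On the server itself, port_open("127.0.0.1", 8080) checks the Airflow webserver, while port_open("airflow12345.hostkey.in", 443) from a remote machine checks the proxy (the server ID here is made up).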

Permission Settings

The deployment applies the following ownership, mode, and runtime settings:

  • /root/nginx: Owned by root:root with mode 0644.

  • /root/nginx/compose.yml: Owned by root:root with mode 0644.

  • Airflow services run as the root user as defined in the systemd unit files.

  • The PATH environment variable for services includes /usr/local/bin, /usr/bin, /bin, /usr/local/games, /usr/games, and /root/.local/bin.
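
The ownership and mode values above can be audited with the standard library. This is a sketch; the helper names are illustrative, and the defaults (UID 0, GID 0, mode 0644) reflect the root:root settings listed in this section.

```python
import os
import stat


def ownership_and_mode(path: str):
    """Return (uid, gid, permission bits) for the given path."""
    st = os.stat(path)
    return st.st_uid, st.st_gid, stat.S_IMODE(st.st_mode)


def matches(path: str, uid: int = 0, gid: int = 0, mode: int = 0o644) -> bool:
    """Compare a path against the expected owner, group, and mode."""
    return ownership_and_mode(path) == (uid, gid, mode)
```

For example, matches("/root/nginx/compose.yml") returns True when the file is owned by root:root with mode 0644.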
