Apache Airflow

Information

Apache Airflow is a powerful and flexible platform for developing, scheduling, and monitoring data pipeline tasks. It can be used in a wide range of applications: launching scripts for data collection, transformation, and loading from various sources, scheduling email campaigns, automating testing, and more.

Apache Airflow Features

  • Airflow uses Python to define workflows, making them transparent, easily customizable, and reproducible;
  • Thanks to its open API and a wide range of operators, Airflow can integrate with numerous technologies and tools;
  • The Airflow web interface provides an interactive overview of workflow status, allowing you to track task execution and manage them easily;
  • Airflow's built-in scheduler enables launching tasks at a specific time or with a defined periodicity (e.g., every hour, every day);
  • Airflow automatically manages dependencies between tasks, ensuring that work is performed in the correct order;
  • Airflow allows breaking down large tasks into smaller, manageable modules, simplifying development and debugging;
  • Parallel task execution and support for distributed computing accelerate the processing of large data volumes;
  • Airflow can automatically retry failed tasks (the number of retries is configurable per task), which makes workflows more resilient to transient errors;
  • Airflow automates routine tasks, freeing up developers' time for more important assignments.
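
The features above (workflows defined in Python, scheduled runs, task dependencies, and retries) can be illustrated with a minimal DAG definition file. This is only a sketch: it assumes an Airflow 2.x release from 2.4 onward (where the `schedule` argument is available), and the DAG name, task names, schedule, and echo commands are made up for illustration. Import paths may differ in other Airflow versions.

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator

# Retry settings applied to every task in the DAG (illustrative values).
default_args = {
    "retries": 2,                         # retry a failed task up to two times
    "retry_delay": timedelta(minutes=5),  # wait five minutes between attempts
}

with DAG(
    dag_id="example_etl",             # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule="@hourly",               # run once an hour
    catchup=False,                    # do not backfill runs since start_date
    default_args=default_args,
) as dag:
    # Placeholder commands; real tasks would call actual scripts.
    extract = BashOperator(task_id="extract", bash_command="echo 'collect data'")
    transform = BashOperator(task_id="transform", bash_command="echo 'transform data'")
    load = BashOperator(task_id="load", bash_command="echo 'load data'")

    # Dependencies: transform runs after extract, load runs after transform.
    extract >> transform >> load
```

A file like this is picked up by the scheduler once it is placed in the DAGs folder configured for the installation; the exact folder depends on the `dags_folder` setting.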

Deployment Features

  • Supported operating systems: Ubuntu 22.04, Debian 11, Debian 12;
  • Access to the control panel: https://airflow{Server_ID_from_Invapi}.hostkey.in;
  • Installing the panel together with the OS takes about 15 minutes.
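
As a small illustration of the control panel address pattern above, the following sketch substitutes a server ID into the URL template. The function name `panel_url` and the ID `12345` are hypothetical; the real ID is the one shown for the server in Invapi.

```python
def panel_url(server_id):
    """Build the Airflow control panel URL for a given Invapi server ID."""
    server_id = str(server_id).strip()
    if not server_id:
        raise ValueError("server_id must not be empty")
    # Template from the deployment notes: https://airflow{Server_ID_from_Invapi}.hostkey.in
    return f"https://airflow{server_id}.hostkey.in"


if __name__ == "__main__":
    # 12345 is a made-up server ID used only for illustration.
    print(panel_url(12345))
```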

Getting Started After Deploying Apache Airflow

After you pay for the order, a notification that the server is ready is sent to the email address you registered at signup. It contains the VPS IP address and the login credentials for connecting. Our company's clients manage equipment through the server control panel and the API, Invapi.

The authentication data can be found in the Info >> Tags tab of the server management panel or in the email sent when the server is ready:

  • Link to access the Apache Airflow web interface control panel: in the webpanel tag;
  • Login: admin;
  • Password: sent in an email after the server is ready for use.

Authentication

The following parameters are set by default for the Admin user:

The command line interface is accessible via the airflow command.

On Debian 12, Airflow is installed in a virtual environment, which can be activated with the command:

source /root/.local/pipx/venvs/apache-airflow/bin/activate

After this, the CLI will also be accessible via the airflow command.
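
For scripts that need to call the CLI without activating the environment first, one possible approach is to look for the executable directly. The sketch below assumes the standard virtual environment layout, in which the `airflow` executable sits in the same `bin` directory as the `activate` script shown above; the helper name `find_airflow_cli` is made up for this example.

```python
import os
import shutil
import subprocess

# Assumed location of the executable inside the Debian 12 virtual environment
# (same bin directory as the "activate" script).
VENV_AIRFLOW = "/root/.local/pipx/venvs/apache-airflow/bin/airflow"


def find_airflow_cli():
    """Return a path to the airflow executable, or None if it is not found."""
    on_path = shutil.which("airflow")
    if on_path:
        return on_path
    if os.path.isfile(VENV_AIRFLOW) and os.access(VENV_AIRFLOW, os.X_OK):
        return VENV_AIRFLOW
    return None


if __name__ == "__main__":
    cli = find_airflow_cli()
    if cli is None:
        print("airflow executable not found")
    else:
        # "airflow version" prints the installed Airflow version.
        subprocess.run([cli, "version"])
```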

Note

Detailed information about Apache Airflow's main settings can be found in the developers' documentation.

Ordering a server with Apache Airflow using the API

To install this software using the API, follow these instructions.