Overview
This template provides a production‑ready Apache Airflow instance as a Monk runnable. You can:
- Run it directly to get a managed workflow orchestration platform
- Inherit it in your own data engineering infrastructure to schedule and monitor workflows
What this template manages
- Airflow webserver and scheduler
- PostgreSQL metadata database
- Redis for Celery executor
- Worker nodes for task execution
- Triggerer for deferrable operators
- Web UI on port 8080
- DAG management and execution
Quick start (run directly)
- Load templates (example commands after this list)
- Run the Airflow stack
- Customize credentials (recommended via inheritance). This template defines its credentials as plain `variables`; secrets added with `monk secrets add` will not affect this runnable unless you inherit it and reference those secrets.
- Preferred: inherit and replace variables with `secret("...")` as shown below.
- Alternative: fork/clone and edit the `variables` in `stack.yml`, then `monk load MANIFEST` and run.
- Open the web UI at http://localhost:8080.
- Default credentials: `airflow` / `airflow` (change these immediately in production!)
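A minimal sketch of the steps above, assuming the manifest is a local `stack.yml` and the stack loads under an `airflow` namespace (check `monk list` for the actual names in your setup):

```bash
# Load the template definitions from the manifest (filename is an assumption)
monk load stack.yml

# Start the whole Airflow stack (namespace/name are assumptions)
monk run airflow/stack

# Then open http://localhost:8080 and log in with airflow / airflow
```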
Configuration
Key variables you can customize in this template are defined under `variables` in `stack.yml` (for example, `airflow_password`, referenced under Troubleshooting below). Data is persisted under `${monk-volume-path}/airflow` on the host.
Use by inheritance (recommended for data pipelines)
Inherit the Airflow stack in your application for workflow orchestration. Example:
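A minimal sketch, assuming the template loads as `airflow/stack` and exposes an `airflow_password` variable; the namespace, template path, and secret name here are assumptions to adapt to your manifest:

```yaml
namespace: my-data-platform

pipelines:
  defines: process-group
  inherits: airflow/stack                          # assumed template path
  variables:
    airflow_password:
      type: string
      value: <- secret("airflow-admin-password")   # secret name is an assumption
```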
Ports and connectivity
- Service: `webserver` on TCP port `8080`
- From other runnables in the same process group, use `connection-hostname("<connection-name>")` to resolve the Airflow host, as in the sketch below.
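For example, a sibling runnable might resolve the Airflow host like this (a sketch only; the connection and runnable names are assumptions, and the `connections` layout should be checked against the Monk docs):

```yaml
my-app:
  defines: runnable
  connections:
    airflow-ui:                      # connection name is arbitrary
      runnable: airflow/webserver    # assumed runnable path
      service: webserver
  variables:
    airflow_host:
      type: string
      value: <- connection-hostname("airflow-ui")
```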
Persistence and configuration
- DAGs: `${monk-volume-path}/airflow/dags:/opt/airflow/dags`
- Logs: `${monk-volume-path}/airflow/logs:/opt/airflow/logs`
- Plugins: `${monk-volume-path}/airflow/plugins:/opt/airflow/plugins`
- You can drop DAG files into the `dags` path to deploy workflows (see the example after this list).
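Deploying a DAG is just a file copy into the mounted host path; the directory shown is a placeholder for your actual `${monk-volume-path}`:

```bash
# Replace /path/to/monk-volume with your actual ${monk-volume-path};
# the scheduler picks up new files on its next filesystem scan.
cp my_dag.py /path/to/monk-volume/airflow/dags/
```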
Features
- DAG-based Workflows: Define workflows as Python code
- Rich Scheduling: Cron-based, interval-based, and event-based triggers
- Monitoring: Web UI with DAG visualization and execution history
- Extensible: 200+ operators and sensors (Spark, Kubernetes, AWS, GCP, etc.)
- Task Dependencies: Complex task graphs with branching and conditions
- Retry Logic: Automatic retry with exponential backoff
- SLA Monitoring: Track and alert on SLA violations
- Connection Management: Secure credential storage
- CeleryExecutor: Distributed task execution across worker nodes
Creating DAGs
Example DAG:
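A minimal sketch of a daily DAG with two dependent tasks; the DAG id and commands are placeholders:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="example_hello",            # placeholder name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",                 # Airflow 2.4+; older versions use schedule_interval
    catchup=False,                     # skip backfilling runs before deployment
) as dag:
    extract = BashOperator(
        task_id="extract",
        bash_command="echo 'extracting...'",
    )
    load = BashOperator(
        task_id="load",
        bash_command="echo 'loading...'",
    )

    # Declare the dependency: extract must succeed before load runs
    extract >> load
```

Save the file into the mounted `dags` directory (see Persistence above) and it appears in the web UI after the scheduler's next scan.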
Use cases
Airflow excels at:
- ETL/ELT pipelines
- Data warehouse management
- ML model training pipelines
- Report generation
- Data quality checks
- Multi-cloud orchestration
Related templates
- See other templates in this repository for complementary services
- Combine with monitoring tools for observability
- Integrate with your application stack as needed
Troubleshooting
- If you changed `airflow_password` after initial setup, you may need to reset data volumes or update the user inside Airflow.
- Ensure PostgreSQL and Redis are running before starting Airflow components.
- Generate a Fernet key for encrypting secrets (command below)
- Check component logs (command below)
- For task failures, check task logs in the Airflow UI
- Monitor Celery worker health via Flower, if it is enabled in the stack; the core Airflow web UI does not list workers.
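Sketches for the two commands referenced above; the runnable path passed to `monk logs` is an assumption to adapt to your namespace:

```bash
# Generate a Fernet key (requires the `cryptography` Python package)
python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())"

# Inspect a component's logs (runnable path is an assumption;
# add -f to follow, if supported by your monk version)
monk logs airflow/scheduler
```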