Course Outline
Introduction to Apache Airflow
- What is workflow orchestration
- Key features and benefits of Apache Airflow
- Airflow 2.x improvements and ecosystem overview
Architecture and Core Concepts
- Scheduler, web server, and worker processes
- DAGs, tasks, and operators
- Executors and backends (Local, Celery, Kubernetes)
Installation and Setup
- Installing Airflow in local and cloud environments
- Configuring Airflow with different executors
- Setting up metadata databases and connections
Navigating the Airflow UI and CLI
- Exploring the Airflow web interface
- Monitoring DAG runs, tasks, and logs
- Using the Airflow CLI for administration
Authoring and Managing DAGs
- Creating DAGs with the TaskFlow API
- Using operators, sensors, and hooks
- Managing dependencies and scheduling intervals
Integrating Airflow with Data and Cloud Services
- Connecting to databases, APIs, and message queues
- Running ETL pipelines with Airflow
- Cloud integrations: AWS, GCP, Azure operators
Monitoring and Observability
- Task logs and real-time monitoring
- Metrics with Prometheus and Grafana
- Alerting and notifications with email or Slack
Securing Apache Airflow
- Role-based access control (RBAC)
- Authentication with LDAP, OAuth, and SSO
- Secrets management with Vault and cloud secret stores
Scaling Apache Airflow
- Parallelism, concurrency, and task queues
- Using CeleryExecutor and KubernetesExecutor
- Deploying Airflow on Kubernetes with Helm
Best Practices for Production
- Version control and CI/CD for DAGs
- Testing and debugging DAGs
- Maintaining reliability and performance at scale
Troubleshooting and Optimization
- Debugging failed DAGs and tasks
- Optimizing DAG performance
- Common pitfalls and how to avoid them
Summary and Next Steps
Requirements
- Experience with Python programming
- Familiarity with data engineering or DevOps concepts
- Understanding of ETL or workflow orchestration
Audience
- Data scientists
- Data engineers
- DevOps and infrastructure engineers
- Software developers
Custom Corporate Training
Training solutions designed exclusively for businesses.
- Customized Content: We adapt the syllabus and practical exercises to the real goals and needs of your project.
- Flexible Schedule: Dates and times adapted to your team's agenda.
- Format: Online (live), In-company (at your offices), or Hybrid.
Price per private group, online live training, starting from 4800 € + VAT*
Contact us for an exact quote and to hear about our latest promotions.
Testimonials
The instructor adapted the training to the participants’ level and responded to all questions. He was very communicative, and it was easy to interact with him. I really appreciated the format of the training, which included many practical exercises. Overall, it was a very engaging and well-organized session.
Jacek Chlopik - ZAKLAD UBEZPIECZEN SPOLECZNYCH
Course - Apache Airflow: Building and Managing Data Pipelines
The training was spot on. Very useful theory and exercises.
Vladimir - PUBLIC COURSE
Course - Apache Airflow
The training was spot on in all aspects. Useful theoretical aspects and exercises.
Vladimir - PUBLIC COURSE
Course - Apache Airflow