Integrate External Services with Apache Airflow
This course will teach you how to connect Apache Airflow with Google Cloud Storage, Google Cloud BigQuery, and AWS S3, enabling data management and processing across cloud platforms.
What you'll learn
Apache Airflow provides built-in operators to connect to external cloud services. In this course, Integrate External Services with Apache Airflow, you'll gain the ability to connect and manage various external services within your Airflow workflows. You'll connect your Airflow workflow to Google Cloud Storage and Cloud BigQuery as well as to AWS S3, enabling data management and processing across cloud platforms. First, you'll explore how to set up permissions on Google Cloud using a service account so that Airflow can access Google Cloud Storage buckets and BigQuery datasets. Next, you'll discover how to read files from your local machine and upload them to Google Cloud Storage, create a BigQuery dataset, and create an external table in BigQuery. Then, you'll integrate your workflow with AWS S3 buckets to transfer data from S3 to Google Cloud Storage. Finally, you'll use hooks to work with S3, Cloud Storage, and BigQuery; hooks give you much more control over how you access data from external services. When you're finished with this course, you'll have the skills and knowledge of Apache Airflow needed to integrate and manage external services effectively within your data workflows.
Table of contents
- Introduction and Version Check 1m
- Demo: Setting up a Service Account on Google Cloud 4m
- Demo: Configuring a Connection to Google Cloud 1m
- Demo: Creating a Google Cloud Storage Bucket and Uploading Data 2m
- Demo: Creating a BigQuery Dataset and External Table 4m
- Demo: Running the DAG to Load Data to GCS and Query a BigQuery Table 2m
- Demo: Creating an AWS User 3m
- Demo: Setting up a Connection to AWS 2m
- Demo: Accessing AWS and Google Cloud Services Using Hooks 5m