Modernizing Data Lakes and Data Warehouses with Google Cloud
The two key components of any data pipeline are data lakes and warehouses.
What you'll learn
This course highlights use cases for data lakes and data warehouses and covers the solutions available for each on Google Cloud in technical detail. It also describes the role of a data engineer, explains the benefits of a successful data pipeline to business operations, and examines why data engineering should be done in a cloud environment. This is the first course of the Data Engineering on Google Cloud series. After completing this course, enroll in the Building Batch Data Pipelines on Google Cloud course.
Table of contents
- Module introduction 2m
- The role of a data engineer 4m
- Data engineering challenges 5m
- Introduction to BigQuery 2m
- Data lakes and data warehouses 5m
- Transactional databases versus data warehouses 5m
- Partner effectively with other data teams 5m
- Manage data access and governance 2m
- Demo: Finding PII in your dataset with the DLP API 2m
- Build production-ready pipelines 2m
- Google Cloud customer case study 1m
- Recap 1m
- Lab Intro: Using BigQuery to do Analysis 0m
- Getting Started with GCP and Qwiklabs 4m
- Lab: Using BigQuery to do Analysis 0m
- Module Introduction 1m
- Introduction to data lakes 9m
- Data storage and ETL options on Google Cloud 5m
- Build a data lake using Cloud Storage 9m
- Secure Cloud Storage 6m
- Store all sorts of data types 5m
- Cloud SQL as a relational data lake 7m
- Lab Intro: Loading Taxi Data into Google Cloud SQL 1m
- Lab: Loading Taxi Data into Google Cloud SQL 2.5 0m
- Module Introduction 1m
- The modern data warehouse 4m
- Introduction to BigQuery 6m
- Demo: Querying TB of data in seconds 7m
- Get started with BigQuery 11m
- Load data into BigQuery 12m
- Lab Intro: Loading Data into BigQuery 0m
- Lab: Loading data into BigQuery 0m
- Explore schemas 0m
- Demo: Exploring Schemas 10m
- Schema design 3m
- Nested and repeated fields 9m
- Demo: Nested and repeated fields 16m
- Design the optimal schema for BigQuery 1m
- Lab Intro: Working with JSON and Array data in BigQuery 0m
- Lab: Working with JSON and Array data in BigQuery 2.5 0m
- Optimize with partitioning and clustering 7m
- Review 2m