Data Engineering with AWS Machine Learning
Machine learning revolves around data. This course will teach you how to choose among the various AWS data repositories, ingestion services, and transformation services in a cost-effective way that follows best practices.
What you'll learn
Storing data for machine learning is challenging because data comes in varying formats and with varying characteristics. Raw ingested data must first be transformed into the format required for downstream machine learning consumption, and once the data is ready to be used, it must be ingested from storage into the machine learning service. In this course, Data Engineering with AWS Machine Learning, you’ll learn to choose the right AWS service for each of these data-related machine learning (ML) tasks in any given scenario. First, you’ll explore the wide variety of data storage solutions available on AWS and what each type of storage is used for. Next, you’ll discover the different AWS services used to ingest data into ML-specific services and when to use each one. Finally, you’ll learn how to transform your raw data into the proper formats used by the various AWS ML services. When you’re finished with this course, you’ll have the skills and knowledge to store, prepare, and ingest data as part of architecting data engineering solutions on AWS for machine learning, and you’ll be prepared to take the AWS Machine Learning certification exam.
Table of contents
- Important Data Characteristics to Consider in a Machine Learning Solution 2m
- Choosing an AWS Data Repository Based on Structured, Semi-structured, and Unstructured Data Characteristics 2m
- Choosing AWS Data Ingestion and Data Processing Services Based on Batch and Stream Processing Characteristics 1m
- Refining What Data Store to Use Based on Application Characteristics 2m
- Module Summary for the ML Exam and Segue into Next Topics 1m
- Module Overview and Your Job as a Data Engineer 2m
- The Almighty and Ubiquitous Amazon S3! 2m
- Partitioning Data in Amazon S3 and Demo: Creating a Partitioning Scheme in Amazon S3 3m
- Amazon S3 Storage Classes and Lifecycle Rules and Demo: Creating Amazon S3 Lifecycle Rules 3m
- Amazon S3 Security Encryption and Policies and Demo: Setting Up Encryption and Policies in Amazon S3 4m
- The Important Security Feature of VPC Endpoints for Amazon S3 2m
- Amazon EFS for Machine Learning 2m
- Amazon EBS for Machine Learning 1m
- Cost Effectiveness of Common Storage Options on AWS 2m
- Module Summary for the ML Exam and Segue into Next Topics 1m
- Module Overview and Amazon S3 Review 2m
- Amazon Relational Database Service (RDS) for AWS ML 1m
- Amazon Aurora for AWS ML 2m
- The Almighty Amazon DynamoDB for AWS ML 2m
- Amazon Redshift MPP Data Warehouse for AWS ML 2m
- Amazon DocumentDB with MongoDB Compatibility for AWS ML 2m
- Amazon Database Pricing Summary 2m
- Module Summary for the ML Exam and Segue into Next Topics 1m
- Module Overview and Data Warehouses vs. Data Lakes 3m
- Amazon S3 Data Lakes: Building Big Data Storage Solutions for Maximum Flexibility 3m
- Amazon S3 Data Lakes: Immutable Logs and Materialized Views 4m
- The Amazing Amazon Lake Formation! 3m
- Amazon Redshift Data Warehouse 1m
- Amazon Redshift's Data Lake Export and Federated Query 1m
- When to Use a Data Warehouse vs. a Data Lake 1m
- Module Summary for the ML Exam and Segue into Next Topics 1m
- Module Overview and Stream Ingestion Characteristics 2m
- The Amazon Kinesis Family of Services 2m
- Amazon Kinesis Data Streams 2m
- Amazon Kinesis Data Firehose and Demo: Using Amazon Kinesis Data Firehose for Data Ingestion 5m
- Amazon Kinesis Data Streams vs. Amazon Kinesis Data Firehose 1m
- Amazon Kinesis Data Analytics and Demo: Using Amazon Kinesis Data Analytics on Streaming Data 8m
- Amazon Kinesis Video Streams 2m
- Module Summary for the ML Exam and Segue into Next Topics 1m
- Preparing Raw Data for Consumption for AWS Machine Learning 3m
- Data Engineering 101: Data Normalization for AWS Machine Learning 2m
- Data Engineering 101: Data Partitioning for AWS Machine Learning 2m
- Data Engineering 101: Data Compression for AWS Machine Learning 1m
- Data Engineering 101: Storage-optimized Data for AWS Machine Learning 1m
- ETL vs. ELT for AWS Machine Learning 1m
- Module Summary for the ML Exam and Segue into Next Topics 1m
- Module Overview and an Introduction to AWS Glue! 4m
- What Is AWS Glue and How Does It Work? 2m
- How Does AWS Glue Relate to AWS Lake Formation? 1m
- The AWS Glue Data Catalog: Schema and Versions 2m
- The Benefits of Using AWS Glue 2m
- What Are AWS Glue Crawlers and How Do They Work? 2m
- Demo: Creating and Manually Running an AWS Glue Crawler 5m
- What Are AWS Glue Jobs and How Do They Work? 1m
- AWS Glue's Built-in Data Transformations and ML Transforms for AWS Lake Formation 2m
- Demo: Automating an AWS Glue Crawler and Performing Data Transformation with an AWS Glue Job 9m
- What Is Amazon Athena and How Does It Work? 3m
- Using AWS Glue and Amazon Athena Together 1m
- Demo: Using Amazon Athena 4m
- Running Inference Using ML Models in Amazon Athena 1m
- Module Summary for the ML Exam 2m