Implement a Data Ingestion Solution Using AWS Glue
In this lab, you'll practice ingesting semi-structured JSON sample data into a normalized AWS Glue Data Catalog from a source S3 data store. When you're finished, you'll have configured an AWS Glue crawler to scan S3 for new data on a recurring 12-hour schedule.
Challenge
Obtain Source Data Files
From the official AWS Samples GitHub repository, download the JSON source data files that will be uploaded to S3 and ingested by AWS Glue.
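The lab doesn't name the exact repository or files, so the repo URL and file names below are placeholders; substitute the ones from your lab instructions. This sketch builds raw-download URLs and fetches each file locally:

```python
from pathlib import Path
from urllib.request import urlretrieve

# Hypothetical repo and file names -- replace with the ones your lab specifies.
REPO_BASE = "https://raw.githubusercontent.com/aws-samples/example-data/main"
FILE_NAMES = ["customers.json", "orders.json"]

def build_urls(base: str, names: list[str]) -> list[str]:
    """Build a raw-download URL for each source file."""
    return [f"{base.rstrip('/')}/{name}" for name in names]

def download_all(urls: list[str], dest_dir: str = "source-data") -> list[Path]:
    """Download each file into dest_dir and return the local paths."""
    dest = Path(dest_dir)
    dest.mkdir(exist_ok=True)
    paths = []
    for url in urls:
        target = dest / url.rsplit("/", 1)[-1]
        urlretrieve(url, target)  # plain HTTP GET; public repos need no auth
        paths.append(target)
    return paths
```

You can also download the files through the GitHub web UI; the script is just a repeatable alternative.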
Challenge
Create an S3 Bucket
Provision an S3 bucket to serve as the source data store that AWS Glue will crawl to populate its Data Catalog.
Challenge
Upload Source Data Files to S3
Load the JSON source data files into the S3 data store.
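One way to script the upload, assuming the files were downloaded to a local `source-data` directory and should land under a `raw/` prefix (both names are assumptions, not from the lab). The key-mapping helper is pure; the boto3 upload itself needs credentials and is shown in comments:

```python
from pathlib import Path

def s3_keys_for(local_dir: str, prefix: str = "raw/") -> dict[str, str]:
    """Map each local .json file to the S3 object key it will be uploaded under."""
    return {str(p): f"{prefix}{p.name}"
            for p in sorted(Path(local_dir).glob("*.json"))}

# Live upload with boto3 (requires credentials; bucket name is hypothetical):
# import boto3
# s3 = boto3.client("s3")
# for local_path, key in s3_keys_for("source-data").items():
#     s3.upload_file(local_path, "my-glue-lab-bucket", key)
```

Keeping all source files under one prefix matters later: the crawler will point at that prefix.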
Challenge
Create the Crawler
Provision a crawler in AWS Glue to populate the AWS Glue Data Catalog every 12 hours with tables from the source S3 data store.
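The crawler can also be created through the Glue API. This sketch builds the `create_crawler` parameters; the schedule is an AWS cron expression that fires at minute 0 of every 12th hour, i.e. twice a day. The role ARN, database name, and S3 path are hypothetical:

```python
def crawler_request(name: str, role_arn: str, database: str, s3_path: str) -> dict:
    """Build kwargs for glue.create_crawler.

    "cron(0 */12 * * ? *)" is AWS's 6-field cron syntax:
    minute 0, every 12th hour, every day.
    """
    return {
        "Name": name,
        "Role": role_arn,          # must be able to read the source bucket
        "DatabaseName": database,  # catalog database the tables land in
        "Targets": {"S3Targets": [{"Path": s3_path}]},
        "Schedule": "cron(0 */12 * * ? *)",
    }

# Live usage with boto3 (requires credentials; all names are placeholders):
# import boto3
# glue = boto3.client("glue")
# glue.create_crawler(**crawler_request(
#     "lab-crawler", "arn:aws:iam::123456789012:role/GlueLabRole",
#     "lab_db", "s3://my-glue-lab-bucket/raw/"))
```

In the console, the same 12-hour cadence is available under the crawler's schedule settings as a custom cron expression.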
Challenge
Manually Run the Crawler
Run the crawler manually to verify that it works as expected and populates the AWS Glue Data Catalog with tables from the source S3 data store.
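Starting a crawler is asynchronous: it moves through RUNNING and STOPPING before returning to READY. A polling sketch like the one below makes the manual run scriptable. The `glue` argument is anything with `start_crawler`/`get_crawler` methods, in practice a boto3 Glue client (not created here because it needs credentials):

```python
import time

def run_crawler_and_wait(glue, name: str,
                         poll_seconds: float = 15,
                         timeout: float = 1800) -> str:
    """Start the crawler and poll until it returns to the READY state.

    `glue` is a boto3 Glue client in real use (or a stub in tests).
    """
    glue.start_crawler(Name=name)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        state = glue.get_crawler(Name=name)["Crawler"]["State"]
        if state == "READY":  # RUNNING -> STOPPING -> READY when finished
            return state
        time.sleep(poll_seconds)
    raise TimeoutError(f"Crawler {name} did not finish within {timeout}s")
```

After the run completes, confirm the expected tables appear in the catalog database (console, or `glue.get_tables(DatabaseName=...)`).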
Provided environment for hands-on practice
We will provide the credentials and environment necessary for you to practice right within your browser.
Guided walkthrough
Follow along with the author’s guided walkthrough and build something new in your provided environment!
Recommended prerequisites
- Amazon S3
- AWS Glue