Using Amazon S3 as a Machine Learning Repository

Imagine you are a new Data Engineer. You have been tasked with preparing an environment for model building. To complete this task, you need to ingest a CSV file into S3 and then load that data into a Jupyter Notebook. Finally, you need to save the data back into S3 under a different name.

Path Info

Level: Advanced
Duration: 30m
Published: May 24, 2024

Challenge

Prepare the Environment
1. Create a Jupyter Notebook in SageMaker. Create a new role that allows interaction with S3.
2. Create an S3 Bucket
3. Download Parking-Ticket-2022 data (https://open.toronto.ca/dataset/parking-tickets/)
4. Upload file: Parking_Tags_Data_2022.000.csv to the S3 Bucket
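
Steps 2 and 4 above can be done in the S3 console, but they can also be scripted. The following is a minimal sketch using boto3, not the lab's own method. The bucket name and region used in the usage note are hypothetical placeholders, and the function takes an S3 client as a parameter so it can be tried without real AWS credentials. Creating the SageMaker notebook and the IAM role is not covered here.

```python
def create_bucket_and_upload(s3_client, bucket, local_path, key, region="us-east-1"):
    """Create an S3 bucket and upload one local file into it.

    s3_client is expected to behave like boto3.client("s3").
    """
    if region == "us-east-1":
        # us-east-1 is the default location and must not be passed
        # as a LocationConstraint.
        s3_client.create_bucket(Bucket=bucket)
    else:
        s3_client.create_bucket(
            Bucket=bucket,
            CreateBucketConfiguration={"LocationConstraint": region},
        )
    # upload_file(Filename, Bucket, Key) copies the local file to the bucket.
    s3_client.upload_file(local_path, bucket, key)
```

With real credentials, this could be called as `create_bucket_and_upload(boto3.client("s3"), "my-ml-repository-bucket", "Parking_Tags_Data_2022.000.csv", "Parking_Tags_Data_2022.000.csv")`, where the bucket name is an assumption and must be globally unique.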

Challenge

Ingest Data Into SageMaker
1. Create a conda_python3 Jupyter Notebook
2. Ingest data from the S3 bucket into the Jupyter Notebook as a data frame.
3. Confirm that you can see the first 5 rows of the Parking_Tags_Data table.
4. Change the df.head command to display 11 rows instead of 5.
5. Drop the 'location3' column from the data frame.
6. Write that table back into S3 as a CSV file named 'Result.csv'.
7. Verify the result in S3.
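
Steps 2 through 7 above can be sketched in a notebook cell roughly as follows. This is one possible approach using pandas and boto3-style client calls, not the lab's official solution. The bucket name is a hypothetical placeholder, the source key assumes the file was uploaded under its original name, and the S3 client is passed in as a parameter so the logic can be exercised without AWS access.

```python
import io

import pandas as pd

# Assumed names: replace BUCKET with your own bucket.
BUCKET = "my-ml-repository-bucket"
SOURCE_KEY = "Parking_Tags_Data_2022.000.csv"
RESULT_KEY = "Result.csv"


def read_csv_from_s3(s3_client, bucket, key):
    """Download one CSV object and parse it into a DataFrame."""
    response = s3_client.get_object(Bucket=bucket, Key=key)
    return pd.read_csv(io.BytesIO(response["Body"].read()))


def write_csv_to_s3(s3_client, df, bucket, key):
    """Serialize the DataFrame as CSV (without the index) and upload it."""
    body = df.to_csv(index=False).encode("utf-8")
    s3_client.put_object(Bucket=bucket, Key=key, Body=body)


def run_challenge(s3_client, bucket=BUCKET):
    df = read_csv_from_s3(s3_client, bucket, SOURCE_KEY)  # step 2
    print(df.head())     # step 3: head() shows 5 rows by default
    print(df.head(11))   # step 4: show 11 rows instead
    df = df.drop(columns=["location3"])                   # step 5
    write_csv_to_s3(s3_client, df, bucket, RESULT_KEY)    # step 6
    # Step 7: head_object raises an error if the key does not exist.
    s3_client.head_object(Bucket=bucket, Key=RESULT_KEY)
    return df
```

In a SageMaker notebook whose role allows S3 access, `run_challenge(boto3.client("s3"))` would run the whole sequence. Reading with `pd.read_csv("s3://...")` is an alternative, but it depends on the optional s3fs package being installed.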

Author

Pluralsight

The Cloud Content team comprises subject matter experts hyper-focused on services offered by the leading cloud vendors (AWS, GCP, and Azure), as well as cloud-related technologies such as Linux and DevOps. The team is thrilled to share their knowledge to help you build modern tech solutions from the ground up, secure and optimize your environments, and so much more!

What's a lab?

Hands-on Labs are real environments created by industry experts to help you learn. These environments help you gain knowledge and experience, practice without compromising your system, test without risk, destroy without fear, and let you learn from your mistakes. Hands-on Labs: practice your skills before delivering in the real world.

Provided environment for hands-on practice

We will provide the credentials and environment necessary for you to practice right within your browser.

Guided walkthrough

Follow along with the author’s guided walkthrough and build something new in your provided environment!

Did you know?

On average, you retain 75% more of your learning if you get time for practice.