Creating a scikit-learn Random Forest Classifier in Amazon SageMaker
Scikit-learn is a good starting point for machine learning. In this lab, we will use scikit-learn to create a Random Forest Classifier that predicts whether you prefer cats or dogs. The data set is entirely made up, but it could easily be swapped for one of your own!
Challenge
Navigate to the Jupyter Notebook
Log in to the AWS console and navigate to the Amazon SageMaker page. From there, open the Jupyter Notebook provided with this hands-on lab.
Challenge
Load and Prepare the Data
- Load the survey data from data.csv, located beside the notebook.
- View the data. Look at both the raw data and statistics for the data.
- Change the column data types so the model can understand them.
- Split the data into training and testing sets. Use 80% of the data for training.
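The loading and preparation steps above can be sketched roughly as follows. The lab does not list the columns of data.csv, so the column names (`walks_per_day`, `likes_fetch`, `home_type`, `preference`) and the inline rows are hypothetical stand-ins; in the lab notebook you would call `pd.read_csv("data.csv")` and adapt the conversions to the real columns.

```python
import io

import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for data.csv; the real survey columns may differ.
csv_text = """walks_per_day,likes_fetch,home_type,preference
3,yes,house,dog
0,no,apartment,cat
2,yes,house,dog
1,no,apartment,cat
4,yes,house,dog
0,no,house,cat
2,yes,apartment,dog
1,no,apartment,cat
3,yes,house,dog
0,yes,apartment,cat
"""
df = pd.read_csv(io.StringIO(csv_text))  # in the lab: pd.read_csv("data.csv")

# View the raw data and summary statistics.
print(df.head())
print(df.describe(include="all"))

# Convert text columns to numbers so the model can use them.
df["likes_fetch"] = df["likes_fetch"].map({"no": 0, "yes": 1})
df["preference"] = df["preference"].map({"cat": 0, "dog": 1})
df = pd.get_dummies(df, columns=["home_type"], dtype=int)

# Separate the features from the label, then hold out 20% for testing.
X = df.drop(columns="preference")
y = df["preference"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.8, random_state=42
)
print(X_train.shape, X_test.shape)
```

Fixing `random_state` keeps the split reproducible between notebook runs; drop it if you want a different split each time.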
Challenge
Train the Random Forest Model
- Create a Random Forest Classifier model using scikit-learn.
- Train the model using the training data.
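A minimal sketch of the training step, assuming numeric features and a binary label. Because the survey data is not reproduced here, the example generates a synthetic data set with `make_classification` as a stand-in; the hyperparameters (`n_estimators=100`, `random_state=0`) are illustrative choices, not values prescribed by the lab.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the prepared survey data (assumption).
X, y = make_classification(n_samples=200, n_features=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.8, random_state=0
)

# Create the classifier, then fit it on the training portion only.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Mean accuracy on the held-out data as a quick sanity check.
accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")
```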
Challenge
Evaluate the Model
- Generate predictions for the testing data set.
- View the confusion matrix for the predictions.
- Calculate the sensitivity and specificity.
- Plot the ROC curve.
- Calculate the area under the curve.
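The evaluation steps above might look something like the sketch below. It again uses a synthetic data set as a stand-in for the survey data and treats class `1` as the positive class, which is an assumption. Sensitivity is the true positive rate, TP / (TP + FN), and specificity is the true negative rate, TN / (TN + FP). The `plot_roc` helper is a hypothetical convenience function that needs matplotlib.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the prepared survey data (assumption).
X, y = make_classification(n_samples=200, n_features=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.8, random_state=0
)
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Predictions and confusion matrix for the testing set.
y_pred = model.predict(X_test)
cm = confusion_matrix(y_test, y_pred, labels=[0, 1])
tn, fp, fn, tp = cm.ravel()
print(cm)

# Sensitivity (true positive rate) and specificity (true negative rate).
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
print(f"Sensitivity: {sensitivity:.2f}, specificity: {specificity:.2f}")

# The ROC curve needs scores, so use the probability of the positive class.
y_score = model.predict_proba(X_test)[:, 1]
fpr, tpr, thresholds = roc_curve(y_test, y_score)
roc_auc = roc_auc_score(y_test, y_score)
print(f"AUC: {roc_auc:.2f}")


def plot_roc(fpr, tpr, roc_auc):
    """Plot the ROC curve with a diagonal chance line (requires matplotlib)."""
    import matplotlib.pyplot as plt

    plt.plot(fpr, tpr, label=f"ROC (AUC = {roc_auc:.2f})")
    plt.plot([0, 1], [0, 1], linestyle="--", label="Chance")
    plt.xlabel("False positive rate")
    plt.ylabel("True positive rate")
    plt.legend()
    plt.show()
```

In a notebook, calling `plot_roc(fpr, tpr, roc_auc)` draws the curve; an AUC near 0.5 indicates performance close to random guessing, and values closer to 1.0 indicate better separation of the two classes.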
Challenge
Predict for Yourself
- Create a survey response for yourself.
- Have the model predict if you prefer cats or dogs.
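As a rough illustration of predicting for a single new response: the one-row input must have the same columns, in the same order and with the same encoding, as the training features. The column names, the training rows, and the answers in `me` below are all hypothetical; substitute the real survey columns and your own answers.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Hypothetical, already-numeric training data (0 = cat, 1 = dog).
train = pd.DataFrame(
    {
        "walks_per_day": [3, 0, 2, 1, 4, 0, 2, 1],
        "likes_fetch": [1, 0, 1, 0, 1, 0, 1, 0],
        "preference": [1, 0, 1, 0, 1, 0, 1, 0],
    }
)
feature_cols = ["walks_per_day", "likes_fetch"]

model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(train[feature_cols], train["preference"])

# Your own survey response as a one-row DataFrame with matching columns.
me = pd.DataFrame([{"walks_per_day": 2, "likes_fetch": 1}])[feature_cols]

label = model.predict(me)[0]
proba = model.predict_proba(me)[0]  # ordered like model.classes_
print("Prediction:", "dogs" if label == 1 else "cats")
print("Class probabilities:", dict(zip(model.classes_, proba)))
```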