
Getting hands-on with Amazon SageMaker Linear Learner

Looking for an intro to machine learning? Here's how to get hands-on with Amazon SageMaker's Linear Learner algorithm and build up AWS ML skills.

Jun 08, 2023 • 10 Minute Read


In this post, we'll talk about how Amazon SageMaker's Linear Learner algorithm is a great place to start if you're new to machine learning or looking for a hands-on intro to building AWS ML skills.

This is the first in a series of posts about Amazon SageMaker's built-in algorithms. This series does assume that you have prior experience with machine learning. If you'd like to learn more, watch our Introduction to Machine Learning course.

I started my machine learning journey four years ago as a Java software engineering manager wanting to make a transition. I didn't know where to start, so I turned to Amazon Web Services (AWS) to level up my skills. Fast forward to the present day: I've spoken on the AWS re:Invent and TED stages about machine learning, been named an AWS Machine Learning Hero, and earned my AWS Certified Machine Learning - Specialty certification.




Supervised Learning and Binary Classification

The first machine learning problem type I solved was supervised learning with binary classification. I found this use case to be simple and easy to understand. In this use case, you teach the machine to answer simple Yes/No questions by giving it data samples already labeled with the answer you want it to learn how to predict. 

Back then, I started with the Amazon Machine Learning service but quickly graduated to Amazon SageMaker. If you're not familiar with Amazon SageMaker, it's machine-learning-as-a-service and provides an end-to-end environment for preparing, building, training, and deploying machine learning models. 

I found it easy to replicate my Amazon Machine Learning use case using Amazon SageMaker. The built-in Linear Learner algorithm allowed me to be more hands-on using Python, Jupyter Notebooks, and various data science libraries to train my model.




Linear Learner Algorithm

The beauty of Amazon SageMaker is that it comes with several built-in algorithms that can be applied across several problem types.

| Learning Type | Problem Type | Built-in Algorithms |
|---|---|---|
| Supervised | Binary classification, multi-class classification, regression | Linear Learner, Factorization Machines, K-Nearest Neighbors (k-NN), XGBoost |
| Supervised, RNN | Time-series forecasting | DeepAR |
| Unsupervised | Dimensionality reduction | Principal Component Analysis (PCA) |
| Unsupervised | Anomaly detection | Random Cut Forest (RCF) |
| Unsupervised | IP anomaly detection | IP Insights |
| Unsupervised | Embeddings | Object2Vec |
| Unsupervised | Clustering | K-Means Algorithm |
| Unsupervised | Topic modeling | Latent Dirichlet Allocation (LDA), Neural Topic Model (NTM) |
| Textual Analysis | Text classification | BlazingText |
| Textual Analysis | Machine translation, text summarization, speech-to-text | Sequence-to-Sequence |
| Image Processing | Image and multi-label classification | Image Classification |
| Image Processing | Object detection and classification | Object Detection |
| Image Processing | Computer vision | Semantic Segmentation |

Linear Learner worked perfectly for my use case because it solves binary classification problems (among others). 

Problem Types Solved by Linear Learner

There are several problem types solved by the Linear Learner algorithm.

| Method | Problem Type | Description | Examples |
|---|---|---|---|
| Logistic regression | Binary classification | Answers Yes/No questions by predicting a 0 or 1 | Is this email spam or not? Is this transaction fraudulent or not? Is crime likely or not? |
| Multinomial logistic regression | Multi-class classification | Answers 1 of many questions by predicting 0 to n-1 classes | Is this item a book, movie, or toy? Is this animal a dog, bird, or cat? |
| Linear regression | Regression | Answers continuous numeric value questions | What will the temperature be in Atlanta tomorrow? How many units of this product will sell? What will this house sell for? |

How it works

For training, Linear Learner requires a data matrix with rows that represent the observations and columns representing the features. One column in the data matrix should represent the label that you want the machine to learn how to predict. For SageMaker's Linear Learner, the label should appear in the first column of the data matrix and column headers should be excluded. 
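As a minimal sketch of that layout, the snippet below serializes a few made-up observations to header-less CSV with the label in the first column. The helper name `to_linear_learner_csv` and the field names (`crime`, `age`, `hour`) are hypothetical and are not taken from the original notebook.

```python
import csv
import io


def to_linear_learner_csv(rows, label_key, feature_keys):
    """Serialize dict rows as header-less CSV with the label in column one."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in rows:
        # Label first, then the features in a fixed order; no header row is written.
        writer.writerow([row[label_key]] + [row[key] for key in feature_keys])
    return buffer.getvalue()


# Hypothetical observations: the field names are illustrative only.
observations = [
    {"crime": 1, "age": 25, "hour": 23},
    {"crime": 0, "age": 41, "hour": 9},
]
csv_text = to_linear_learner_csv(observations, "crime", ["age", "hour"])
print(csv_text)
```

Keeping the feature order fixed matters because, without headers, column position is the only thing that identifies a feature.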

For my use case, I stored the training data in Amazon S3, specifying an input location for the training data and an output location for the final model artifacts. Linear Learner supports training data in RecordIO (protobuf) or CSV format, and accepts inference requests in JSON, CSV, or RecordIO (protobuf). For my initial use case, I used CSV.

There are two modes for loading the S3 data onto the training instances: file or pipe. While I used file mode initially, pipe mode is more efficient. It streams your data directly from S3 instead of first copying the full training dataset to disk as file mode does, which reduces training time and saves money.

When configuring your training job, there are several hyperparameters. I've listed below a few of the more important hyperparameters for model training. The full list of hyperparameters for the Linear Learner algorithm can be found in the Amazon SageMaker Developer Guide.

| Hyperparameter | Description | Possible Values |
|---|---|---|
| predictor_type | The type of the target variable. For binary classification, I selected binary_classifier. If you're using multi-class classification, you'll select multiclass_classifier. For regression, you'll select regressor. | binary_classifier, multiclass_classifier, regressor |
| epochs | The number of passes over the data. | A positive integer; the default value is 15. |
| feature_dim | The number of features in the input data. | auto or positive integer |
| l1 | The L1 regularization value. | auto or non-negative float |
| wd | The L2 regularization value. | auto or non-negative float |
| optimizer | The optimization algorithm. | auto, sgd, adam, rmsprop. Adam is the default setting for auto. |
| learning_rate | The step size for the optimizer. | auto or positive float |
| loss | The loss function. | This varies based on the selected predictor_type. For binary_classifier, the options are auto, logistic, or hinge_loss. The default value for auto is logistic. |
| mini_batch_size | The number of observations per batch. | Positive integer; the default value is 1000. |
| num_models | Linear Learner trains multiple models in parallel. This allows you to set the number of models trained and compared against each other. | auto or positive integer |
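To make the table concrete, here is a rough sketch of how these settings might be collected into a plain dictionary before being handed to a training job. The helper `build_hyperparameters` and its sanity checks are my own illustration, not part of SageMaker, and `feature_dim=6` is an assumption based on the six data points used later in this post.

```python
ALLOWED_PREDICTOR_TYPES = {"binary_classifier", "multiclass_classifier", "regressor"}
ALLOWED_OPTIMIZERS = {"auto", "sgd", "adam", "rmsprop"}


def build_hyperparameters(predictor_type, feature_dim="auto", epochs=15,
                          mini_batch_size=1000, optimizer="auto", loss="auto"):
    """Return a hyperparameter dict after a few basic sanity checks."""
    if predictor_type not in ALLOWED_PREDICTOR_TYPES:
        raise ValueError(f"unknown predictor_type: {predictor_type!r}")
    if optimizer not in ALLOWED_OPTIMIZERS:
        raise ValueError(f"unknown optimizer: {optimizer!r}")
    if feature_dim != "auto" and (not isinstance(feature_dim, int) or feature_dim <= 0):
        raise ValueError("feature_dim must be 'auto' or a positive integer")
    if epochs <= 0 or mini_batch_size <= 0:
        raise ValueError("epochs and mini_batch_size must be positive")
    return {
        "predictor_type": predictor_type,
        "feature_dim": feature_dim,
        "epochs": epochs,
        "mini_batch_size": mini_batch_size,
        "optimizer": optimizer,
        "loss": loss,
    }


# Assumed values for a binary classifier with six input features.
hyperparameters = build_hyperparameters("binary_classifier", feature_dim=6)
print(hyperparameters)
```

Validating names locally catches a misspelled value such as `multiclass_classifer` before a training job is ever launched.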

When you are ready to train your machine learning model, you can run the training job on single- or multi-machine CPU and GPU instances. It is important to note that Linear Learner doesn't support incremental training, although it does support distributed training.

Model Tuning

Linear Learner reports several metrics to help you evaluate and tune your model before releasing it to production. Each metric is reported as both a test and validation metric.

  • Objective Loss - This represents the mean value of the loss function. For my use case, the loss is logistic loss.
    • test:objective_loss
    • validation:objective_loss
  • Accuracy - This represents the proportion of predictions that are correct, counting both true positives and true negatives.
    • test:binary_classification_accuracy
    • validation:binary_classification_accuracy
  • Precision - Of all the predicted positives, this represents the percentage that is actually positive.
    • test:precision
    • validation:precision
  • Recall - Of all the actual positives, this represents the percentage that was predicted correctly.
    • test:recall
    • validation:recall
  • F1 Score - This is the harmonic mean of precision and recall, balancing the two.
    • test:binary_f_beta
    • validation:binary_f_beta
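The classification metrics above can be sketched in a few lines of plain Python. This is only an illustration of the standard definitions computed from true/false positives and negatives; the function name `binary_metrics` and the sample labels are made up, and SageMaker computes its own reported values during training.

```python
def binary_metrics(actual, predicted):
    """Compute accuracy, precision, recall, and F1 for 0/1 labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # true positives
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # true negatives
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false positives
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels: 1 means "Crime", 0 means "No Crime".
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = binary_metrics(actual, predicted)
print(metrics)
```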



Linear Learner in Action

Now let's see a real-world example. The aim of my use case is to use SageMaker's Linear Learner algorithm to train a linear model for crime prediction. For this illustration, stop-and-search crime data was pulled from the data.police.uk dataset available at https://data.police.uk/data/. data.police.uk is a site for open data about crime and policing in England, Wales, and Northern Ireland.

The purpose here is to use this dataset to build a predictive policing model to determine whether or not crime is likely given the following data points:

  • Location
  • Age
  • Gender
  • Time of Day
  • Day of Week
  • Month

The model returns a 'Crime' or 'No Crime' prediction based on the input provided. The sample code for the use case is freely available. To start the process, I launched an Amazon SageMaker hosted Jupyter notebook.

Import Libraries

I imported the necessary data science and SageMaker Python libraries in the notebook.


Data Ingestion

Next, I read the dataset from the online URL into memory, for preprocessing prior to training.

Data Inspection and Visualization

Once the dataset is imported, it's typical as part of the machine learning process to inspect the data, understand the distributions, and determine what type(s) of preprocessing might be needed.

I inspected the first few rows of the data.

I used a histogram to understand distributions.

I analyzed the crime count across counties.

I used a crosstab to understand the distributions across gender. 
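The original notebook output is not reproduced here, but the idea behind a crosstab can be sketched with the standard library: count how often each pair of values occurs. The rows, the field names (`gender`, `outcome`), and the helper `crosstab` below are invented for illustration.

```python
from collections import Counter


def crosstab(rows, row_key, column_key):
    """Count how often each (row value, column value) pair occurs."""
    return Counter((row[row_key], row[column_key]) for row in rows)


# Made-up records standing in for the real stop-and-search data.
searches = [
    {"gender": "Male", "outcome": "Crime"},
    {"gender": "Male", "outcome": "No Crime"},
    {"gender": "Female", "outcome": "No Crime"},
    {"gender": "Male", "outcome": "Crime"},
]
counts = crosstab(searches, "gender", "outcome")
for (gender, outcome), count in sorted(counts.items()):
    print(gender, outcome, count)
```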

Data Cleaning

After visualizing and understanding the data, I removed null or bad values. In this stage of my learning, I did not consider any data imputation techniques.

I uncovered that the gender and average age fields had several observations that should be removed from the dataset.
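A minimal sketch of this kind of cleaning, assuming each observation is a dictionary and that a missing value shows up as `None` or an empty string. The field names and the helper `drop_incomplete` are hypothetical; the original notebook may have cleaned the data differently.

```python
def drop_incomplete(rows, required_keys):
    """Keep only rows where every required field has a usable value."""
    return [
        row for row in rows
        if all(row.get(key) not in (None, "") for key in required_keys)
    ]


# Made-up records: two of the three are missing a required field.
raw_rows = [
    {"gender": "Male", "average_age": 29},
    {"gender": "", "average_age": 21},
    {"gender": "Female", "average_age": None},
]
clean_rows = drop_incomplete(raw_rows, ["gender", "average_age"])
print(clean_rows)
```

Dropping rows is the simplest option; it trades away some data in exchange for not having to choose an imputation strategy.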




Data Encoding and Transformation

Before training, I converted the categorical features into numeric features since the classifiers only work with numeric values.

I converted "day of week" to its numerical representation.

I converted Gender to a numerical representation.
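One simple way to do both conversions is with lookup tables, sketched below. The specific numbering (Monday as 0, Female as 0 and Male as 1) and the field names are assumptions made for this example, not necessarily the encoding used in the original notebook.

```python
# Assumed encodings: any consistent mapping would work.
DAY_TO_NUMBER = {
    day: index
    for index, day in enumerate(
        ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
    )
}
GENDER_TO_NUMBER = {"Female": 0, "Male": 1}


def encode_row(row):
    """Return a copy of the row with categorical fields replaced by numbers."""
    encoded = dict(row)
    encoded["day_of_week"] = DAY_TO_NUMBER[row["day_of_week"]]
    encoded["gender"] = GENDER_TO_NUMBER[row["gender"]]
    return encoded


encoded = encode_row({"day_of_week": "Wednesday", "gender": "Male", "age": 30})
print(encoded)
```

A lookup that raises `KeyError` on an unexpected category is deliberate here: it surfaces values that slipped through data cleaning instead of silently encoding them.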

Splitting into Training, Validation, and Test Sets

To prevent model overfitting and to allow me to test the model's accuracy on data that it hadn't seen yet, I split the dataset into training, validation, and test sets.
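The split itself can be sketched with a seeded shuffle. The 70/15/15 proportions and the function name `train_validation_test_split` are assumptions for illustration; the post does not state which ratios were used.

```python
import random


def train_validation_test_split(rows, train_frac=0.70, validation_frac=0.15, seed=42):
    """Shuffle the rows reproducibly and cut them into three disjoint sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_validation = round(len(shuffled) * validation_frac)
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_validation]
    test = shuffled[n_train + n_validation:]  # whatever remains
    return train, validation, test


# Integers stand in for real observations.
train_set, validation_set, test_set = train_validation_test_split(range(100))
print(len(train_set), len(validation_set), len(test_set))
```

Fixing the seed keeps the split reproducible, so metrics from different training runs are compared on the same held-out data.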

Training the Linear Model

After I loaded the cleaned up training data to S3, I trained the model using Linear Learner. The first step is to set the container image.

Then, I set up the necessary hyperparameters and the bucket location for the final model artifact.
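The two steps above might look roughly like the following with version 2 of the SageMaker Python SDK. This is a sketch under several assumptions, not the original notebook code: the instance type, the `train`/`validation`/`output` key layout under the bucket prefix, and the helper names `s3_uri` and `train_linear_learner` are all mine. The SDK is imported inside the function so the small URI helper can be used without it.

```python
def s3_uri(bucket, *parts):
    """Join a bucket name and key parts into an s3:// URI."""
    return "s3://" + "/".join([bucket, *parts])


def train_linear_learner(role_arn, region, bucket, prefix, hyperparameters):
    """Launch a Linear Learner training job on CSV data already in S3."""
    # Imported lazily; requires the sagemaker package and AWS credentials.
    import sagemaker
    from sagemaker.estimator import Estimator
    from sagemaker.inputs import TrainingInput

    # Step 1: look up the Linear Learner container image for the region.
    image_uri = sagemaker.image_uris.retrieve("linear-learner", region)

    # Step 2: configure the estimator and where model artifacts are written.
    estimator = Estimator(
        image_uri=image_uri,
        role=role_arn,
        instance_count=1,
        instance_type="ml.c4.xlarge",  # assumed instance type
        output_path=s3_uri(bucket, prefix, "output"),
    )
    estimator.set_hyperparameters(**hyperparameters)

    # Step 3: point the train and validation channels at the CSV data.
    estimator.fit({
        "train": TrainingInput(s3_uri(bucket, prefix, "train"), content_type="text/csv"),
        "validation": TrainingInput(s3_uri(bucket, prefix, "validation"), content_type="text/csv"),
    })
    return estimator


print(s3_uri("my-bucket", "crime", "train"))
```

Here `hyperparameters` is a plain dictionary such as `{"predictor_type": "binary_classifier", "feature_dim": 6}`, where the feature count of six is again an assumption.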

Model Evaluation

During each epoch, the evaluation metrics are logged. I used these logs to determine how well the model was performing at each pass over the dataset. These scores helped me tune my model for better performance.

Model Hosting

Once I had a trained model, I used the deploy function to host it on Amazon SageMaker so other users and applications could access it.

Inferences
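For a binary classifier, a Linear Learner endpoint answers a JSON request with a list of predictions, each carrying a score and a predicted label. As a hedged sketch, the helper below turns such a response into the 'Crime' or 'No Crime' answers described earlier. The function name `label_predictions` and the sample response are made up for illustration.

```python
import json


def label_predictions(response_body):
    """Map each predicted_label in a JSON response to a readable answer."""
    payload = json.loads(response_body)
    return [
        "Crime" if prediction["predicted_label"] == 1 else "No Crime"
        for prediction in payload["predictions"]
    ]


# Made-up response in the shape returned for binary classification.
sample_response = json.dumps({
    "predictions": [
        {"score": 0.91, "predicted_label": 1},
        {"score": 0.12, "predicted_label": 0},
    ]
})
labels = label_predictions(sample_response)
print(labels)
```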

Delete Model Endpoint

When I was finished using my model for predictions, I deleted the endpoint to make sure that I would no longer be charged for it.
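Putting hosting, inference, and cleanup together, a sketch using version 2 of the SageMaker Python SDK might look like this. It assumes `estimator` is an already trained SageMaker estimator; the instance type and the helper names `to_csv_payload` and `predict_then_clean_up` are my own. The `try`/`finally` block is one way to make sure the endpoint is deleted even if the prediction call fails.

```python
def to_csv_payload(feature_rows):
    """Format feature rows (no label column) as CSV text for the endpoint."""
    return "\n".join(",".join(str(value) for value in row) for row in feature_rows)


def predict_then_clean_up(estimator, feature_rows):
    """Deploy a trained estimator, request predictions, then delete the endpoint."""
    # Imported lazily; requires the sagemaker package and AWS credentials.
    from sagemaker.deserializers import JSONDeserializer
    from sagemaker.serializers import CSVSerializer

    predictor = estimator.deploy(
        initial_instance_count=1,
        instance_type="ml.m4.xlarge",  # assumed instance type
        serializer=CSVSerializer(),
        deserializer=JSONDeserializer(),
    )
    try:
        return predictor.predict(to_csv_payload(feature_rows))
    finally:
        # Always remove the endpoint so it stops accruing charges.
        predictor.delete_endpoint()


print(to_csv_payload([[1, 2], [3, 4]]))
```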

Learn more about machine learning

Amazon SageMaker's Linear Learner algorithm is a great place to start if you're new to machine learning. I found answering simple Yes/No questions to be the easiest use case. 

In the next post in this series, I'll be reviewing the K-Means built-in algorithm that is used for clustering and finding discrete groupings within data.

Looking to learn more about machine learning? Check out ACG’s Introduction to Machine Learning and AWS Certified Machine Learning - Specialty certification courses.


Kesha Williams


Kesha Williams is an Atlanta-based AWS Machine Learning Hero and Senior Director of Enterprise Architecture & Engineering. She guides the strategic vision and design of technology solutions across the enterprise while leading engineering teams in building cloud-native solutions with a focus on Artificial Intelligence (AI). Kesha holds multiple AWS certifications and has received leadership training from Harvard Business School. Learn more at https://www.keshawilliams.com/.
