Deploying PyTorch Models in Production: PyTorch Playbook
This course covers the key aspects of performing distributed training of PyTorch models, using the multiprocessing, data-parallel, and distributed data-parallel approaches. It also discusses the ways in which you can host PyTorch models for prediction.
What you'll learn
PyTorch is fast emerging as a popular choice for building deep learning models thanks to its flexibility, ease of use, and built-in support for optimized hardware such as GPUs. With PyTorch, you can build complex deep learning models while retaining Python-native support for debugging and visualization.

In this course, Deploying PyTorch Models in Production: PyTorch Playbook, you will gain the ability to serialize and deserialize PyTorch models, train them at scale, and deploy them for prediction. First, you will learn how the torch.save() and torch.load() functions and the load_state_dict() method complement and differ from each other, along with the relative pros and cons of each. Next, you will discover how to leverage the state_dict, a handy dictionary that holds a model's parameters and, when saving checkpoints, hyperparameters as well. Then, you will see how the multiprocessing, data-parallel, and distributed data-parallel approaches to distributed training can be used in PyTorch, and you will train a PyTorch model on a distributed cluster using high-level estimator APIs. Finally, you will explore how to deploy PyTorch models using a Flask application, a Clipper cluster, and a serverless environment.

When you're finished with this course, you will have the skills and knowledge to perform distributed training and deployment of PyTorch models and to use advanced mechanisms for model serialization and deserialization.
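The two serialization workflows described above can be sketched as follows. This is a minimal illustration, assuming PyTorch 1.13 or later; the model class and variable names are illustrative, not taken from the course:

```python
# Minimal sketch contrasting torch.save() on a whole model with saving only
# the state_dict. Names here are illustrative, not from the course.
import io
import torch
import torch.nn as nn

class TinyClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        return self.fc(x)

model = TinyClassifier()

# Option 1: serialize the entire model object. Simple, but it pickles the
# class itself, so the class definition must be importable at load time.
full = io.BytesIO()
torch.save(model, full)
full.seek(0)
restored = torch.load(full, weights_only=False)

# Option 2 (generally preferred): save only the state_dict, a dictionary
# mapping each layer to its learnable parameter tensors.
params = io.BytesIO()
torch.save(model.state_dict(), params)
params.seek(0)
fresh = TinyClassifier()
fresh.load_state_dict(torch.load(params))
fresh.eval()  # switch to inference mode before serving predictions
```

The state_dict approach decouples the saved parameters from the Python class, which is why it is the usual choice for deployment; a checkpoint extends the same idea by saving a dictionary that also includes the optimizer's state_dict and training hyperparameters such as the current epoch.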
Table of contents
- Version Check 0m
- Module Overview 2m
- Prerequisites and Course Outline 1m
- Saving and Loading PyTorch Models 7m
- Building and Training a Classifier Model 5m
- Saving and Loading Models Using torch.save() 6m
- Saving a Model Using the state_dict 5m
- Saving and Loading Checkpoints 4m
- Introducing ONNX 2m
- Exporting a Model to ONNX and Loading in Caffe2 8m
- Module Summary 1m
- Module Overview 1m
- Distributed Training on the Cloud 4m
- Setting up a SageMaker Notebook Instance 3m
- Setting up Training and Test Data Loaders 5m
- Defining the Training Function 5m
- Functions to Test and Save the Trained Model 3m
- Running Distributed Training Using the PyTorch Estimator 8m
- Module Summary 1m
- Module Overview 1m
- Exploring Options to Deploy PyTorch Models 4m
- Installing Libraries and Uploading Model Parameters to a GCP Bucket 3m
- Creating a Flask App to Serve the PyTorch Model 4m
- Using the Model for Prediction 3m
- Installing Docker 3m
- Creating and Using a Clipper Cluster for Prediction 7m
- Deploying a Model for Prediction to a Serverless Environment 7m
- Summary and Further Study 1m
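The deployment module above serves a PyTorch model from a Flask app. A minimal sketch of that pattern might look like this; the route name, payload format, and stand-in model are assumptions, not taken from the course:

```python
# Minimal Flask prediction endpoint for a PyTorch model (illustrative sketch;
# the /predict route and JSON payload shape are assumptions).
import torch
import torch.nn as nn
from flask import Flask, jsonify, request

app = Flask(__name__)

# Stand-in for a real trained model; in practice you would build the model
# and call load_state_dict() with parameters downloaded from storage.
model = nn.Linear(4, 2)
model.eval()

@app.route("/predict", methods=["POST"])
def predict():
    features = request.get_json()["features"]  # e.g. [0.1, 0.2, 0.3, 0.4]
    with torch.no_grad():
        logits = model(torch.tensor(features, dtype=torch.float32))
        label = int(logits.argmax())
    return jsonify({"prediction": label})

# To serve: run the app with Flask's built-in server or a WSGI server,
# e.g. app.run(host="0.0.0.0", port=8080)
```

A client would then POST a JSON body such as `{"features": [0.1, 0.2, 0.3, 0.4]}` to `/predict` and receive the predicted class label back as JSON.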