Preparing Data for Feature Engineering and Machine Learning
This course covers the main categories of feature engineering techniques used to get the best results from a machine learning model, including feature selection and several feature extraction techniques that re-express features in the most appropriate form.
What you'll learn
However well designed and well implemented a machine learning model is, if the data fed in is poorly engineered, the model’s predictions will be disappointing.
In this course, Preparing Data for Feature Engineering and Machine Learning, you will gain the ability to appropriately pre-process your data (in effect, engineer it) so that you can get the best out of your ML models.
First, you will learn how feature selection techniques can be used to find the predictors that contain the most information. These techniques fall broadly into three categories, known as filter, wrapper, and embedded techniques, and you will learn to understand and implement all three.
Next, you will discover how feature extraction differs from feature selection, in that data is substantially re-expressed, sometimes in forms that are hard to interpret. You will then understand techniques for feature extraction from image and text data.
Finally, you will round out your knowledge by understanding how to leverage powerful Python libraries for working with images, text, dates, and geospatial data.
When you’re finished with this course, you will have the skills and knowledge to identify the correct feature engineering techniques and the appropriate solutions for your use case.
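As a rough illustration of the filter category of feature selection mentioned above, the sketch below scores each feature by the absolute value of its Pearson correlation with the label and keeps the top k. It is not taken from the course materials. The function names (`pearson`, `filter_select`), the column names, and the toy data are all hypothetical, and it uses only the Python standard library rather than any of the libraries the course covers.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def filter_select(features, target, k):
    """Rank features by |correlation with target| and return the top k names."""
    scores = {name: abs(pearson(col, target)) for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Made-up toy data; the column names are hypothetical.
features = {
    "size": [50, 60, 80, 100, 120],
    "rooms": [2, 2, 3, 4, 4],
    "noise": [7, 1, 5, 2, 6],
}
price = [150, 180, 240, 300, 360]  # constructed as exactly 3 * size

selected = filter_select(features, price, k=2)
print(selected)
```

A filter method like this scores each feature independently of any model. Wrapper and embedded methods, by contrast, involve training a model to judge which features matter.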
Table of contents
- Version Check 0m
- Module Overview 1m
- Prerequisites and Course Outline 1m
- Features and Labels 6m
- The Machine Learning Workflow 4m
- Components of Feature Engineering 3m
- Feature Selection, Feature Learning, and Feature Extraction 7m
- Feature Combination and Dimensionality Reduction 4m
- Training, Validation, and Test Data 6m
- K-fold Cross Validation 4m
- Module Summary 1m
- Module Overview 2m
- Types of Data 5m
- Measuring Correlations 5m
- Understanding Feature Selection Using Filter, Embedded, and Wrapper Methods 6m
- Feature Selection Using Missing Value Ratio 5m
- Calculating and Visualizing Correlations Using Pandas 6m
- Calculating and Visualizing Correlations Using Yellowbrick 3m
- Feature Selection Using Filter Methods 6m
- Feature Selection Using Wrapper Methods 6m
- Feature Selection Using Embedded Methods 5m
- Module Summary 2m
- Module Overview 1m
- Tokenization and Visualizing Frequency Distributions 4m
- Performing Normalization Using Different Techniques 5m
- Creating Feature Vectors from Text Data 6m
- Loading and Transforming Images 5m
- Extracting Features from Images 3m
- Detecting Keypoints and Descriptors to Perform Image Matching 5m
- Extracting Text from Images Using OCR 4m
- Extracting Features from Dates 5m
- Working with Geospatial Features 7m
- Summary and Further Study 2m