Evaluating a Data Mining Model
This course covers the key techniques for evaluating some of the most popular types of data mining models, ranging from association rule learning to clustering, regression, and classification.
What you'll learn
Data mining is an umbrella term for techniques that find patterns in large datasets. It can effectively be thought of as the application of machine learning techniques to big data.
In this course, Evaluating a Data Mining Model, you will gain the ability to answer the two most important questions that every practitioner of data mining must answer: is a particular model valid for this data? And if so, what is that model telling us?
First, you will learn that evaluating model fit and interpreting model results are key steps in the data mining process. Next, you will discover how association rule learning, a classic data mining technique, is implemented and evaluated.
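To give a flavor of what evaluating association rules involves, here is a minimal sketch of the support, confidence, lift, and conviction metrics listed in the table of contents. It is not taken from the course: the transactions are a made-up toy dataset and the function names are hypothetical, chosen only for illustration.

```python
# Toy transactions, invented for illustration only (not the course's dataset).
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread", "jam"},
    {"coffee"},
    {"bread", "butter", "coffee"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent) from the transactions."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


def lift(antecedent, consequent, transactions):
    """Confidence relative to the consequent's baseline support.

    Values above 1 suggest the items co-occur more often than
    they would if they were independent.
    """
    return confidence(antecedent, consequent, transactions) / support(
        consequent, transactions
    )


def conviction(antecedent, consequent, transactions):
    """(1 - support(consequent)) / (1 - confidence); infinite for a rule that never fails."""
    conf = confidence(antecedent, consequent, transactions)
    if conf >= 1.0:
        return float("inf")
    return (1 - support(consequent, transactions)) / (1 - conf)


if __name__ == "__main__":
    rule = ({"bread"}, {"butter"})
    print("support   :", round(support(rule[0] | rule[1], transactions), 3))
    print("confidence:", round(confidence(*rule, transactions), 3))
    print("lift      :", round(lift(*rule, transactions), 3))
    print("conviction:", round(conviction(*rule, transactions), 3))
```

In practice a library implementation of Apriori would generate the candidate itemsets; the sketch only shows how a single rule is scored once its itemsets are known.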
Finally, you will round out your knowledge by seeing how popular machine learning techniques (regression, classification, and clustering) can be implemented and evaluated for fit.
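As a rough illustration of what "evaluated for fit" can mean, the sketch below computes R-squared and adjusted R-squared for a regression, and precision and recall for a classifier, using their standard textbook definitions. The function names and the sample values are assumptions made for this example and are not the course's own code or data.

```python
def r_squared(y_true, y_pred):
    """1 - (residual sum of squares / total sum of squares)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n_samples, n_features):
    """Penalize R-squared for the number of predictors in the model."""
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_features - 1)


def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the class labeled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


if __name__ == "__main__":
    # Made-up values, for illustration only.
    y_true = [3.0, 5.0, 7.0, 9.0]
    y_pred = [2.5, 5.5, 6.5, 9.5]
    r2 = r_squared(y_true, y_pred)
    print("R-squared:", round(r2, 3))
    print("Adjusted R-squared:", round(adjusted_r_squared(r2, len(y_true), 1), 3))

    labels = [1, 1, 0, 0, 1]
    predicted = [1, 0, 0, 1, 1]
    print("Precision, recall:", precision_recall(labels, predicted))
```

Adjusted R-squared is lower than R-squared whenever the model has at least one predictor and the fit is imperfect, which is why it is often preferred when comparing models with different numbers of features.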
When you’re finished with this course, you will have the skills and knowledge to implement data mining techniques, evaluate them for model fit, and then intelligently interpret their findings.
Table of contents
- Version Check 0m
- Module Overview 2m
- Prerequisites and Course Outline 1m
- Evaluating the Results of Data Mining 6m
- White-box Models and Concept Drift 4m
- Model Simplicity 5m
- Evaluating Clustering Models 7m
- Demo: Performing Clustering Analysis Using K-means Clustering 7m
- Demo: Performing Clustering Analysis Using Agglomerative Clustering and Mean Shift Clustering 4m
- Demo: Evaluating K-means Clustering Using Sum of Squared Distances and Silhouette Score 5m
- Demo: Evaluating Agglomerative Clustering and Estimating the Right Bandwidth for Mean Shift Clustering 5m
- Module Summary 2m
- Module Overview 2m
- Association Rule Mining for Market Basket Analysis 3m
- Support and Frequent Itemsets 4m
- Confidence, Lift, and Conviction 8m
- An Overview of the Apriori Algorithm 3m
- Demo: Using the Apriori Algorithm to Generate Frequent Itemsets 7m
- Demo: Association Rule Mining on a Toy Dataset 5m
- Demo: Exploring the Bread Basket Dataset 4m
- Demo: Association Rule Mining Using the Bread Basket Data 3m
- Module Summary 1m
- Module Overview 1m
- Finding the Best Fit Line 3m
- Interpreting Regression Results 3m
- R-square and Adjusted R-square 3m
- T-statistics and F-statistic 2m
- Demo: Exploring the Regression Dataset 6m
- Demo: Building and Evaluating a Regression Model 5m
- Demo: Interpreting Results Using Residuals and Learning Curves 5m
- Demo: Evaluating Multiple Regression Models 7m
- Module Summary 1m
- Module Overview 1m
- Accuracy as an Evaluation Metric 2m
- Precision and Recall to Evaluate Classifiers 5m
- The ROC Curve 5m
- Validating Models Using Training, Validation, and Test Sets 5m
- K-fold Cross Validation 3m
- Demo: Exploring the Classification Dataset 5m
- Demo: K-fold, Hold-out, and Shuffle Split Cross Validation 6m
- Demo: Grid Search for Hyperparameter Tuning with Cross Validation 5m
- Demo: Evaluating the Model Using Accuracy, Precision, Recall and the ROC Curve 3m
- Module Summary 2m