Experimental Design for Data Analysis
This course covers the conceptual and practical aspects of building and evaluating machine learning models in a way that uses data judiciously, while accounting for ordering, relationships within the data, and other sources of bias.
What you'll learn
Providing crisp, clear, actionable points of view to senior executives is an increasingly important responsibility of data scientists and other data professionals. Such a point of view should represent a hypothesis, ideally backed by data. In this course, Experimental Design for Data Analysis, you will gain the ability to construct such hypotheses from data and use rigorous frameworks to test whether they hold true. First, you will learn how inferential statistics and hypothesis testing form the basis of data modeling and machine learning. Next, you will discover how building a machine learning model is akin to designing an experiment, and how training and validation techniques help rigorously evaluate the results of such experiments. Then, you will study various forms of cross-validation, including both singular and iterative techniques for independent, identically distributed data as well as grouped data. Finally, you will learn how to refine your models using these techniques together with hyperparameter tuning. When you’re finished with this course, you will have the skills and knowledge to build and evaluate models, including machine learning models, using rigorous cross-validation frameworks and hyperparameter tuning.
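As a taste of the evaluation workflow described above, here is a minimal sketch of K-fold cross-validation combined with hyperparameter tuning in scikit-learn. The dataset, estimator, and parameter grid are illustrative assumptions, not the examples used in the course.

```python
# A minimal sketch of cross-validation plus hyperparameter tuning in scikit-learn.
# The dataset, estimator, and parameter grid below are illustrative assumptions.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score

X, y = load_iris(return_X_y=True)

# Evaluate a single model with 5-fold cross-validation.
cv = KFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=cv)
print("Mean CV accuracy:", scores.mean())

# Refine the model by tuning the regularization strength C over the same folds.
grid = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1.0, 10.0]},
    cv=cv,
)
grid.fit(X, y)
print("Best C:", grid.best_params_["C"], "best score:", grid.best_score_)
```

Evaluating candidate hyperparameters on the same folds used for model assessment keeps the comparison consistent across configurations; the course goes into when and how to separate tuning from final evaluation.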
Table of contents
- Module Overview 1m
- Cross-validation in the ML Workflow 2m
- Singular Cross-validation 4m
- Cross-validation Using Azure ML Studio 6m
- K-fold Cross-validation and Variants 6m
- K-fold Cross-validation in scikit-learn 7m
- Repeated K-fold Cross-validation in scikit-learn 4m
- Stratified K-fold Cross-validation in scikit-learn 5m
- Group K-fold in scikit-learn 4m
- Summary 1m
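The K-fold modules above map onto scikit-learn's cross-validation iterators. The sketch below, using toy data and assumed group labels, shows how each splitter partitions the same samples; it is illustrative only and not taken from the course materials.

```python
# Illustrative use of the K-fold variants named in the modules above.
# The toy data and group labels are assumptions for the sake of the sketch.
import numpy as np
from sklearn.model_selection import GroupKFold, KFold, RepeatedKFold, StratifiedKFold

X = np.arange(24).reshape(12, 2)                          # 12 samples, 2 features
y = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1])        # class labels
groups = np.array([0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5])   # e.g. one group per subject

splitters = {
    "KFold": KFold(n_splits=3, shuffle=True, random_state=0),
    "RepeatedKFold": RepeatedKFold(n_splits=3, n_repeats=2, random_state=0),
    "StratifiedKFold": StratifiedKFold(n_splits=3, shuffle=True, random_state=0),
    "GroupKFold": GroupKFold(n_splits=3),
}

for name, splitter in splitters.items():
    print(name)
    # StratifiedKFold uses y to preserve class balance in each fold;
    # GroupKFold uses groups to keep all samples from one group in the same fold.
    for train_idx, test_idx in splitter.split(X, y, groups=groups):
        print("  test:", test_idx)
```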