Summarizing Data and Deducing Probabilities
This course covers the most important aspects of exploratory data analysis using univariate, bivariate, and multivariate statistics in Excel and Python, including building Naive Bayes classifiers and using Seaborn to visualize relationships.
What you'll learn
Data science and data modeling are fast emerging as crucial capabilities for every enterprise and every technologist. Increasingly, different organizations use the same models and the same modeling tools, so what differs is how those models are applied to the data. That makes it essential to know your data well.
In this course, Summarizing Data and Deducing Probabilities, you will gain the ability to summarize your data using univariate, bivariate, and multivariate statistics in a range of technologies.
First, you will learn how measures of central tendency, such as the mean and median, can be calculated in Microsoft Excel and Python. Next, you will discover how to use correlations and covariances to explore pairwise relationships. You will then see how those constructs can be generalized to multiple variables using covariance and correlation matrices.
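As a rough illustration of the statistics named above, the following sketch computes a few univariate summaries, a pairwise covariance and correlation, and a small covariance matrix using only the Python standard library. The data values and the helper names (`covariance`, `correlation`, `covariance_matrix`) are invented for this example and are not taken from the course materials, which may well use other libraries.

```python
import statistics

# Hypothetical sample data, invented purely for illustration.
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
weights_kg = [52.0, 58.0, 69.0, 77.0, 94.0]


def covariance(xs, ys):
    """Sample covariance of two equal-length sequences (divides by n - 1)."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    total = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return total / (len(xs) - 1)


def correlation(xs, ys):
    """Pearson correlation: covariance scaled by both sample standard deviations."""
    return covariance(xs, ys) / (statistics.stdev(xs) * statistics.stdev(ys))


def covariance_matrix(columns):
    """Generalize to many variables: entry [i][j] is cov(columns[i], columns[j])."""
    return [[covariance(a, b) for b in columns] for a in columns]


# Univariate summary of a single variable.
summary = {
    "mean": statistics.mean(heights_cm),
    "median": statistics.median(heights_cm),
    "stdev": statistics.stdev(heights_cm),
}

# Bivariate and multivariate views of the same data.
pair_correlation = correlation(heights_cm, weights_kg)
cov_matrix = covariance_matrix([heights_cm, weights_kg])

print(summary)
print(pair_correlation)
print(cov_matrix)
```

The diagonal of the covariance matrix holds each variable's variance, and the off-diagonal entries hold the pairwise covariances, which is why the matrix is symmetric.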
You will understand and apply Bayes' Theorem, one of the most powerful and widely used results in probability, to build a robust classifier.
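To make the idea concrete, here is a minimal sketch of Bayes' Theorem for a two-class decision, P(H | E) = P(E | H) P(H) / P(E). The function name `posterior`, the spam scenario, and all of the probabilities are assumptions made up for illustration; this is not the classifier built in the course.

```python
def posterior(prior, likelihood, false_positive_rate):
    """Return P(H | E) for a binary hypothesis H given evidence E.

    prior               -- P(H)
    likelihood          -- P(E | H)
    false_positive_rate -- P(E | not H)
    """
    # Total probability of the evidence across both hypotheses.
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers: 20% of messages are spam, a given word appears
# in 60% of spam messages and in 5% of legitimate ones.
p_spam_given_word = posterior(prior=0.2, likelihood=0.6, false_positive_rate=0.05)
print(p_spam_given_word)  # roughly 0.75
```

A Naive Bayes classifier applies the same rule to several pieces of evidence at once, under the simplifying assumption that they are conditionally independent given the class.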
Finally, you will use Seaborn, a visualization library, to represent statistics visually.
When you are finished with this course, you will have the skills and knowledge to use univariate, bivariate, and multivariate descriptive statistics in Excel and Python to find relationships and calculate probabilities.
Table of contents
- Module Overview 1m
- Working with Excel Workbooks 3m
- Descriptive Statistics for Univariate Data 7m
- Visualizing Univariate Statistics 3m
- Using Pivot Tables for Summary Statistics 4m
- Performing Analysis Using Bucketing and Pivot Charts 6m
- Visualizing Bivariate Relationships 3m
- Performing Regression Analysis on Bivariate Data 4m
- Covariance and Correlation Matrices for Multivariate Data 4m
- Visualizing Multivariate Data Using Pivot Charts 3m
- Regression Analysis with Multivariate Data 3m
- Module Summary 1m
- Module Overview 3m
- Getting Started with Azure Notebooks 2m
- Calculating Descriptive Statistics Using Python 6m
- Calculating Descriptive Statistics Using Python Libraries 7m
- Calculating Skewness, Kurtosis, and Simple Visualizations 3m
- Bivariate Analysis 4m
- Simple Regression on Bivariate Data Using Scipy 3m
- Regression on Multivariate Data Using Statsmodels and scikit-learn 6m
- Module Summary 1m
- Module Overview 1m
- Understanding Kernel Density Estimation 4m
- Histograms, KDE Plots, and Rug Plots for Univariate Analysis 5m
- Scatter Plots, Joint Plots, and Hexbin Plots for Bivariate Analysis 5m
- Regression Analysis on Bivariate Data 5m
- Representing Pairwise Relationships Using the Pairplot and Pairgrid 4m
- Visualizing Categorical Data Using Strip Plots and Swarm Plots 3m
- Visualizing Data Using Box Plots and Violin Plots 4m
- Visualizing Categorical Data Using Bar Plots, Point Plots, and Cat Plots 4m
- Summary and Further Study 2m