Representing, Processing, and Preparing Data
This course covers data processing tools, including spreadsheets, Python, and relational databases, and shows how to handle data quality issues and visualize data to generate insights.
What you'll learn
Data science and data modeling are fast becoming crucial capabilities for every enterprise and every technologist. As the process of constructing models becomes democratized, the emphasis is shifting toward using the right data and using the data right.
In this course, Representing, Processing, and Preparing Data, you will gain the ability to correctly represent information from your domain as numeric data, and get it into a form where the full capabilities of models can be leveraged.
First, you will learn how to deal with outliers and missing data in a theoretically sound manner.
Next, you will discover how to use spreadsheets, programming languages, and relational databases to work with your data. You will see the different types of data that you may deal with in the real world and how you can collect and integrate data into a common destination to eliminate silos.
Finally, you will round out the course by working with visualization tools that allow every member of an enterprise to work with data and extract meaningful insights.
When you are finished with this course, you will have the skills and knowledge to use the right data sources, cope with data quality issues and choose the right technologies to extract insights from your enterprise data.
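As a rough illustration of the data quality steps mentioned above (filling missing values, removing outliers by z-score, and clamping outliers), here is a minimal Python sketch. It is not taken from the course material: the function names, the sample values, and the threshold of 2 standard deviations are all assumptions chosen for illustration, and it uses only the standard library rather than whatever tooling the course demonstrates.

```python
import math
import statistics


def is_missing(value):
    """Treat None and NaN as missing (an assumption for this sketch)."""
    return value is None or math.isnan(value)


def fill_missing(values):
    """Replace missing entries with the mean of the observed entries."""
    observed = [v for v in values if not is_missing(v)]
    replacement = statistics.mean(observed)
    return [replacement if is_missing(v) else v for v in values]


def z_scores(values):
    """Return (value - mean) / population standard deviation for each value."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return [0.0] * len(values)
    return [(v - mean) / stdev for v in values]


def remove_outliers(values, threshold):
    """Drop values whose absolute z-score exceeds the threshold."""
    return [v for v, z in zip(values, z_scores(values)) if abs(z) <= threshold]


def clamp_outliers(values, threshold):
    """Pull values outside mean +/- threshold * stdev back to that boundary."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    low, high = mean - threshold * stdev, mean + threshold * stdev
    return [min(max(v, low), high) for v in values]


# Hypothetical sample: one missing reading and one suspiciously large one.
raw = [10, 12, None, 11, 13, 12, 100]
filled = fill_missing(raw)
kept = remove_outliers(filled, threshold=2.0)
clamped = clamp_outliers(filled, threshold=2.0)
```

Removing an outlier shortens the dataset, while clamping keeps every row and only caps the extreme value. Which is preferable depends on whether the extreme reading is an error or a genuine observation. Note that with very small samples a z-score can never get large, so a threshold of 3 would flag nothing in a seven-value list like this one.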
Table of contents
- Module Overview 1m
- Excel: Working with Duplicates and Missing Values 6m
- Excel: Identifying and Eliminating Outliers Using Z-scores 7m
- Excel: Clamping Outliers 3m
- Python: Filling Missing Values 5m
- Python: Working with Missing Values on Real World Data 7m
- Python: Identifying and Removing Outliers 6m
- Python: Training an ML Classifier Using the Cleaned Dataset 3m
- Summary 1m
- Module Overview 1m
- Continuous and Categorical Data 3m
- Numeric Representations of Text Data 5m
- Representing Image Data as Matrices 3m
- Azure Data Products 2m
- Installing and Working with Azure Data Studio 5m
- Visualizing Insights Using Azure Data Studio 7m
- Installing and Visualizing Data in Power BI 4m
- Creating Different Visualizations in Power BI 3m
- Summary and Further Study 1m
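The representation topics listed above (categorical data and numeric representations of text) can be illustrated with a small Python sketch. This is an assumption about what such encodings typically look like, not the course's own code: one-hot encoding for categories and a simple bag-of-words count for text, with hypothetical function names and sample data, using only the standard library.

```python
from collections import Counter


def one_hot(categories):
    """Encode each label as a 0/1 vector over the sorted set of labels."""
    labels = sorted(set(categories))
    index = {label: i for i, label in enumerate(labels)}
    vectors = []
    for category in categories:
        row = [0] * len(labels)
        row[index[category]] = 1
        vectors.append(row)
    return labels, vectors


def bag_of_words(documents):
    """Encode each document as word counts over a shared, sorted vocabulary."""
    tokenized = [doc.lower().split() for doc in documents]
    vocab = sorted({token for tokens in tokenized for token in tokens})
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append([counts[token] for token in vocab])
    return vocab, vectors


# Hypothetical categorical column.
colors = ["red", "green", "red", "blue"]
labels, encoded = one_hot(colors)

# Hypothetical text documents.
docs = ["clean data wins", "data beats opinion"]
vocab, counts = bag_of_words(docs)
```

One-hot encoding avoids implying an order among categories that have none, at the cost of one column per distinct label. The bag-of-words vectors ignore word order entirely, which is often acceptable for a first model but loses phrasing.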