Combining and Shaping Data
This course covers both conceptual and practical aspects of pulling together from different data sources, with different schemas and orientations, into a cohesive whole using Excel, Python, and various tools available on the Azure cloud platform.
What you'll learn
Connecting the dots between data from different sources is becoming the most sought-after skill these days for everyone ranging from business professionals to data scientists.
In this course, Combining and Shaping Data, you will gain the ability to connect the dots by pulling together data from disparate sources and shaping it so that extracting connections and relationships becomes relatively easy.
First, you will learn how the most common constructs in shaping and combining data stay the same across spreadsheets, programming languages, and databases.
Next, you will discover how to use joins and vlookups to obtain wide datasets, and then use pivots to shape that into long form. You will then see how both long and wide data can be aggregated to obtain higher level insights. You will work with Excel spreadsheets and SQL as well as Python.
Finally, you will round out the course by integrating data from a variety of sources and working with streaming data, which helps your enterprise gain real-time insights into the world around you.
When you are finished with this course, you will have the skills and knowledge to pull together data from disparate sources, including from streaming sources, to construct integrated data models that truly connect the dots.
Table of contents
- Module Overview 1m
- Load Data and Join Tables 9m
- Aggregation Operations 2m
- Join Tables Using Power Query 6m
- Convert CSV Data to Table Format 3m
- Convert Wide Form Data to Long Form 2m
- Pivot Tables for Summary Statistics 5m
- Unpivot Tables 4m
- Calculating Cumulative Sum and Ranks on Partitioned Data 7m
- Summary 1m
- Module Overview 1m
- Introducing the Azure SQL Database 2m
- Loading Data to Azure Blob Storage 4m
- Creating an Azure SQL Database and Tables 4m
- Loading Data Using Azure Data Factory Pipelines 10m
- Querying Data in SQL Using Aggregations 3m
- Performing Join Operations in SQL 3m
- Performing Pivot and Unpivot Operations in SQL 3m
- Summary 1m
- Module Overview 1m
- Data Cleaning: Missing Data and Outliers 4m
- Getting Started with Azure Notebooks 2m
- Combining and Shaping Data Using Pandas 3m
- Identifying and Coping with Outliers 5m
- Detecting Outliers Using Z-scores 4m
- Handling Missing Values 5m
- Cleaning Data 5m
- Working with Imbalanced Data 4m
- Handling Imbalanced Data with Scikit Learn 7m
- Summary 2m
- Module Overview 1m
- Transactional and Analytical Processing on Azure 6m
- Creating an Azure SQL Data Warehouse and Uploading Files to Azure Blob Storage 5m
- Loading Data from Blob Store to the Data Warehouse 4m
- Loading Data from GCP Cloud Storage Buckets to the Azure SQL Data Warehouse 7m
- Enabling Resource Permissions to an Azure Active Directory App 4m
- Python Script to Transform Data Using Pyspark 2m
- Transform Data Using a Spark Activity 8m
- Copying Data from a Blob Storage Folder 3m
- Summary 1m
- Module Overview 1m
- Batch and Streaming Data 5m
- Types of Windows 6m
- Watermarks and the Notion of Time 4m
- Creating a Stream Analytics Job and a Power BI Pro Account 4m
- Configuring Input and Output for the Stream Analytics Job 5m
- Visualizing Streaming Data Using Power BI 5m
- Summary and Further Study 2m