Building Your First Data Lakehouse Using Azure Synapse Analytics
In this course, you will learn about Azure Synapse Analytics and its integrated services (Workspace, Dedicated SQL Pool, Apache Spark Pool, Serverless SQL Pool, Synapse Pipelines, Synapse Link for Cosmos DB, and Power BI) and how they can be used together to build a Data Lakehouse.
What you'll learn
A Data Lakehouse is an architecture that aims to combine the best of Data Warehouses and Data Lakes. It provides a common platform to handle huge volumes of data, process data faster, and serve multiple use cases. In this course, Building Your First Data Lakehouse Using Azure Synapse Analytics, you'll learn to use Azure Synapse Analytics. It is a new product that brings together data integration, enterprise data warehousing, and big data analytics. It is a set of well-integrated Azure data services: Workspace, Dedicated SQL Pool, Apache Spark Pool, Serverless SQL Pool, Synapse Pipelines, Synapse Link for Cosmos DB, an integrated Power BI workspace, and more. First, you'll learn how to set up the Synapse workspace and bring in multiple data sources. Next, you'll discover how to explore the data and how to work with Dedicated SQL Pool and Apache Spark Pool to extract and transform it. Finally, you'll explore how to ingest, transform, and orchestrate data using Synapse Pipelines, use a Serverless SQL Pool to build a logical data model, and use the services connected to Synapse. By the end of this course, you'll have the knowledge and skills to use Synapse Link for Cosmos DB and the integrated Power BI experience to build an end-to-end Data Lakehouse.
Table of contents
- Module Overview 1m
- Understanding Architecture and Components of Dedicated SQL Pool 9m
- Provisioning Dedicated SQL Pool 3m
- Working with PolyBase 4m
- PolyBase Demo 8m
- Loading Data Using COPY Statement 4m
- Implementing Table Distributions 6m
- Table Distributions and Data Shuffling 7m
- Summary and Further Study 3m