Partition Data in Azure Data Lake Storage Gen2

In this hands-on lab, you work for Bethany's Pie Shop, because really, who doesn't love pie? Bethany is in the process of migrating their infrastructure from multiple disparate data sources into an Azure Data Lake. You will be creating an Azure Data Lake, configuring the folder structure to support a medallion architecture utilizing file partitioning best practices, and creating a data pipeline to move data between folders.


Path Info

Level: Intermediate
Duration: 1h 0m
Published: Nov 09, 2023

Table of Contents

  1. Challenge

    Prepare the Environment

    1. Create a Synapse Analytics instance in the West US region. It should use a new Data Lake Storage Gen2 account.

    2. Create a SQL database instance and server in the West US region. It should utilize DTU-Based Basic Tier Compute. Make sure to choose Sample Data as the Data Source.

  2. Challenge

    Create the Medallion Architecture

    1. For the storage account, go to Settings > Configuration and enable Allow Blob anonymous access.
    2. Create a Bronze Layer Container.
      1. Create folders for the ingestion sources Support and Marketing.
    3. Create a Silver Layer Container.
    4. Create a Gold Layer Container.
  3. Challenge

    Create Bronze Files

    1. Create a Data Factory pipeline that copies the SalesLT.Customer table into the bronze container in Parquet format. The file should go to the Marketing folder and be named bronze_hr_03_20_2024.
    2. Create a Data Factory pipeline that copies the SalesLT.Address table into the bronze container in Parquet format. The file should go to the Support folder and be named bronze_support_03_20_2024.
    3. Run the pipeline.
  4. Challenge

    Create Gold File

    1. Create a Data Factory pipeline to move both files from bronze and combine them in a single folder in gold. (For the sake of brevity, we are bypassing the silver step.)
    2. Once complete, go back to the storage account to verify there are now files in the correct folders.
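
The folder and file naming used in the challenges above can be sketched in a few lines of Python. This is only an illustration of the convention, not part of the lab itself: the lower-case container name `bronze` and the helper names `bronze_file_name` and `bronze_path` are assumptions introduced here, and the lab does not specify a file extension, so none is added.

```python
from datetime import date
from posixpath import join

# Hypothetical container name; the lab only says "Bronze Layer Container".
BRONZE_CONTAINER = "bronze"

# Ingestion source folders named in the "Create the Medallion Architecture" challenge.
BRONZE_SOURCES = ("Support", "Marketing")


def bronze_file_name(prefix: str, run_date: date) -> str:
    """Build a name in the lab's pattern: bronze_<prefix>_MM_DD_YYYY."""
    return f"bronze_{prefix}_{run_date:%m_%d_%Y}"


def bronze_path(source_folder: str, prefix: str, run_date: date) -> str:
    """Return the container-relative path for a bronze file."""
    if source_folder not in BRONZE_SOURCES:
        raise ValueError(f"unknown ingestion source folder: {source_folder!r}")
    return join(BRONZE_CONTAINER, source_folder, bronze_file_name(prefix, run_date))


if __name__ == "__main__":
    run = date(2024, 3, 20)
    print(bronze_path("Marketing", "hr", run))
    print(bronze_path("Support", "support", run))
```

Deriving the name from a date value rather than typing it by hand keeps the date suffix consistent across both bronze files, which makes them easier to find when verifying the folders in the storage account.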

The Cloud Content team comprises subject matter experts hyper-focused on services offered by the leading cloud vendors (AWS, GCP, and Azure), as well as cloud-related technologies such as Linux and DevOps. The team is thrilled to share their knowledge to help you build modern tech solutions from the ground up, secure and optimize your environments, and so much more!

What's a lab?

Hands-on Labs are real environments created by industry experts to help you learn. These environments help you gain knowledge and experience, practice without compromising your system, test without risk, destroy without fear, and let you learn from your mistakes. Hands-on Labs: practice your skills before delivering in the real world.

Provided environment for hands-on practice

We will provide the credentials and environment necessary for you to practice right within your browser.

Guided walkthrough

Follow along with the author’s guided walkthrough and build something new in your provided environment!

Did you know?

On average, you retain 75% more of your learning if you get time for practice.
