- Lab
- A Cloud Guru
Work with GitHub in Azure Data Factory
Azure Data Factory and Synapse pipelines let you connect a Git repository in which all of your artifacts are stored. In this lab, we'll create a new source control repository in GitHub and connect it to Azure Data Factory by defining collaboration and publish branches. We'll then see which changes can be committed to each branch, and how publishing generates an ARM template for deployment.
Table of Contents
- Challenge: Set Up the Azure Environment
  - Create an Azure Data Lake Storage Gen2 account.
  - Create a container called taxidata in the Data Lake account.
  - Upload the TaxiRides.csv file to the container. Note: Make sure you have downloaded the TaxiRides.csv file that is located in this GitHub repository.
  - Create an Azure Data Factory instance.
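If you would rather script the container and upload steps than use the portal, the sketch below shows one possible approach with the Azure SDK for Python. It assumes the storage account already exists with hierarchical namespace enabled, that the `azure-identity` and `azure-storage-file-datalake` packages are installed, and that your identity has data-plane access to the account. The function names `dfs_account_url` and `upload_taxi_rides` are hypothetical helpers, not part of the lab.

```python
from pathlib import Path


def dfs_account_url(account_name: str) -> str:
    """Build the Data Lake Storage Gen2 (dfs) endpoint URL for a storage account."""
    return f"https://{account_name}.dfs.core.windows.net"


def upload_taxi_rides(account_name: str, local_csv: Path, container: str = "taxidata") -> None:
    """Create the container and upload the CSV file (hypothetical helper)."""
    # Imported lazily so dfs_account_url works without the Azure SDK installed.
    from azure.identity import DefaultAzureCredential
    from azure.storage.filedatalake import DataLakeServiceClient

    service = DataLakeServiceClient(
        account_url=dfs_account_url(account_name),
        credential=DefaultAzureCredential(),
    )
    # Raises an error if the container already exists.
    file_system = service.create_file_system(file_system=container)
    file_client = file_system.get_file_client(local_csv.name)
    with local_csv.open("rb") as handle:
        file_client.upload_data(handle, overwrite=True)
```

Creating the storage account and the Data Factory instance themselves is not covered by this sketch.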
- Challenge: Set Up a New Repository in GitHub
  - Log in to an existing GitHub account, or create a new GitHub account if necessary. The link to the GitHub site is available in the Additional Information and Resources section.
  - Create a new public repository.
- Challenge: Configure GitHub in Azure Data Factory
  - Authenticate to GitHub from Data Factory, and configure the repository.
  - Define collaboration and publish branches.
  - Select the collaboration branch as the working branch.
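For orientation, the settings chosen in this challenge correspond roughly to two small JSON documents: the factory's `repoConfiguration` property (type `FactoryGitHubConfiguration`) and an optional `publish_config.json` file in the root of the collaboration branch that overrides the default publish branch, `adf_publish`. The sketch below builds both as Python dictionaries. The account and repository names are placeholders, and `main` as the collaboration branch is an assumption.

```python
import json


def github_repo_configuration(account_name, repository_name,
                              collaboration_branch="main", root_folder="/"):
    """Shape of the factory's repoConfiguration property for GitHub."""
    return {
        "type": "FactoryGitHubConfiguration",
        "accountName": account_name,          # GitHub user or organization
        "repositoryName": repository_name,
        "collaborationBranch": collaboration_branch,
        "rootFolder": root_folder,
    }


def publish_config(publish_branch="adf_publish"):
    """Contents of publish_config.json; adf_publish is the default branch."""
    return {"publishBranch": publish_branch}


# Placeholder names for illustration only.
config = github_repo_configuration("my-github-user", "adf-lab-repo")
print(json.dumps(config, indent=2))
print(json.dumps(publish_config(), indent=2))
```

In the lab itself these values are entered through the Data Factory UI, so nothing here needs to be run.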
- Challenge: Save Changes to the Collaboration Branch
  - Create a pipeline with a Copy activity. Leave some of its properties unset so that the pipeline remains invalid.
  - Save the pipeline with invalid changes, and verify them in the collaboration branch of the GitHub repository.
  - Complete the pipeline so that it copies a file from one data lake folder to another. Create a linked service to the data lake and two datasets (one for the source file and the other for the sink file).
  - Save the pipeline with valid changes, and verify them in the collaboration branch of the GitHub repository.
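To make the difference between the two saved states concrete, the sketch below compares an incomplete and a complete Copy pipeline as they might appear as JSON in the collaboration branch. The check in `missing_copy_settings` is a deliberately simplified stand-in for Data Factory's own validation, which covers far more than dataset references. The pipeline and dataset names (`CopyTaxiRides`, `SourceTaxiRides`, `SinkTaxiRides`) are hypothetical.

```python
from typing import Dict, List


def missing_copy_settings(pipeline: Dict) -> List[str]:
    """List Copy activities that lack a source or sink dataset reference."""
    problems = []
    for activity in pipeline.get("properties", {}).get("activities", []):
        if activity.get("type") != "Copy":
            continue
        name = activity.get("name", "<unnamed>")
        if not activity.get("inputs"):
            problems.append(f"{name}: no source dataset")
        if not activity.get("outputs"):
            problems.append(f"{name}: no sink dataset")
    return problems


# Saved early: the Copy activity exists but has no datasets yet.
incomplete = {
    "name": "CopyTaxiRides",
    "properties": {"activities": [{"name": "CopyTaxiRides", "type": "Copy"}]},
}

# Saved later: source and sink datasets are both referenced.
complete = {
    "name": "CopyTaxiRides",
    "properties": {
        "activities": [{
            "name": "CopyTaxiRides",
            "type": "Copy",
            "inputs": [{"referenceName": "SourceTaxiRides", "type": "DatasetReference"}],
            "outputs": [{"referenceName": "SinkTaxiRides", "type": "DatasetReference"}],
            "typeProperties": {
                "source": {"type": "DelimitedTextSource"},
                "sink": {"type": "DelimitedTextSink"},
            },
        }]
    },
}

print(missing_copy_settings(incomplete))
print(missing_copy_settings(complete))
```

Both versions can be saved to the collaboration branch, since saving commits the JSON whether or not the pipeline is valid.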
- Challenge: Publish the Changes to the Publish Branch
  - Publish the pipeline with valid changes.
  - Verify that an ARM (Azure Resource Manager) template has been generated and stored in the publish branch.
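One way to verify the generated template is to count its resources by type. After publishing, the publish branch typically contains `ARMTemplateForFactory.json` and `ARMTemplateParametersForFactory.json` in a folder named after the factory. The sketch below runs against a small hand-written sample instead of a real file, and the resource names in it are hypothetical. A real template would be loaded with `json.load` from a local checkout of the publish branch.

```python
from collections import Counter
from typing import Dict


def count_resource_types(arm_template: Dict) -> Dict[str, int]:
    """Count the resources in an ARM template, grouped by resource type."""
    return dict(Counter(r["type"] for r in arm_template.get("resources", [])))


# Hand-written sample resembling what this lab would publish.
sample_template = {
    "contentVersion": "1.0.0.0",
    "resources": [
        {"type": "Microsoft.DataFactory/factories/linkedServices", "name": "DataLakeLinkedService"},
        {"type": "Microsoft.DataFactory/factories/datasets", "name": "SourceTaxiRides"},
        {"type": "Microsoft.DataFactory/factories/datasets", "name": "SinkTaxiRides"},
        {"type": "Microsoft.DataFactory/factories/pipelines", "name": "CopyTaxiRides"},
    ],
}

for resource_type, count in count_resource_types(sample_template).items():
    print(f"{resource_type}: {count}")
```

For this lab you would expect to see one linked service, two datasets, and one pipeline.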