
Upserting Data in Azure Synapse Analytics

Upsert lets you insert and update data in a target table as a single transaction. In this lab, we'll upsert data into a dedicated SQL pool table using a Synapse pipeline. In the first run, we'll load all the records into the table; in the second run, we'll insert and update records as a single transaction.
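Conceptually, the upsert that the pipeline performs behaves like a T-SQL MERGE keyed on RideId: rows that already exist in the target are updated, and rows that don't are inserted, in one statement. The sketch below is only an illustration of that idea; the staging table name and column list are assumptions, not part of the lab, and the pipeline's Upsert copy method does this work for you.

    -- Illustration only: the Copy activity's Upsert method performs the equivalent of
    -- a MERGE keyed on RideId. Table and column names here are assumed for the sketch.
    MERGE INTO dbo.TaxiRides AS target
    USING dbo.TaxiRides_Staging AS source
        ON target.RideId = source.RideId
    WHEN MATCHED THEN
        UPDATE SET target.PassengerCount = source.PassengerCount
    WHEN NOT MATCHED THEN
        INSERT (RideId, PassengerCount)
        VALUES (source.RideId, source.PassengerCount);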


Path Info

Level: Intermediate
Duration: 45m
Published: Nov 09, 2023


Table of Contents

  1. Challenge

    Set Up the Environment

    1. Create an Azure Synapse Analytics instance, defining a new Azure Data Lake Storage Gen2 account for it.
    2. Create a container named taxidata in the Data Lake account.
    3. Upload the TaxiRides.csv file to the container. The file link is available in the Additional Information and Resources section.
  2. Challenge

    Set Up Dedicated SQL Pool Instance

    Within the Synapse workspace, create a dedicated SQL pool named TaxiRidesWarehouse with a performance level of DW100c.

  3. Challenge

    Create a Pipeline to Copy Data from Data Lake File to Dedicated SQL Table

    1. Create a linked service for the data lake.
    2. Create a TaxiRides table in the dedicated SQL pool. The script to create it is available in the Additional Information and Resources section; a hypothetical sketch of the table's shape follows this list.
    3. Create a pipeline with a Copy activity.
    4. In the source of the Copy activity, create an integration dataset for the data lake file, using Delimited Text as the format.
    5. In the sink of the Copy activity, create an integration dataset for the dedicated SQL pool table, TaxiRides. Use Azure Synapse dedicated SQL pool as the data store.
    6. In the sink of the Copy activity, set the copy method to Upsert and add RideId as the key column.
  4. Challenge

    Complete Initial File Load

    1. Run the pipeline.
    2. Use an SQL query to verify that 100 records have been successfully added to the table (sample verification queries follow this list).
    3. Verify that the PassengerCount column value for RideId = 1 is 1.
  5. Challenge

    Upsert Data from File to Dedicated SQL Table

    1. In the data lake account, open and edit the file TaxiRides.csv.
    2. In the file, keep only the record for RideId = 1 and set its PassengerCount to 2. Remove all other records.
    3. In the file, add a new record for RideId = 10000. Use any data for the record, but make sure it has the right number of columns.
    4. Save the file changes and run the pipeline.
    5. Verify that the table now has 101 records, with an additional record for RideId = 10000 (sample queries for these checks follow this list).
    6. Verify that the PassengerCount column value for RideId = 1 is now updated to 2.
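For Challenge 3, the actual TaxiRides script is provided in the Additional Information and Resources section. The sketch below only shows a hypothetical shape with the two columns the later challenges depend on (RideId as the upsert key and PassengerCount for verification); the real script defines the full set of TaxiRides.csv columns.

    -- Hypothetical shape only; use the script from Additional Information and Resources.
    CREATE TABLE dbo.TaxiRides
    (
        RideId         INT NOT NULL,
        PassengerCount INT NOT NULL
        -- ...remaining TaxiRides.csv columns as defined in the provided script
    )
    WITH
    (
        DISTRIBUTION = ROUND_ROBIN,
        CLUSTERED COLUMNSTORE INDEX
    );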
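For the Challenge 4 checks, queries along these lines (assuming the table was created as dbo.TaxiRides) verify the initial load:

    -- Expect 100 rows after the first pipeline run.
    SELECT COUNT(*) AS TotalRows FROM dbo.TaxiRides;

    -- Expect PassengerCount = 1 for RideId = 1.
    SELECT RideId, PassengerCount FROM dbo.TaxiRides WHERE RideId = 1;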
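Similarly, after the second pipeline run in Challenge 5, queries like these confirm the upsert behavior:

    -- Expect 101 rows: the original 100 plus the new RideId = 10000.
    SELECT COUNT(*) AS TotalRows FROM dbo.TaxiRides;

    -- Expect one row for the newly inserted record.
    SELECT * FROM dbo.TaxiRides WHERE RideId = 10000;

    -- Expect PassengerCount for RideId = 1 to be updated from 1 to 2.
    SELECT RideId, PassengerCount FROM dbo.TaxiRides WHERE RideId = 1;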

The Cloud Content team comprises subject matter experts hyper-focused on services offered by the leading cloud vendors (AWS, GCP, and Azure), as well as cloud-related technologies such as Linux and DevOps. The team is thrilled to share their knowledge to help you build modern tech solutions from the ground up, secure and optimize your environments, and so much more!

What's a lab?

Hands-on Labs are real environments created by industry experts to help you learn. These environments help you gain knowledge and experience, practice without compromising your system, test without risk, destroy without fear, and learn from your mistakes. Hands-on Labs: practice your skills before delivering in the real world.

Provided environment for hands-on practice

We will provide the credentials and environment necessary for you to practice right within your browser.

Guided walkthrough

Follow along with the author’s guided walkthrough and build something new in your provided environment!

Did you know?

On average, you retain 75% more of your learning if you get time for practice.
