  • A Cloud Guru

Search Data in Multiple Languages with Azure AI Search

In this hands-on lab, you will set up an AI-enhanced Azure AI Search solution: creating the search service, importing data to generate a search index, adding a skillset that translates an English description field into Swedish, and adding the translated field to the index. You will validate your work by searching the data using the JSON query editor in the Search Explorer. In the process, you will explore the use of both skillsets and analyzers in defining an Azure AI Search index and indexer. All work takes place in the Azure portal, and no coding is required to create the resources and test your solution.


Path Info

Level: Beginner
Duration: 30m
Published: Feb 09, 2024


Table of Contents

  1. Challenge

    Create an Azure AI Search Service

    You should already be logged in to the Azure portal, using the credentials provided with the lab. When you first log in, you will land on the overview page for the resource group already deployed for you. Note the location/region of the resource group.

    Using the Azure portal, create an Azure AI Search service with the following configuration:

    • Create the service in the existing resource group, and in the same location as that resource group.
    • Use any valid name you choose.
    • Create the service in the Basic pricing tier. Do not choose the Free tier, as only one is allowed per subscription, and your lab environment is on a shared subscription with other students.
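    The lab itself is portal-only, but for reference, the same configuration can be expressed as an Azure Resource Manager request. The following is a minimal sketch that only builds the request URL and body and does not send anything. The subscription ID, resource group, service name, and region are hypothetical placeholders, and the API version shown is an assumption that may need updating.

```python
import json

# Hypothetical placeholder values; substitute your own lab details.
SUBSCRIPTION_ID = "00000000-0000-0000-0000-000000000000"
RESOURCE_GROUP = "my-lab-rg"
SERVICE_NAME = "my-search-service"
LOCATION = "eastus"  # should match the region of the resource group


def build_search_service_request(subscription_id, resource_group, name,
                                 location, sku="basic"):
    """Return (url, body) describing a PUT request that would create the service."""
    url = (
        "https://management.azure.com"
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Search/searchServices/{name}"
        "?api-version=2023-11-01"  # assumed management API version
    )
    body = {
        "location": location,
        # Basic tier, as the lab requires; the Free tier is limited to one per subscription.
        "sku": {"name": sku},
        "properties": {"replicaCount": 1, "partitionCount": 1},
    }
    return url, body


url, body = build_search_service_request(
    SUBSCRIPTION_ID, RESOURCE_GROUP, SERVICE_NAME, LOCATION
)
print(url)
print(json.dumps(body, indent=2))
```

    Sending this request would additionally require an authenticated bearer token, which is outside the scope of this sketch.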
  2. Challenge

    Import Data and Add Translation Skillset

    In this objective, you will complete the tasks required to import sample data and build an index on your search service, based on that data. The indexer that populates the index will also include a skillset that translates one of the fields into another language, allowing users to search in a preferred language. The source data already has the description field translated into a handful of languages other than English, but you want to add one more to the search index.

    You should have already created an Azure AI Search service. Navigate to the newly deployed service, and on the "Overview" page, select "Import Data" to kick off the wizard. Let the wizard guide you through the process, including the following specific details and properties.

    • Import from the existing "Samples" provided by Microsoft and choose the Azure SQL dataset called "realestate-us-sample."
    • Under Add cognitive skills, Attach AI Services, confirm that the AI Services Resource Name is "Free (Limited enrichments)." This setting determines the compute resource that will be used to power the AI enrichment(s) you select. The free option lets you use non-billed resources instead of setting up and paying for compute and storage on an Azure AI multi-service resource.
    • Under Add Enrichments you should add a new enrichment using the description field as your source data field, and the enrichment you want to apply should translate the description field from English into Swedish and populate a new field, called translated_text. All other features on the Add cognitive skills tab can remain at their defaults.
    • Under Customize target index, familiarize yourself with the fields extracted from the source data, but don't change any of the configurations. Find the translated_text field you added as a part of the translation skillset and note that the analyzer is currently set to English. Set the analyzer to the appropriate language to improve handling of the translated text. Either the Microsoft or the Lucene analyzer is fine.
    • In the final step of the wizard, select Submit to create the index, the translation skillset, and the indexer, and to kick off the first indexer run.
    • When the indexer run is complete, navigate to the index populated by the indexer and note the number of JSON documents created and the storage size. There should be 20 documents, each containing the data for a single real estate property. The sample data source actually contains many more documents, but the indexer run intentionally stops after 20. This is because you chose "Free (Limited enrichments)" as your AI Services Resource in the wizard. Having only 20 documents is suitable for demo and testing purposes.
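    For readers curious about what the wizard produces, the sketch below builds JSON definitions roughly equivalent to the translation skill, the translated_text index field, and the indexer output field mapping described above. It is an approximation under stated assumptions, not the wizard's exact output: the skillset name is hypothetical, and "sv.microsoft" is one of the two Swedish analyzer choices (the other being "sv.lucene").

```python
import json


def build_translation_skill(source_field, target_name, to_language,
                            from_language="en"):
    """Build a Text Translation skill that reads one document field."""
    return {
        "@odata.type": "#Microsoft.Skills.Text.TranslationSkill",
        "context": "/document",
        "defaultFromLanguageCode": from_language,
        "defaultToLanguageCode": to_language,
        "inputs": [{"name": "text", "source": f"/document/{source_field}"}],
        "outputs": [{"name": "translatedText", "targetName": target_name}],
    }


def build_translated_field(name, analyzer):
    """Build a searchable string field that uses a language-specific analyzer."""
    return {
        "name": name,
        "type": "Edm.String",
        "searchable": True,
        "analyzer": analyzer,
    }


skillset = {
    "name": "realestate-translation-skillset",  # hypothetical name
    "skills": [build_translation_skill("description", "translated_text", "sv")],
}
index_field = build_translated_field("translated_text", "sv.microsoft")
# Maps the enriched value into the index field of the same name.
output_field_mapping = {
    "sourceFieldName": "/document/translated_text",
    "targetFieldName": "translated_text",
}

print(json.dumps(skillset, indent=2))
print(json.dumps(index_field, indent=2))
print(json.dumps(output_field_mapping, indent=2))
```

    Setting the analyzer to a Swedish one matters because the analyzer controls how the translated text is tokenized and normalized at both index and query time.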
    Tip: If the UI Indicates 0 Documents and 0 Bytes

    If the UI in the index screen appears to indicate that there are no documents, make sure the indexer has completed running. However, there may also be a quirk in the UI. You can also perform a quick query by putting an asterisk (*) in the search bar and running the query. If the query returns documents, the UI just hasn't caught up with the underlying data statistics; you can proceed with the next objective.
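    In the JSON view of the Search Explorer, the same check can be written as a query body that asks for a total count. The sketch below builds that body and shows how the count in a response could be interpreted. The sample response is a hand-written stand-in for illustration, not captured output.

```python
import json


def build_match_all_query():
    """Query body that matches every document and requests a total count."""
    return {"search": "*", "count": True}


def index_has_documents(response):
    """Return True if the search response reports at least one document."""
    return response.get("@odata.count", 0) > 0


print(json.dumps(build_match_all_query()))

# Hand-written stand-in for a response; a real one also lists the documents in "value".
sample_response = {"@odata.count": 20, "value": []}
print(index_has_documents(sample_response))
```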

  3. Challenge

    Search Index in Multiple Languages

    Context: For this objective, you should already be on the index page for your search index, which defaults to the Search Explorer pane.

    In this objective, you should feel free to explore the data in the description columns, using various terms in other languages, including those in the originally translated description fields you'll see at the top of each document, as well as the translated_text field that provides a description in Swedish, which you will see at the bottom of each document.

    To see all 20 documents, simply execute a search with nothing or a * in the search bar. Keep in mind that only the English description column is translated into other languages.

    A few search examples:
    • beach: strand (Swedish), playa (Spanish)
    • apartment, condo, condominium: lägenhet (Swedish), wohnung (German)

      Note that even with some near-synonyms provided in the form of "apartment, condo, condominium," the Swedish translation returned fewer results, and the German translation returned more results, compared to the English search. When it comes to language translation, there are frequently no one-to-one equivalents.
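    To compare languages more precisely, a query can be restricted to a single description field. The sketch below generates query bodies that could be pasted into the JSON view of the Search Explorer. The field names other than description and translated_text (description_es, description_de) are assumptions about the sample index and should be checked against your own index definition.

```python
import json

# Assumed mapping of language code to index field; verify against your index.
LANGUAGE_FIELDS = {
    "en": "description",
    "sv": "translated_text",
    "es": "description_es",
    "de": "description_de",
}


def build_language_query(terms, language):
    """Build a query body that searches only the field for one language."""
    return {
        "search": " ".join(terms),
        "searchFields": LANGUAGE_FIELDS[language],
        "searchMode": "any",  # match documents containing any of the terms
        "count": True,
    }


queries = [
    build_language_query(["apartment", "condo", "condominium"], "en"),
    build_language_query(["lägenhet"], "sv"),
    build_language_query(["wohnung"], "de"),
]
for query in queries:
    print(json.dumps(query, ensure_ascii=False))
```

    Comparing the count returned for each body makes the differences between languages easier to see than scanning the full documents.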

The Cloud Content team comprises subject matter experts hyper focused on services offered by the leading cloud vendors (AWS, GCP, and Azure), as well as cloud-related technologies such as Linux and DevOps. The team is thrilled to share their knowledge to help you build modern tech solutions from the ground up, secure and optimize your environments, and so much more!
