Implement Knowledge Mining in Azure AI Search

In this hands-on experience, you will get a chance to develop a knowledge mining solution as an extension of an Azure AI Search solution, including creating the search service, retrieving data from an external source, importing that data to generate a search index, and adding a simple skillset. You will then configure the skillset-enhanced data to not only appear in the search index, but also persist in a separate knowledge store for use by analytics teams or as a source for additional AI processing. You will validate your work by searching the data using the JSON query editor in the Search Explorer, as well as by browsing the knowledge store deployed to an Azure Storage account. In the process, you will also see, first-hand, how the indexer automatically converts comma-delimited data to JSON. All work takes place in the Azure portal, and no coding is required to create the resources and test your solution. However, note that you will need to be allowed to download a small CSV file into your local environment in order to complete the lab objectives.


Path Info

Level: Intermediate
Duration: 45m
Published: Feb 09, 2024


Table of Contents

  1. Challenge

    Create an Azure AI Search Service

    You should already be logged in to the Azure portal, using the credentials provided with the lab. When you first log in, you will land on the overview page for the resource group already deployed for you. Note the location/region of that resource group, as well as the Azure Storage account already deployed for you, with a name that starts with "labstorage...".

    Using the Azure portal, create an Azure AI Search service with the following configuration:

    • Create the service in the existing resource group, and in the same location as that resource group.
    • Use any valid name you choose.
    • Create the service in the Basic pricing tier. Do not choose the Free tier, as only one is allowed per subscription, and your lab environment is on a shared subscription with other students.
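
    The portal wizard handles everything here, but for reference, a minimal ARM template resource describing the same Basic-tier service might look roughly like the sketch below. The service name, location, replica/partition counts, and API version are illustrative placeholders, not values dictated by the lab.

      {
        "type": "Microsoft.Search/searchServices",
        "apiVersion": "2020-08-01",
        "name": "<your-search-service-name>",
        "location": "<same-region-as-the-resource-group>",
        "sku": {
          "name": "basic"
        },
        "properties": {
          "replicaCount": 1,
          "partitionCount": 1
        }
      }
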
  2. Challenge

    Import Data and Configure Knowledge Store

    In this objective, you will complete the tasks required to import sample data and build an index on your search service, based on that data. The indexer that populates the index will also include a simple skillset that identifies the language of the text in a specific data field. Additionally, by leveraging the ability to set up a knowledge store and pipe enhanced data to another location, in parallel with index creation, you will implement a basic knowledge mining solution that analysts can use outside of search use cases.

    Retrieve and store sample data
    • You should have already created an Azure AI Search service. Navigate to the newly deployed service, and on its Overview page, select the resource group link to return to the resource group that was set up for you (the landing page when you first logged in to the lab).
    • Navigate to the Azure Storage resource already created for you and navigate to the blob container called lab-container.
    • In a separate browser window, navigate to the GitHub repository link provided in the "Additional Information and Resources" section of this lab.
    • You should see a file called adoptable-cats.csv. Download that file to your local environment, and then return to the storage account in the Azure portal and upload the adoptable-cats.csv file to the blob container called lab-container.
    • Use the breadcrumb trail to return to the resource group Overview page. Navigate to the newly deployed Azure AI Search service.
    Import data and configure index and skillset

    On the "Overview" page for the search service, select "Import Data" to kick off the wizard. Let the wizard guide you through the process, including the following specific details and properties.

    • Choose to import data from Azure Blob Storage. Set up the new Data Source with any name you prefer, include all content and metadata, and set the parsing mode to Delimited text. Affirm that the first line contains a header and that the delimiter character is a comma.
    • Select the pre-existing storage account and lab-container. You will not need to specifically name the CSV file; data will be pulled from every valid file in the selected blob storage container.
    • Under Add cognitive skills, Attach AI Services, confirm that the AI Services Resource Name is "Free (Limited enrichments)." This setting determines the compute resource that will be used to power the AI enrichment(s) you select. The free option lets you use non-billed resources instead of setting up and paying for compute and storage on an Azure AI multi-service resource.
    • Under Add Enrichments, add a new enrichment using the behavior_2 field as your source data field. Do not use the default source data field; it must be set to behavior_2 for the enrichment to make sense.
    • The enrichment you want to apply should identify the language of the text stored in the behavior_2 field and populate a new field called language. (A sketch of the skill this generates appears after the tip at the end of this objective.)
    • Under Save enrichments to a knowledge store, create an Azure table projection. Select Choose an existing connection and choose the same Azure Storage account where you uploaded the CSV file. Then add a new container called projections and select it.

      Selecting a blob container is a quirk of the wizard, because you are really just choosing the storage account. The knowledge store will be in the form of a table, and tables are not stored in a blob container.

    • All other features on the Add cognitive skills tab can remain at their defaults.
    • Under Customize target index, familiarize yourself with the fields extracted from the source data. For simplicity, make all fields Retrievable and Searchable, including the language field.
    • Move on to Create an indexer and select Submit to create the index, the skillset, and the indexer, which will also kick off the first run of the indexer.
    • When the indexer run is complete, navigate to the index populated by the indexer and note the number of JSON documents created and the storage size. There should be 6 documents, each containing the data related to a single cat for adoption.
    Tip: If the UI Indicates 0 Documents and 0 Bytes

    If the index screen appears to indicate that there are no documents, first make sure the indexer has completed running. The document and storage statistics can also lag behind the underlying data, so perform a quick query by putting an asterisk (*) in the search bar and running it. If the query returns documents, the UI just hasn't caught up with the data statistics, and you can proceed to the next objective.
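
    For reference, the language detection enrichment you configured in the wizard ends up in the skillset JSON as a skill roughly like the sketch below. The wizard generates its own skill name and surrounding skillset structure, so treat this as illustrative; behavior_2 and language come from the enrichment settings above.

      {
        "@odata.type": "#Microsoft.Skills.Text.LanguageDetectionSkill",
        "description": "Detect the language of the behavior_2 text",
        "context": "/document",
        "inputs": [
          {
            "name": "text",
            "source": "/document/behavior_2"
          }
        ],
        "outputs": [
          {
            "name": "languageCode",
            "targetName": "language"
          }
        ]
      }
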

  3. Challenge

    Verify Data in Both Index and Knowledge Store

    Context: For this objective, you should already be on the index page for your search index, which defaults to the Search Explorer pane.

    Browse the index data with Search Explorer

    Using the Search Explorer, browse the index to examine the data imported from the data source, as well as the language field that stores the language detected from the behavior_2 column. The value should correctly identify the text in that column as en or Español.

    Tip: To see all 6 documents, simply execute a search with nothing or a * in the search bar.
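
    If you switch Search Explorer to its JSON view, an equivalent query that returns every document along with a count looks like the following. The select list is optional and only needs to name fields that exist in your index; behavior_2 and language are the ones used in this lab.

      {
        "search": "*",
        "count": true,
        "select": "behavior_2, language"
      }
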

    Browse the Knowledge Store in Azure Storage
    1. Click on Home in the breadcrumb trail at the top of the screen.
    2. Navigate to the Azure Storage account where you stored the CSV file and where you projected the source data in parallel with creating a search index.
    3. Use the Storage browser to open the table and browse the data. Note that the language field is named slightly differently in the projected table data.
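
    Behind the scenes, the knowledge store you configured is defined in the skillset JSON as a knowledgeStore section along these lines. The table name, key name, and source path shown here are illustrative; the wizard generates its own names, so expect yours to differ.

      "knowledgeStore": {
        "storageConnectionString": "<connection string for the labstorage account>",
        "projections": [
          {
            "tables": [
              {
                "tableName": "<wizard-generated-table-name>",
                "generatedKeyName": "<wizard-generated-key-name>",
                "source": "/document/tableprojection"
              }
            ],
            "objects": [],
            "files": []
          }
        ]
      }
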
    Manage the Columns in the Table Projection

    Suppose you want fewer columns in the data projected to the storage account.

    1. Navigate back to the Azure AI Search service you created and select Skillsets from the left menu. Open the skillset you created and edit the JSON to remove a few fields from the inputs array under the #Microsoft.Skills.Util.ShaperSkill. (A sketch of that skill appears after these steps.)

      For example, remove the gender and age fields. In reality, you are more likely to remove some of the metadata fields, but gender and age are easier to remember when checking your work.

    2. Save your altered skillset.
    3. Using the left menu, navigate to your indexer and Reset it to clear out the index.
    4. Run the indexer again and refresh the screen to confirm a successful run.
    5. Return to the Storage browser to view the projected table and note that the columns you removed in the skillset definition are no longer present.
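
    As a reference for step 1, the Shaper skill in the wizard-generated skillset has an inputs array along the lines of the sketch below; removing gender and age means deleting those two entries. The exact set of inputs and the output targetName are whatever the wizard generated for your skillset, so expect some differences.

      {
        "@odata.type": "#Microsoft.Skills.Util.ShaperSkill",
        "context": "/document",
        "inputs": [
          {
            "name": "behavior_2",
            "source": "/document/behavior_2"
          },
          {
            "name": "language",
            "source": "/document/language"
          },
          {
            "name": "gender",
            "source": "/document/gender"
          },
          {
            "name": "age",
            "source": "/document/age"
          }
        ],
        "outputs": [
          {
            "name": "output",
            "targetName": "tableprojection"
          }
        ]
      }
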

