Implementing Fuzzy Search Using Azure Cognitive Search: A Step by Step Guide

Azure Cognitive Search supports fuzzy search, a type of query that compensates for typos and misspelled terms in the input string. It does this by scanning for terms having a similar composition. Expanding search to cover near-matches has the effect of auto-correcting a typo when the discrepancy is just a few misplaced characters.
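
To make "similar composition" concrete, here is a minimal sketch of an edit-distance function. It counts the insertions, deletions, substitutions, and adjacent swaps needed to turn one term into another (the optimal string alignment variant of Damerau-Levenshtein distance). The function name edit_distance is hypothetical, and the code illustrates the idea only; it is not the service's actual implementation.

```python
def edit_distance(a: str, b: str) -> int:
    """Count the single-character edits (insert, delete, substitute,
    or swap of two adjacent characters) needed to turn a into b."""
    rows, cols = len(a) + 1, len(b) + 1
    # d[i][j] holds the distance between a[:i] and b[:j].
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete every character of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert every character of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Swap of two adjacent characters counts as one edit.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]


if __name__ == "__main__":
    for candidate in ("beth", "both", "bath", "bother"):
        print(candidate, edit_distance("beth", candidate))
```

A term within a small edit distance of the query term is treated as a near-match, which is why a single mistyped character can still find the intended word.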

Every document in a search result set is assigned a relevance score. The function of the relevance score is to rank higher those documents that best answer a user question as expressed by the search query. The score is computed based on statistical properties of terms that matched. At the core of the scoring formula is TF/IDF (term frequency-inverse document frequency). In queries containing rare and common terms, TF/IDF promotes results containing the rare term.
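
The effect of rare versus common terms can be sketched with a textbook TF/IDF calculation. The function name tf_idf_scores, the sample documents, and the exact weighting (relative term frequency multiplied by log(N/df) + 1) are assumptions chosen for illustration; the service's real scoring formula differs in its details.

```python
import math
from collections import Counter


def tf_idf_scores(query_terms, documents):
    """Return one textbook TF/IDF score per document for the given query terms."""
    tokenized = [doc.lower().split() for doc in documents]
    total_docs = len(tokenized)
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        score = 0.0
        for term in query_terms:
            # Document frequency: how many documents contain the term.
            doc_freq = sum(1 for other in tokenized if term in other)
            if doc_freq == 0 or not tokens:
                continue
            term_freq = counts[term] / len(tokens)
            inverse_doc_freq = math.log(total_docs / doc_freq) + 1.0
            score += term_freq * inverse_doc_freq
        scores.append(score)
    return scores


if __name__ == "__main__":
    docs = [
        "the hotel near the beach",
        "the hotel with the spa",
        "the motel near the road",
    ]
    # "the" appears everywhere (common); "spa" appears in one document (rare).
    for doc, score in zip(docs, tf_idf_scores(["the", "spa"], docs)):
        print(f"{score:.3f}  {doc}")
```

In this sample, the only document containing the rare term "spa" receives the highest score, while the common term "the" contributes the same small amount to every document.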

Architecturally, a search service sits in between the external data stores that contain your un-indexed data, and your client app that sends query requests to a search index and handles the response.

In this blog we will cover how to:

  • Create Storage Account and upload data into Azure Blob Storage
  • Create Azure Cognitive Search and Integrate with Azure Blob Storage account
  • Use Postman to fuzzy search and retrieve data from Azure Search Index

Below is the step-by-step guide to achieve fuzzy search capability over your data.

Create Storage Account and upload data into Azure Blob Storage

1. Create a storage account from the Azure portal: in the Azure Marketplace, search for Storage Account and select Create.

2. Choose the subscription and resource group in which the storage account will be deployed. Enter a unique storage account name and choose the location nearest to where it will be used. To keep costs low, choose Standard performance and LRS (locally redundant storage) redundancy.

3. In the storage account, go to Containers and add a container. Enter a name for the container and set the public access level to Blob.

4. The container is created.

5. Open the container, choose the Upload option, and upload the data.json file.

The file should contain a JSON array of records so that the index is able to consume the data.

For reference, download the provided data.json file.
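
If you want to build your own file instead, the sketch below shows the JSON array shape. The records and their field names (id, name, category) are hypothetical placeholders and are not the contents of the provided data.json.

```python
import json

# Hypothetical sample records; replace them with your own data.
records = [
    {"id": "1", "name": "Both", "category": "sample"},
    {"id": "2", "name": "Bath", "category": "sample"},
    {"id": "3", "name": "Berth", "category": "sample"},
]


def to_json_array(items):
    """Serialize a list of flat records as one top-level JSON array."""
    if not isinstance(items, list) or not all(isinstance(item, dict) for item in items):
        raise ValueError("expected a list of dictionaries")
    return json.dumps(items, indent=2)


payload = to_json_array(records)

if __name__ == "__main__":
    # Prints text that starts with "[" and ends with "]"; save it as data.json.
    print(payload)
```

Each object in the array becomes one candidate document, and each key becomes a candidate field in the index.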

We have now created a storage account and uploaded data to Azure Blob Storage. Next, we will look at how to access this data.

Create Azure Cognitive Search and Integrate with Azure Blob Storage account

1. Create an Azure Cognitive Search service from the Azure portal: in the Azure Marketplace, search for Azure Cognitive Search and select Create.

2. Choose the subscription and resource group in which the service will be deployed. Enter a unique service name and choose the location nearest to where it will be used. For this walkthrough, the Free pricing tier is sufficient.

3. The Free tier comes with limited quotas for indexes and data storage. If we need more indexes or data storage, we can change the pricing tier.

4. Import data into an Azure Search index: go to the search service you created and click Import data.

5. Connect to your data. Fill in the required fields.

For the connection string, choose an existing connection, then select the storage account and container that hold the data.json file uploaded in the earlier steps.

6. Move to the Customize target index section. The data is read and its fields are extracted into the index definition.

7. Select which fields should be retrievable, filterable, sortable, facetable, or searchable, and choose the analyzer. The default is Standard Lucene.

8. Move to the next step, create an indexer, and click Create.
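
For readers who prefer scripting over the portal wizard, the sketch below builds the kind of JSON definitions that correspond to these steps: a blob data source and an indexer that parses the file as a JSON array. The helper names and all resource names are hypothetical, and the payload shapes reflect my understanding of the service's REST API, so verify them against the current reference before use. The code only constructs the payloads; it does not call the service.

```python
import json


def build_blob_datasource(name, connection_string, container):
    """Definition of a data source that points at a blob container."""
    return {
        "name": name,
        "type": "azureblob",
        "credentials": {"connectionString": connection_string},
        "container": {"name": container},
    }


def build_json_array_indexer(name, datasource_name, index_name):
    """Definition of an indexer that treats each array element as one document."""
    return {
        "name": name,
        "dataSourceName": datasource_name,
        "targetIndexName": index_name,
        "parameters": {"configuration": {"parsingMode": "jsonArray"}},
    }


if __name__ == "__main__":
    # All names and the connection string below are placeholders.
    datasource = build_blob_datasource(
        "demo-datasource", "<storage-connection-string>", "demo-container"
    )
    indexer = build_json_array_indexer("demo-indexer", "demo-datasource", "demo-index")
    print(json.dumps(datasource, indent=2))
    print(json.dumps(indexer, indent=2))
```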

After the import completes, the Usage details of the search service reflect the data imported into the index.

We have now created an index and connected it to the data source. Next, we will look at how to query Azure Cognitive Search from other applications.

Use Postman to fuzzy search and retrieve data from Azure Search Index

1. Go to Indexes and click the index you created.

3. An api-key must be provided in the request header for authentication. Go to Keys and copy the primary admin key.

4. Open Postman. To query the data, choose the HTTP GET method, paste the request URL, and add the api-key header.
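
The same request can be assembled in code. The sketch below builds (but does not send) a GET request with the api-key header. The function name build_search_request, the service and index names, and the api-version value 2020-06-30 are assumptions for illustration; use the request URL and key from your own service.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_search_request(service, index, api_key, search_text, api_version="2020-06-30"):
    """Build a GET request against the documents endpoint of a search index."""
    query = urlencode({"api-version": api_version, "search": search_text})
    url = f"https://{service}.search.windows.net/indexes/{index}/docs?{query}"
    # The key travels in the api-key header, as in the Postman setup.
    return Request(url, headers={"api-key": api_key}, method="GET")


if __name__ == "__main__":
    # Placeholder values; nothing is sent over the network here.
    request = build_search_request("my-service", "my-index", "<api-key>", "Beth")
    print(request.get_method(), request.full_url)
```

Passing the request object to urllib.request.urlopen would send it; that step is left out so the sketch runs without a live service.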

5. Now we can search the data by providing the search text in the query parameters. Let’s search for Beth. The query returns no results because this exact term does not appear in our data. This shows why fuzzy search is helpful when a search term is misspelled.

6. For fuzzy search:

  • Set the full Lucene parser on the query (queryType=full).
  • Append the tilde (~) operator to the end of a whole term (search=<string>~). For example, Beth~1 runs a fuzzy search on the term Beth with an edit distance of 1, so the data is searched for words such as Both, Bath, and Beth.

7. The query now returns data. The default edit distance is 2. A value of ~0 means no expansion (only the exact term is considered a match), while ~1 allows one degree of difference, or one edit. Increase or decrease the edit distance to suit your use case.
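
The fuzzy query parameters can also be generated programmatically. This is a minimal sketch: the helper names fuzzy_term and fuzzy_query_string are hypothetical, the api-version value is an assumption, and the 0 to 2 range check reflects the distances discussed above.

```python
from urllib.parse import urlencode


def fuzzy_term(term, distance=None):
    """Append the tilde operator, optionally with an explicit edit distance."""
    if distance is None:
        return f"{term}~"  # the service applies its default distance
    if distance not in (0, 1, 2):
        raise ValueError("edit distance must be 0, 1, or 2")
    return f"{term}~{distance}"


def fuzzy_query_string(term, distance=None, api_version="2020-06-30"):
    """Build the query string for a fuzzy search using the full Lucene parser."""
    return urlencode(
        {
            "api-version": api_version,
            "queryType": "full",  # the tilde operator requires the full parser
            "search": fuzzy_term(term, distance),
        }
    )


if __name__ == "__main__":
    print(fuzzy_query_string("Beth", 1))
```

Append the resulting string after the ? in the request URL, or enter the same three parameters in Postman's query parameter table.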

We have seen how to run fuzzy searches over data using the free tier of Azure Cognitive Search. I hope this blog was helpful.

Karan Singh
