azure-search-documents

Microsoft Azure Cognitive Search Client Library for Python

Rank: #1598 | Downloads: 5,854,508 (30 days) | Stars: 5,497 | Forks: 3,248

Description

Azure AI Search client library for Python

Azure AI Search (formerly known as "Azure Cognitive Search") is an AI-powered information retrieval platform that helps developers build rich search experiences and generative AI apps that combine large language models with enterprise data.

Azure AI Search is well suited for the following application scenarios:

  • Consolidate varied content types into a single searchable index. To populate an index, you can push JSON documents that contain your content, or if your data is already in Azure, create an indexer to pull in data automatically.
  • Attach skillsets to an indexer to create searchable content from images and unstructured documents. A skillset leverages APIs from Azure AI Services for built-in OCR, entity recognition, key phrase extraction, language detection, text translation, and sentiment analysis. You can also add custom skills to integrate external processing of your content during data ingestion.
  • In a search client application, implement query logic and user experiences similar to commercial web search engines and chat-style apps.

Use the azure-search-documents client library to:

  • Submit queries using vector, keyword, and hybrid query forms.
  • Implement filtered queries for metadata, geospatial search, faceted navigation, or to narrow results based on filter criteria.
  • Create and manage search indexes.
  • Upload and update documents in the search index.
  • Create and manage indexers that pull data from Azure into an index.
  • Create and manage skillsets that add AI enrichment to data ingestion.
  • Create and manage analyzers for advanced text analysis or multi-lingual content.
  • Optimize results through semantic ranking and scoring profiles to factor in business logic or freshness.
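As a rough illustration of the "upload and update documents" item, the sketch below wraps SearchClient.upload_documents and reports which documents failed to index. It assumes a SearchClient built as described under "Create a SearchClient"; the helper name upload_and_report and the hotelId key field are hypothetical and not part of the library.

```python
from typing import Any, Dict, List


def upload_and_report(client: Any, documents: List[Dict[str, Any]]) -> List[str]:
    """Upload documents and return the keys of any that failed to index.

    `client` is expected to be an azure.search.documents.SearchClient
    (or any object exposing a compatible upload_documents method).
    """
    # upload_documents returns one IndexingResult per submitted document,
    # each carrying the document key and a succeeded flag.
    results = client.upload_documents(documents=documents)
    return [result.key for result in results if not result.succeeded]


# Hypothetical usage, assuming an index whose key field is "hotelId":
#   failed = upload_and_report(search_client, [{"hotelId": "1", "hotelName": "Example Inn"}])
#   if failed:
#       print("Failed keys:", failed)
```

Returning only the failed keys keeps the caller's retry logic simple. An application that needs the error details could inspect each result directly instead.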

Source code | Package (PyPI) | Package (Conda) | API reference documentation | Product documentation | Samples

Getting started

Install the package

Install the Azure AI Search client library for Python with pip:

pip install azure-search-documents

Prerequisites

  • Python 3.8 or later is required to use this package.
  • You need an Azure subscription and an Azure AI Search service to use this package.

To create a new search service, you can use the Azure portal, Azure PowerShell, or the Azure CLI.

az search service create --name <mysearch> --resource-group <mysearch-rg> --sku free --location westus

See choosing a pricing tier for more information about available options.

Authenticate the client

To interact with the search service, you'll need to create an instance of the appropriate client class: SearchClient for searching indexed documents, SearchIndexClient for managing indexes, or SearchIndexerClient for crawling data sources and loading search documents into an index. To instantiate a client object, you'll need an endpoint and either Azure role assignments or an API key. Refer to the documentation for more information on the authentication approaches the search service supports.

Get an API Key

An API key can be an easier approach to start with because it doesn't require pre-existing role assignments.

You can get the endpoint and an API key from the Search service in the Azure portal. Please refer to the documentation for instructions on how to get an API key.

Alternatively, you can use the following Azure CLI command to retrieve the API key from the Search service:

az search admin-key show --service-name <mysearch> --resource-group <mysearch-rg>

There are two types of keys used to access your search service: admin (read-write) and query (read-only) keys. Restricting access and operations in client apps is essential to safeguarding the search assets on your service. Always use a query key rather than an admin key for any query originating from a client app.

Note: The example Azure CLI snippet above retrieves an admin key so it's easier to get started exploring APIs, but it should be managed carefully.

Create a SearchClient

To instantiate the SearchClient, you'll need the endpoint, API key and index name:

import os

from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

# Read the connection settings from environment variables.
service_endpoint = os.environ["AZURE_SEARCH_SERVICE_ENDPOINT"]
index_name = os.environ["AZURE_SEARCH_INDEX_NAME"]
key = os.environ["AZURE_SEARCH_API_KEY"]

search_client = SearchClient(service_endpoint, index_name, AzureKeyCredential(key))
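Once a SearchClient exists, a keyword query with an optional filter might look like the sketch below. The helper name top_hits is hypothetical, and so are the hotelName and rating fields in the usage comment; substitute fields from your own index schema.

```python
from typing import Any, Dict, List, Optional


def top_hits(
    client: Any,
    text: str,
    odata_filter: Optional[str] = None,
    top: int = 5,
) -> List[Dict[str, Any]]:
    """Run a keyword query and return the hits as plain dictionaries.

    `client` is expected to be an azure.search.documents.SearchClient.
    """
    # search_text, filter, and top are keyword arguments of SearchClient.search;
    # filter takes an OData expression such as "rating ge 4".
    results = client.search(search_text=text, filter=odata_filter, top=top)
    # Each hit is dict-like; the relevance score is stored under "@search.score".
    return [dict(hit) for hit in results]


# Hypothetical usage, assuming the index has "hotelName" and "rating" fields:
#   for hit in top_hits(search_client, "wifi", odata_filter="rating ge 4"):
#       print(hit["@search.score"], hit.get("hotelName"))
```

Materializing the results into a list is convenient for small result sets. For large ones, iterating over the value returned by search directly avoids holding every hit in memory.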

Create a client using Microsoft Entra ID authentication

You can also create a SearchClient, SearchIndexClient, or SearchIndexerClient using Microsoft Entra ID authentication. Your user or service principal must be assigned the "Search Index Data Reader" role. Using the DefaultAzureCredential, you can authenticate a service using Managed Identity or a service principal, authenticate as a developer working on an application, and more, all without changing code. Please refer to the documentation for instructions on how to connect to Azure AI Search using Azure role-based access control (Azure RBAC).

Before you can use the DefaultAzureCredential, or any credential type from azure.identity, you'll first need to install the azure-identity package.

To use DefaultAzureCredential with a client ID and secret, you'll need to set the AZURE_TENANT_ID, AZURE_CLIENT_ID, and AZURE_CLIENT_SECRET environment variables; alternatively, you can pass those values to the ClientSecretCredential, which is also in azure.identity.
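The explicit ClientSecretCredential route could be sketched as follows. The helper names missing_service_principal_vars and make_client_secret_credential are hypothetical; the sketch checks the three variables first so that a misconfiguration fails with a readable message instead of surfacing later at request time.

```python
import os
from typing import List, Mapping

REQUIRED_VARS = ("AZURE_TENANT_ID", "AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET")


def missing_service_principal_vars(env: Mapping[str, str]) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


def make_client_secret_credential(env: Mapping[str, str] = os.environ):
    """Build a ClientSecretCredential from explicit values in `env`."""
    missing = missing_service_principal_vars(env)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    # Imported lazily so the check above works even without azure-identity installed.
    from azure.identity import ClientSecretCredential

    return ClientSecretCredential(
        tenant_id=env["AZURE_TENANT_ID"],
        client_id=env["AZURE_CLIENT_ID"],
        client_secret=env["AZURE_CLIENT_SECRET"],
    )
```

The returned credential can be passed to SearchClient in place of DefaultAzureCredential.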

Make sure you import DefaultAzureCredential from azure.identity at the top of your source file:

import os

from azure.identity import DefaultAzureCredential
from azure.search.documents import SearchClient

service_endpoint = os.getenv("AZURE_SEARCH_SERVICE_ENDPOINT")
index_name = os.getenv("AZURE_SEARCH_INDEX_NAME")
credential = DefaultAzureCredential()

search_client = SearchClient(service_endpoint, index_name, credential)
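An application that should work with either authentication approach could choose the credential at startup. The following is a minimal sketch, not a library feature: auth_mode and make_search_client are hypothetical helpers, and the rule "use the API key if AZURE_SEARCH_API_KEY is set, otherwise fall back to Microsoft Entra ID" is an assumption made for illustration.

```python
import os
from typing import Mapping


def auth_mode(env: Mapping[str, str]) -> str:
    """Return "api_key" when a key is configured, otherwise "entra_id"."""
    return "api_key" if env.get("AZURE_SEARCH_API_KEY") else "entra_id"


def make_search_client(env: Mapping[str, str] = os.environ):
    """Create a SearchClient using whichever credential the environment allows."""
    # Lazy imports keep auth_mode usable without the Azure packages installed.
    from azure.search.documents import SearchClient

    if auth_mode(env) == "api_key":
        from azure.core.credentials import AzureKeyCredential

        credential = AzureKeyCredential(env["AZURE_SEARCH_API_KEY"])
    else:
        from azure.identity import DefaultAzureCredential

        credential = DefaultAzureCredential()

    return SearchClient(
        env["AZURE_SEARCH_SERVICE_ENDPOINT"],
        env["AZURE_SEARCH_INDEX_NAME"],
        credential,
    )
```

Keeping the decision in one small function makes it easy to switch to role-based access later by removing the key from the environment.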

Key concepts

An Azure AI Search service contains one or more indexes that provide persistent storage of searchable data in the form of JSON documents. (If you're brand new to search, you can make a very rough analogy between indexes and database tables.) The azure-search-documents client library exposes operations on these resources through three main client types.

Azure AI Search provides two powerful features: semantic ranking and vector search.

Semantic ranking enhances the