azure-cosmos
Microsoft Azure Cosmos Client Library for Python
Description
Azure Cosmos DB SQL API client library for Python
Disclaimer
Support for Python 2.7 in Azure SDK Python packages ended on 01 January 2022. For more information or to ask questions, see https://github.com/Azure/azure-sdk-for-python/issues/20691
Azure Cosmos DB is a globally distributed, multi-model database service that supports document, key-value, wide-column, and graph databases.
Use the Azure Cosmos DB SQL API SDK for Python to manage databases and the JSON documents they contain in this NoSQL database service. High level capabilities are:
- Create Cosmos DB databases and modify their settings
- Create and modify containers to store collections of JSON documents
- Create, read, update, and delete the items (JSON documents) in your containers
- Query the documents in your database using SQL-like syntax
[SDK source code][source_code] | [Package (PyPI)][cosmos_pypi] | Package (Conda) | [API reference documentation][ref_cosmos_sdk] | [Product documentation][cosmos_docs] | [Samples][cosmos_samples]
This SDK is used for the SQL API. For all other APIs, please check the Azure Cosmos DB documentation to evaluate the best SDK for your project.
Getting started
Important update on Python 2.x Support
New releases of this SDK won't support Python 2.x starting January 1st, 2022. Please check the CHANGELOG for more information.
Prerequisites
- Azure subscription - [Create a free account][azure_sub]
- Azure [Cosmos DB account][cosmos_account] - SQL API
- [Python 3.8+][python]
If you need a Cosmos DB SQL API account, you can create one with this [Azure CLI][azure_cli] command:
az cosmosdb create --resource-group <resource-group-name> --name <cosmos-account-name>
Install the package
pip install azure-cosmos
Configure a virtual environment (optional)
Although not required, you can keep your base system and Azure SDK environments isolated from one another if you use a virtual environment. Execute the following commands to configure and then enter a virtual environment with [venv][venv]:
python3 -m venv azure-cosmosdb-sdk-environment
source azure-cosmosdb-sdk-environment/bin/activate
Authenticate the client
Interaction with Cosmos DB starts with an instance of the [CosmosClient][ref_cosmosclient] class. You need an account, its URI, and one of its account keys to instantiate the client object.
Use the Azure CLI snippet below to populate two environment variables with the database account URI and its primary master key (you can also find these values in the Azure portal). The snippet is formatted for the Bash shell.
RES_GROUP=<resource-group-name>
ACCT_NAME=<cosmos-db-account-name>
export ACCOUNT_URI=$(az cosmosdb show --resource-group $RES_GROUP --name $ACCT_NAME --query documentEndpoint --output tsv)
export ACCOUNT_KEY=$(az cosmosdb list-keys --resource-group $RES_GROUP --name $ACCT_NAME --query primaryMasterKey --output tsv)
Create the client
Once you've populated the ACCOUNT_URI and ACCOUNT_KEY environment variables, you can create the [CosmosClient][ref_cosmosclient].
from azure.cosmos import CosmosClient
import os
URL = os.environ['ACCOUNT_URI']
KEY = os.environ['ACCOUNT_KEY']
client = CosmosClient(URL, credential=KEY)
AAD Authentication
You can also authenticate a client utilizing your service principal's AAD credentials and the azure identity package. You can directly pass in the credentials information to ClientSecretCredential, or use the DefaultAzureCredential:
from azure.cosmos import CosmosClient
from azure.identity import ClientSecretCredential, DefaultAzureCredential
import os
url = os.environ['ACCOUNT_URI']
tenant_id = os.environ['TENANT_ID']
client_id = os.environ['CLIENT_ID']
client_secret = os.environ['CLIENT_SECRET']
# Using ClientSecretCredential
aad_credentials = ClientSecretCredential(
tenant_id=tenant_id,
client_id=client_id,
client_secret=client_secret)
# Using DefaultAzureCredential (recommended)
aad_credentials = DefaultAzureCredential()
client = CosmosClient(url, aad_credentials)
Always ensure that the managed identity you use for AAD authentication has readMetadata permissions.
More information on how to set up AAD authentication: Set up RBAC for AAD authentication
More information on allowed operations for AAD authenticated clients: RBAC Permission Model
Preferred Locations
To enable multi-region support in CosmosClient, set the preferred_locations parameter.
By default, all writes and reads go to the dedicated write region unless specified otherwise.
The preferred_locations parameter accepts a list of regions for read requests.
Requests are sent to the first region in the list, and if it fails, they move to the next region.
For example, to set West US as the read region, and Central US as the backup read region, the code would look like this:
from azure.cosmos import CosmosClient
import os
URL = os.environ['ACCOUNT_URI']
KEY = os.environ['ACCOUNT_KEY']
client = CosmosClient(URL, credential=KEY, preferred_locations=["West US", "Central US"])
Also note that if all regions listed in preferred locations fail, read requests are sent to the main write region.
For example, if the write region is East US, then preferred_locations=["West US", "Central US"] is equivalent to preferred_locations=["West US", "Central US", "East US"], because the client sends requests to the write region when the preferred locations fail.
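The fallback order described above can be modeled in a few lines. This is a minimal sketch of the documented behavior, not the SDK's internal routing code; the helper name `effective_read_regions` is hypothetical and exists only for illustration.

```python
from typing import List


def effective_read_regions(preferred_locations: List[str], write_region: str) -> List[str]:
    """Return the order in which regions would be tried for read requests.

    Hypothetical helper: it models the behavior described in this section,
    where preferred regions are tried in order and the write region acts
    as the final fallback.
    """
    regions = list(preferred_locations)
    # The write region is the implicit last resort unless already listed.
    if write_region not in regions:
        regions.append(write_region)
    return regions


order = effective_read_regions(["West US", "Central US"], "East US")
print(order)
```

Listing the write region explicitly at the end of preferred_locations therefore should not change the order in which regions are tried.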
Key concepts
Once you've initialized a [CosmosClient][ref_cosmosclient], you can interact with the primary resource types in Cosmos DB:
- [Database][ref_database]: A Cosmos DB account can contain multiple databases. When you create a database, you specify the API you'd like to use when interacting with its documents: SQL, MongoDB, Gremlin, Cassandra, or Azure Table. Use the [DatabaseProxy][ref_database] object to manage its containers.
- [Container][ref_container]: A container is a collection of JSON documents. You create (insert), read, update, and delete items in a container by using methods on the [ContainerProxy][ref_container] object.
- Item: An Item is the dictionary-like representation of a JSON document stored in a container. Each Item you add to a container must include an id key with a value that uniquely identifies the item within the container.
For more information about these resources, see [Working with Azure Cosmos databases, containers and items][cosmos_resources].
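Because every item needs an id key, it can help to normalize documents before you write them. The sketch below is one possible approach, not part of the SDK: `make_item` is a hypothetical helper, and generating a UUID when no id is supplied is an assumption about what your application wants.

```python
import uuid
from typing import Any, Dict, Optional


def make_item(fields: Dict[str, Any], item_id: Optional[str] = None) -> Dict[str, Any]:
    """Return a copy of `fields` that is guaranteed to carry a string `id` key.

    Hypothetical helper for illustration only.
    """
    item = dict(fields)
    if item_id is not None:
        item["id"] = str(item_id)          # explicit id wins
    elif "id" in item:
        item["id"] = str(item["id"])       # coerce an existing id to a string
    else:
        item["id"] = str(uuid.uuid4())     # assumption: a random id is acceptable
    return item


item = make_item({"category": "books", "title": "Dune"}, item_id="book-1")
print(item)

# With a real ContainerProxy named `container`, the item could then be stored with:
#   container.upsert_item(body=item)
```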
How to use enable_cross_partition_query
The keyword argument enable_cross_partition_query accepts two options: None (default) or True.
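In practice, a query scoped to a single partition key value can pass partition_key instead, while a query that spans partitions sets the flag to True. The following sketch builds the keyword arguments for ContainerProxy.query_items under that assumption; `query_options` is a hypothetical helper, not an SDK function.

```python
from typing import Any, Dict, Optional


def query_options(partition_key: Optional[Any] = None) -> Dict[str, Any]:
    """Build keyword arguments for `query_items` (hypothetical helper).

    If a partition key value is known, scope the query to it; otherwise
    opt in to a cross-partition query.
    """
    if partition_key is not None:
        return {"partition_key": partition_key}
    return {"enable_cross_partition_query": True}


print(query_options("books"))
print(query_options())

# With a real ContainerProxy named `container`:
#   container.query_items(query="SELECT * FROM c", **query_options("books"))
#   container.query_items(query="SELECT * FROM c", **query_options())
```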
Note on using queries by id
When using queries that try to find items based on an id value, always make sure you are passing in a string type variable. Azure Cosmos DB only allows string id values and if you use any other datatype, this SDK will return no results and no error messages.
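One way to guard against this is to coerce the id to a string before it reaches the query. The sketch below builds a parameterized query for that purpose; `build_id_query` is a hypothetical helper, and the alias `c` in the query text is an arbitrary choice.

```python
from typing import Any, Dict


def build_id_query(item_id: Any) -> Dict[str, Any]:
    """Return `query` and `parameters` arguments for a lookup by id.

    Hypothetical helper: the id is always converted to a string, because
    a non-string id would silently match nothing.
    """
    return {
        "query": "SELECT * FROM c WHERE c.id = @id",
        "parameters": [{"name": "@id", "value": str(item_id)}],
    }


args = build_id_query(42)
print(args)

# With a real ContainerProxy named `container`:
#   items = list(container.query_items(**args, enable_cross_partition_query=True))
```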
Note on client consistency levels
As of release version 4.3.0b3, if a user does not pass in an explicit consistency level to their client initialization,
their client will use their database account's default level. Previously, the default was being set to Session consistency.
To keep the previous behavior, pass the consistency level explicitly when you initialize the client:
from azure.cosmos import CosmosClient
import os
URL = os.environ['ACCOUNT_URI']
KEY = os.environ['ACCOUNT_KEY']
client = CosmosClient(URL, credential=KEY, consistency_level='Session')
Limitations
Currently, the features below are not supported. For alternative options, see the Workarounds section below.
Data Plane Limitations:
- Group By queries
- Queries with COUNT from a DISTINCT subquery: SELECT COUNT (1) FROM (SELECT DISTINCT C.ID FROM C)
- Direct TCP Mode access
- Continuation token support for aggregate cross-partition queries like sorting, counting, and distinct. Streamable queries like SELECT * FROM WHERE do support continuation tokens.
- Change Feed: Processor
- Change Feed: Read multiple partitions key values
- Cross-partition ORDER BY for mixed types
- Enabling diagnostics for async query-type methods
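For the streamable queries mentioned in the list above, a continuation token can be used to resume paging. The sketch below assumes the iterable returned by query_items exposes by_page() and that the resulting page iterator carries a continuation_token attribute; `fetch_page` is a hypothetical helper and the page size is an arbitrary choice.

```python
from typing import Any, Dict, List, Optional, Tuple


def fetch_page(
    container: Any,
    query: str,
    continuation_token: Optional[str] = None,
    page_size: int = 10,
) -> Tuple[List[Dict[str, Any]], Optional[str]]:
    """Fetch one page of results and the token needed to fetch the next.

    Hypothetical helper; `container` is expected to behave like a
    ContainerProxy.
    """
    pager = container.query_items(
        query=query,
        enable_cross_partition_query=True,
        max_item_count=page_size,
    ).by_page(continuation_token)
    # An exhausted pager yields no page; fall back to an empty one.
    page = list(next(pager, []))
    return page, pager.continuation_token


# With a real ContainerProxy named `container`:
#   page, token = fetch_page(container, "SELECT * FROM c")
#   next_page, token = fetch_page(container, "SELECT * FROM c", continuation_token=token)
```

Store the returned token between calls; passing it back resumes the query where the previous page ended.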
Control Plane Limitations:
- Get CollectionSizeUsage, DatabaseUsage, and DocumentUsage metrics
- Get the connection string
- Get the minimum RU/s of a container
Workarounds
Control Plane Limitations Workaround
Typically, you can use the Azure portal, the Azure Cosmos DB Resource Provider REST API, the Azure CLI, or PowerShell to work around the control plane limitations.
Using The Async Client as a Workaround to Bulk
While the SDK supports transactional batch, support for bulk requests is not yet implemented in the Python SDK. You can use the async client along with this [concurrency sample][cosmos_concurrency_sample] we have developed as a reference for a possible workaround.
[WARNING] Using the asynchronous client for concurrent operations as shown in this sample consumes RUs very quickly. We strongly recommend testing against the Cosmos DB emulator first to verify that your code works and to avoid incurring charges.
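As a rough illustration of the idea (not the linked sample itself), the sketch below runs many upserts concurrently while capping the number in flight with a semaphore, which gives you one knob for limiting RU consumption. `upsert_concurrently` and the default limit of 10 are assumptions made for this example; `container` is expected to behave like the async ContainerProxy.

```python
import asyncio
from typing import Any, Dict, Iterable, List


async def upsert_concurrently(
    container: Any,
    items: Iterable[Dict[str, Any]],
    max_concurrency: int = 10,
) -> List[Any]:
    """Upsert `items` concurrently, with at most `max_concurrency` in flight.

    Hypothetical helper for illustration only.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def upsert_one(item: Dict[str, Any]) -> Any:
        # The semaphore bounds how many requests run at the same time.
        async with semaphore:
            return await container.upsert_item(item)

    # gather() preserves the input order in its result list.
    return await asyncio.gather(*(upsert_one(item) for item in items))


# With the async client (placeholders in angle brackets are yours to fill in):
#   from azure.cosmos.aio import CosmosClient
#   async with CosmosClient(URL, credential=KEY) as client:
#       database = client.get_database_client("<database-name>")
#       container = database.get_container_client("<container-name>")
#       await upsert_concurrently(container, items, max_concurrency=5)
```

Lowering max_concurrency trades throughput for a gentler request rate, which may help if you see throttling while testing.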
Boolean Data Type
While the Python language [uses](https://docs.python.org/3/library/s