sqlalchemy-spanner

Name: sqlalchemy-spanner
Author: Google LLC
SQLAlchemy dialect integrated into Cloud Spanner database
Downloads: 0 (30 days)
View on PyPI GitHub Owner: cloudspannerecosystem
Description

Spanner dialect for SQLAlchemy
==============================

Spanner dialect for SQLAlchemy represents an interface API designed to
make it possible to control Cloud Spanner databases with SQLAlchemy API.
The dialect is built on top of `the Spanner DB
API <https://github.com/googleapis/python-spanner/tree/master/google/cloud/spanner_dbapi>`__,
which is designed in accordance with
`PEP-249 <https://www.python.org/dev/peps/pep-0249/>`__.

Known limitations are listed `here <#features-and-limitations>`__. All
supported features have been tested and verified to work with the test
configurations. There may be configurations and/or data model variations
that have not yet been covered by the tests and that show unexpected
behavior. Please report any problems that you might encounter by
`creating a new
issue <https://github.com/googleapis/python-spanner-sqlalchemy/issues/new>`__.

-  `Cloud Spanner product
   documentation <https://cloud.google.com/spanner/docs>`__
-  `SQLAlchemy product documentation <https://www.sqlalchemy.org/>`__

Quick Start
-----------

In order to use this package, you first need to go through the following
steps:

1. `Select or create a Cloud Platform
   project. <https://console.cloud.google.com/project>`__
2. `Enable billing for your
   project. <https://cloud.google.com/billing/docs/how-to/modify-project#enable_billing_for_a_project>`__
3. `Enable the Google Cloud Spanner
   API. <https://cloud.google.com/spanner>`__
4. `Setup
   Authentication. <https://googleapis.dev/python/google-api-core/latest/auth.html>`__

Installation
------------
Stable released version of the package is available on PyPi:

::

   pip install sqlalchemy-spanner

To install an in-development version of the package, clone its
Git-repository:

::

   git clone https://github.com/googleapis/python-spanner-sqlalchemy.git

Next install the package from the package ``setup.py`` file:

::

   python setup.py install

During setup the dialect will be registered with entry points.

Samples
-------------

The `samples directory <https://github.com/googleapis/python-spanner-sqlalchemy/blob/-/samples/README.md>`__
contains multiple examples for how to configure and use common Spanner features.


A Minimal App
-------------

Database URL
~~~~~~~~~~~~

In order to connect to a database one have to use its URL on connection
creation step. SQLAlchemy 1.3 and 1.4 versions have a bit of difference
on this step in a dialect prefix part:

.. code:: python

   # for SQLAlchemy 1.3:
   spanner:///projects/project-id/instances/instance-id/databases/database-id

   # for SQLAlchemy 1.4 and 2.0:
   spanner+spanner:///projects/project-id/instances/instance-id/databases/database-id

To pass your custom client object directly to be be used, create engine as following:

.. code:: python

    engine = create_engine(
        "spanner+spanner:///projects/project-id/instances/instance-id/databases/database-id",
        connect_args={'client': spanner.Client(project="project-id")},
        isolation_level="SERIALIZABLE"
    )

Create a table
~~~~~~~~~~~~~~

.. code:: python

   from sqlalchemy import (
       Column,
       Integer,
       MetaData,
       String,
       Table,
       create_engine,
   )

   engine = create_engine(
       "spanner:///projects/project-id/instances/instance-id/databases/database-id"
   )
   metadata = MetaData(bind=engine)

   user = Table(
       "users",
       metadata,
       Column("user_id", Integer, primary_key=True),
       Column("user_name", String(16), nullable=False),
   )

   metadata.create_all(engine)

Insert a row
~~~~~~~~~~~~

.. code:: python

   import uuid

   from sqlalchemy import (
       MetaData,
       Table,
       create_engine,
   )

   engine = create_engine(
       "spanner:///projects/project-id/instances/instance-id/databases/database-id"
   )
   user = Table("users", MetaData(bind=engine), autoload=True)
   user_id = uuid.uuid4().hex[:6].lower()

   with engine.begin() as connection:
       connection.execute(user.insert(), {"user_id": user_id, "user_name": "Full Name"})

Read
~~~~

.. code:: python

   from sqlalchemy import MetaData, Table, create_engine, select

   engine = create_engine(
       "spanner:///projects/project-id/instances/instance-id/databases/database-id"
   )
   table = Table("users", MetaData(bind=engine), autoload=True)

   with engine.begin() as connection:
       for row in connection.execute(select(["*"], from_obj=table)).fetchall():
           print(row)

Migration
---------

SQLAlchemy uses `Alembic <https://alembic.sqlalchemy.org/en/latest/#>`__
tool to organize database migrations.

Spanner dialect doesn't provide a default migration environment, it's up
to user to write it. One thing to be noted here - one should explicitly
set ``alembic_version`` table not to use migration revision id as a
primary key:

.. code:: python

   with connectable.connect() as connection:
       context.configure(
           connection=connection,
           target_metadata=target_metadata,
           version_table_pk=False,  # don't use primary key in the versions table
       )

As Spanner restricts changing a primary key value, not setting the ``version_table_pk`` flag
to ``False`` can cause migration problems. If ``alembic_versions`` table was already created with a primary key, setting the flag to ``False`` will not work, because the flag is only applied on table creation.    

Notice that DDL statements in Spanner are not transactional. They will not be automatically reverted in case of a migration fail. Also Spanner encourage use of the `autocommit_block() <https://alembic.sqlalchemy.org/en/latest/api/runtime.html#alembic.runtime.migration.MigrationContext.autocommit_block>`__ for migrations in order to prevent DDLs from aborting migration transactions with schema modifications.

| **Warning!**
| A migration script can produce a lot of DDL statements. If each of the
  statements is executed separately, performance issues can occur. To
  avoid it, it's highly recommended to use the `Alembic batch
  context <https://alembic.sqlalchemy.org/en/latest/batch.html>`__
  feature to pack DDL statements into groups of statements.

Features and limitations
------------------------

Interleaved tables
~~~~~~~~~~~~~~~~~~

| Cloud Spanner dialect includes two dialect-specific arguments for
  ``Table`` constructor, which help to define interleave relations:
  ``spanner_interleave_in`` - a parent table name
  ``spanner_inverleave_on_delete_cascade`` - a flag specifying if
  ``ON DELETE CASCADE`` statement must be used for the interleave
  relation
| An example of interleave relations definition:

.. code:: python

   team = Table(
       "team",
       metadata,
       Column("team_id", Integer, primary_key=True),
       Column("team_name", String(16), nullable=False),
   )
   team.create(engine)

   client = Table(
       "client",
       metadata,
       Column("team_id", Integer, primary_key=True),
       Column("client_id", Integer, primary_key=True),
       Column("client_name", String(16), nullable=False),
       spanner_interleave_in="team",
       spanner_interleave_on_delete_cascade=True,
   )
   client.add_is_dependent_on(team)

   client.create(engine)

**Note**: Interleaved tables have a dependency between them, so the
parent table must be created before the child table. When creating
tables with this feature, make sure to call ``add_is_dependent_on()`` on
the child table to request SQLAlchemy to create the parent table before
the child table.

Commit timestamps
~~~~~~~~~~~~~~~~~~

The dialect offers the ``spanner_allow_commit_timestamp`` option to
column constructors for creating commit timestamp columns.

.. code:: python

   Table(
       "table",
       metadata,
       Column("last_update_time", DateTime, spanner_allow_commit_timestamp=True),
   )

`See this documentation page for more details <https://cloud.google.com/spanner/docs/commit-timestamp>`__.

Unique constraints
~~~~~~~~~~~~~~~~~~

Cloud Spanner doesn't support direct UNIQUE constraints creation. In
order to achieve column values uniqueness, UNIQUE indexes should be used.

Instead of direct UNIQUE constraint creation:

.. code:: python

   Table(
       'table',
       metadata,
       Column('col1', Integer),
       UniqueConstraint('col1', name='uix_1')
   )

Create a UNIQUE index:

.. code:: python

   Table(
       'table',
       metadata,
       Column('col1', Integer),
       Index("uix_1", "col1", unique=True),
   )

Autocommit mode
~~~~~~~~~~~~~~~

Spanner dialect supports ``SERIALIZABLE``, ``REPEATABLE_READ``, and
``AUTOCOMMIT`` isolation levels. ``SERIALIZABLE`` is the default
isolation level.

``AUTOCOMMIT`` mode corresponds to automatically committing each
insert/update/delete statement right after is has been executed.
Queries that are executed in ``AUTOCOMMIT`` mode use a single-use
read-only transaction. These do not take any locks and do not need
to be committed.

Workloads that only read data, should use either ``AUTOCOMMIT`` or
a read-only transaction.

Isolation level change example:

.. code:: python

   from sqlalchemy import create_engine

   eng = create_engine("spanner:///projects/project-id/instances/instance-id/databases/database-id")
   autocommit_engine = eng.execution_options(isolation_level="AUTOCOMMIT")

Automatic transaction retry
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In ``SERIALIZABLE`` isolation mode, transactions may fail with an ``Aborted`` exception.
This happens if there are conflicts between different transactions, for example if one
transaction tries to read data that another transaction has modified. Aborted transactions
should be retried by the client. The Spanner SQLAlchemy provider automatically retries
aborted transactions.

Isolation level ``SERIALIZABLE`` takes lock for both **reads and writes**.

Use isolation level ``REPEATABLE READ`` to reduce the amount of locks that
are taken by read/write transactions. ``REPEATABLE READ`` only takes locks
for **writes** and for