redshift-connector
Redshift interface library
Downloads: 0 (30 days)
Description
=======================================================
redshift_connector
=======================================================
|Python Version| |PyPi|
.. |PyPi| image:: https://img.shields.io/pypi/v/redshift_connector.svg?maxAge=432000&style=flat-square
:target: https://pypi.org/project/redshift_connector/
.. |Python Version| image:: https://img.shields.io/badge/python->=3.6-brightgreen.svg
:target: https://pypi.org/project/redshift_connector/
``redshift_connector`` is the Amazon Redshift connector for
Python. Easy integration with `pandas <https://github.com/pandas-dev/pandas>`_ and `numpy <https://github.com/numpy/numpy>`_, as well as support for numerous Amazon Redshift specific features help you get the most out of your data
Supported Amazon Redshift features include:
- IAM authentication
- Identity provider (IdP) authentication
- Redshift specific data types
This pure Python connector implements `Python Database API Specification 2.0 <https://www.python.org/dev/peps/pep-0249/>`_.
Getting Started
---------------
Install from Binary
~~~~~~~~~~~~~~~~~~~
+----------------------------------------------------------------+--------------------+-----------------------------------------------------+
| Package Manager | Downloads | Installation Command |
+================================================================+====================+=====================================================+
| `PyPi <https://pypi.org/project/redshift-connector/>`_ | |PyPi Downloads| | ``pip install redshift_connector`` |
+----------------------------------------------------------------+--------------------+-----------------------------------------------------+
| `Conda <https://anaconda.org/conda-forge/redshift_connector>`_ | |Conda Downloads| | ``conda install -c conda-forge redshift_connector`` |
+----------------------------------------------------------------+--------------------+-----------------------------------------------------+
.. |PyPi Downloads| image:: https://pepy.tech/badge/redshift_connector
.. |Conda Downloads| image:: https://img.shields.io/conda/dn/conda-forge/redshift_connector.svg
Install from Source
~~~~~~~~~~~~~~~~~~~
You may install from source by cloning this repository.
.. code-block:: sh
$ git clone https://github.com/aws/amazon-redshift-python-driver.git
$ cd redshift_connector
$ pip install .
Tutorials
~~~~~~~~~
- `001 - Connecting to Amazon Redshift <https://github.com/aws/amazon-redshift-python-driver/blob/master/tutorials/001%20-%20Connecting%20to%20Amazon%20Redshift.ipynb>`_
- `002 - Data Science Library Integrations <https://github.com/aws/amazon-redshift-python-driver/blob/master/tutorials/002%20-%20Data%20Science%20Library%20Integrations.ipynb>`_
- `003 - Amazon Redshift Feature Support <https://github.com/aws/amazon-redshift-python-driver/blob/master/tutorials/003%20-%20Amazon%20Redshift%20Feature%20Support.ipynb>`_
- `004 - Amazon Redshift Datatypes <https://github.com/aws/amazon-redshift-python-driver/blob/master/tutorials/004%20-%20Amazon%20Redshift%20Datatypes.ipynb>`_
We are working to add more documentation and would love your feedback. Please reach out to the team by `opening an issue <https://github.com/aws/amazon-redshift-python-driver/issues/new/choose>`__ or `starting a discussion <https://github.com/aws/amazon-redshift-python-driver/discussions/new>`_ to help us fill in the gaps in our documentation.
Integrations
~~~~~~~~~~~~
``redshift_connector`` integrates with various open source projects to provide an interface to Amazon Redshift. Please `open an issue <https://github.com/aws/amazon-redshift-python-driver/issues/new/choose>`__ with our project to request new integrations or get support for a ``redshift_connector`` issue seen in an existing integration.
- `apache-airflow <https://github.com/apache/airflow>`_
- `querybook <https://github.com/pinterest/querybook>`_
- `sqlalchemy-redshift <https://github.com/sqlalchemy-redshift/sqlalchemy-redshift>`_
Basic Example
~~~~~~~~~~~~~
.. code-block:: python
import redshift_connector
# Connects to Redshift cluster using AWS credentials
conn = redshift_connector.connect(
host='examplecluster.abc123xyz789.us-west-1.redshift.amazonaws.com',
database='dev',
user='awsuser',
password='my_password'
)
cursor: redshift_connector.Cursor = conn.cursor()
cursor.execute("create Temp table book(bookname varchar,author varchar)")
cursor.executemany("insert into book (bookname, author) values (%s, %s)",
[
('One Hundred Years of Solitude', 'Gabriel García Márquez'),
('A Brief History of Time', 'Stephen Hawking')
]
)
cursor.execute("select * from book")
result: tuple = cursor.fetchall()
print(result)
>> (['One Hundred Years of Solitude', 'Gabriel García Márquez'], ['A Brief History of Time', 'Stephen Hawking'])
Enabling autocommit
~~~~~~~~~~~~~~~~~~~
**Following the DB-API specification, autocommit is off by default**. It can be turned on by using the autocommit property of the connection.
.. code-block:: py3
# Make sure we're not in a transaction
conn.rollback()
conn.autocommit = True
conn.run("VACUUM")
conn.autocommit = False
Configuring paramstyle
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Paramstyle can be set on both a module and cursor level. When paramstyle is set on a module level e.g. ``redshift_connector.paramstyle = 'qmark'``, the user specified paramstyle is used for all subsequent cursors unless set on the cursor.
When paramstyle is set on the cursor e.g. ``cursor.paramstyle = 'qmark'`` the user specified paramstyle is only used for that cursor object.
.. code-block:: python
# setting paramstyle to qmark on a module level
redshift_connector.paramstyle = 'qmark'
with redshift_connector.connect() as conn1:
with conn1.cursor() as cursor1: # this cursor will use qmark paramstyle as it's been set on the module level
pass
with conn1.cursor() as cursor2:
# setting paramstyle to numeric on the cursor level only this cursor will use numeric paramstyle
cursor.paramstyle = 'numeric'
with conn1.cursor() as cursor3: # this cursor will use qmark paramstyle as it's been set on the module level
pass
with redshift_connector.connect() as conn2:
with conn2.cursor() as cursor1: # this cursor will use qmark paramstyle as it's been set on the module level
pass
The module level default paramstyle used is ``format``. Valid values for ``paramstyle`` include ``qmark, numeric, named, format, pyformat``. The below example shows how to use various paramstyles after the paramstyle is set on the cursor.
When paramstyle is set to ``named`` or ``pyformat``, parameters must be passed as a Python dictionary to the ``execute()`` method. Other paramstyles require parameters to be passed as a Python tuple or list.
.. code-block:: python
# qmark
cursor.paramstyle = 'qmark'
sql = 'insert into foo(bar, jar) VALUES(?, ?)'
cursor.execute(sql, (1, "hello world"))
# numeric
cursor.paramstyle = 'numeric'
sql = 'insert into foo(bar, jar) VALUES(:1, :2)'
cursor.execute(sql, (1, "hello world"))
# named
cursor.paramstyle = 'named'
sql = 'insert into foo(bar, jar) VALUES(:p1, :p2)'
cursor.execute(sql, {"p1":1, "p2":"hello world"})
# format
cursor.paramstyle = 'format'
sql = 'insert into foo(bar, jar) VALUES(%s, %s)'
cursor.execute(sql, (1, "hello world"))
# pyformat
cursor.paramstyle = 'pyformat'
sql = 'insert into foo(bar, jar) VALUES(%(bar)s, %(jar)s)'
cursor.execute(sql, {"bar": 1, "jar": "hello world"})
Exception Handling
~~~~~~~~~~~~~~~~~~~
``redshift_connector`` uses the guideline for exception handling specified in the `Python DB-API <https://www.python.org/dev/peps/pep-0249/#exceptions>`_. For exception definitions, please see `redshift_connector/error.py <https://github.com/aws/amazon-redshift-python-driver/blob/master/redshift_connector/error.py>`_
Example using IAM Credentials
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
IAM Credentials can be supplied directly to ``connect(...)`` using an AWS profile as shown below:
.. code-block:: python
import redshift_connector
# Connects to Redshift cluster using IAM credentials from default profile defined in ~/.aws/credentials
conn = redshift_connector.connect(
iam=True,
database='dev',
db_user='awsuser',
password='',
user='',
cluster_identifier='examplecluster',
profile='default'
)
.. code-block:: bash
# ~/.aws/credentials
[default]
aws_access_key_id="my_aws_access_key_id"
aws_secret_access_key="my_aws_secret_access_key"
aws_session_token="my_aws_session_token"
# ~/.aws/config
[default]
region=us-west-2
If a region is not provided in `~/.aws/config` or you would like to override its value, `region` may be passed to ``connect(...)``.
Alternatively, IAM credentials can be supplied directly to ``connect(...)`` using AWS credentials as shown below:
.. code-block:: python
import redshift_connector
# Connects to Redshift cluster using IAM credentials from default profile defined in ~/.aws/credentials
conn = redshift_connector.connect(
iam=True,
database='dev',
db_user='awsuser',
password='',
user='',
cluster_identifier='examplecluster',
access_key_id="my_aws_access_key_id",
secret_access_key="my_aws_secret_access_key",
session_token="my_aws_session_token",
region="us-east-2"
)
Integration with pandas
~~~~~~~~~~~~~~~~~~~~~~~
Retrieving query results as a ``pandas.DataFrame``