Give AlbumentationsX a star on GitHub — it powers this leaderboard

Star on GitHub

orjson

Fast, correct Python JSON library supporting dataclasses, datetimes, and numpy

Downloads: 0 (30 days)

Description

orjson

orjson is a fast, correct JSON library for Python. It benchmarks as the fastest Python library for JSON and is more correct than the standard json library or other third-party libraries. It serializes dataclass, datetime, numpy, and UUID instances natively.

orjson.dumps() is something like 10x as fast as json, serializes common types and subtypes, has a default parameter for the caller to specify how to serialize arbitrary types, and has a number of flags controlling output.

orjson.loads() is something like 2x as fast as json, and is strictly compliant with UTF-8 and RFC 8259 ("The JavaScript Object Notation (JSON) Data Interchange Format").

Reading from and writing to files, line-delimited JSON files, and so on is not provided by the library.

orjson supports CPython 3.10, 3.11, 3.12, 3.13, 3.14, and 3.15.

It distributes amd64/x86_64/x64, i686/x86, aarch64/arm64/armv8, arm7, ppc64le/POWER8, and s390x wheels for Linux, amd64 and aarch64 wheels for macOS, and amd64, i686, and aarch64 wheels for Windows.

Wheels published to PyPI for amd64 run on x86-64-v1 (2003) or later, but will at runtime use AVX-512 if available for a significant performance benefit; aarch64 wheels run on ARMv8-A (2011) or later.

orjson does not and will not support PyPy, embedded Python builds for Android/iOS, or PEP 554 subinterpreters.

orjson may support PEP 703 free-threading when it is stable.

Releases follow semantic versioning and serializing a new object type without an opt-in flag is considered a breaking change.

orjson contains source code licensed under the Mozilla Public License 2.0, Apache 2.0, and MIT licenses. The repository from which PyPI artifacts are published is github.com/ijl/orjson and an alternative repository is codeberg.org/ijl/orjson. There is no open issue tracker or pull requests due to signal-to-noise ratio. There is a CHANGELOG available in the repository.

  1. Usage
    1. Install
    2. Quickstart
    3. Migrating
    4. Serialize
      1. default
      2. option
      3. Fragment
    5. Deserialize
  2. Types
    1. dataclass
    2. datetime
    3. enum
    4. float
    5. int
    6. numpy
    7. str
    8. uuid
  3. Testing
  4. Performance
    1. Latency
    2. Reproducing
  5. Questions
  6. Packaging
  7. License

Usage

Install

To install a wheel from PyPI, install the orjson package.

In requirements.in or requirements.txt format, specify:

orjson >= 3.10,<4

In pyproject.toml format, specify:

orjson = "^3.10"

To build a wheel, see packaging.

Quickstart

This is an example of serializing, with options specified, and deserializing:

>>> import orjson, datetime, numpy
>>> data = {
    "type": "job",
    "created_at": datetime.datetime(1970, 1, 1),
    "status": "🆗",
    "payload": numpy.array([[1, 2], [3, 4]]),
}
>>> orjson.dumps(data, option=orjson.OPT_NAIVE_UTC | orjson.OPT_SERIALIZE_NUMPY)
b'{"type":"job","created_at":"1970-01-01T00:00:00+00:00","status":"\xf0\x9f\x86\x97","payload":[[1,2],[3,4]]}'
>>> orjson.loads(_)
{'type': 'job', 'created_at': '1970-01-01T00:00:00+00:00', 'status': '🆗', 'payload': [[1, 2], [3, 4]]}

Migrating

orjson version 3 serializes more types than version 2. Subclasses of str, int, dict, and list are now serialized. This is faster and more similar to the standard library. It can be disabled with orjson.OPT_PASSTHROUGH_SUBCLASS.dataclasses.dataclass instances are now serialized by default and cannot be customized in a default function unless option=orjson.OPT_PASSTHROUGH_DATACLASS is specified. uuid.UUID instances are serialized by default. For any type that is now serialized, implementations in a default function and options enabling them can be removed but do not need to be. There was no change in deserialization.

To migrate from the standard library, the largest difference is that orjson.dumps returns bytes and json.dumps returns a str.

Users with dict objects using non-str keys should specify option=orjson.OPT_NON_STR_KEYS.

sort_keys is replaced by option=orjson.OPT_SORT_KEYS.

indent is replaced by option=orjson.OPT_INDENT_2 and other levels of indentation are not supported.

ensure_ascii is probably not relevant today and UTF-8 characters cannot be escaped to ASCII.

Serialize

def dumps(
    __obj: Any,
    default: Optional[Callable[[Any], Any]] = ...,
    option: Optional[int] = ...,
) -> bytes: ...

dumps() serializes Python objects to JSON.

It natively serializes str, dict, list, tuple, int, float, bool, None, dataclasses.dataclass, typing.TypedDict, datetime.datetime, datetime.date, datetime.time, uuid.UUID, numpy.ndarray, and orjson.Fragment instances. It supports arbitrary types through default. It serializes subclasses of str, int, dict, list, dataclasses.dataclass, and enum.Enum. It does not serialize subclasses of tuple to avoid serializing namedtuple objects as arrays. To avoid serializing subclasses, specify the option orjson.OPT_PASSTHROUGH_SUBCLASS.

The output is a bytes object containing UTF-8.

The global interpreter lock (GIL) is held for the duration of the call.

It raises JSONEncodeError on an unsupported type. This exception message describes the invalid object with the error message Type is not JSON serializable: .... To fix this, specify default.

It raises JSONEncodeError on a str that contains invalid UTF-8.

It raises JSONEncodeError on an integer that exceeds 64 bits by default or, with OPT_STRICT_INTEGER, 53 bits.

It raises JSONEncodeError if a dict has a key of a type other than str, unless OPT_NON_STR_KEYS is specified.

It raises JSONEncodeError if the output of default recurses to handling by default more than 254 levels deep.

It raises JSONEncodeError on circular references.

It raises JSONEncodeError if a tzinfo on a datetime object is unsupported.

JSONEncodeError is a subclass of TypeError. This is for compatibility with the standard library.

If the failure was caused by an exception in default then JSONEncodeError chains the original exception as __cause__.

default

To serialize a subclass or arbitrary types, specify default as a callable that returns a supported type. default may be a function, lambda, or callable class instance. To specify that a type was not handled by default, raise an exception such as TypeError.

>>> import orjson, decimal
>>>
def default(obj):
    if isinstance(obj, decimal.Decimal):
        return str(obj)
    raise TypeError

>>> orjson.dumps(decimal.Decimal("0.0842389659712649442845"))
JSONEncodeError: Type is not JSON serializable: decimal.Decimal
>>> orjson.dumps(decimal.Decimal("0.0842389659712649442845"), default=default)
b'"0.0842389659712649442845"'
>>> orjson.dumps({1, 2}, default=default)
orjson.JSONEncodeError: Type is not JSON serializable: set

The default callable may return an object that itself must be handled by default up to 254 times before an exception is raised.

It is important that default raise an exception if a type cannot be handled. Python otherwise implicitly returns None, which appears to the caller like a legitimate value and is serialized:

>>> import orjson, json
>>>
def default(obj):
    if isinstance(obj, decimal.Decimal):
        return str(obj)

>>> orjson.dumps({"set":{1, 2}}, default=default)
b'{"set":null}'
>>> json.dumps({"set":{1, 2}}, default=default)
'{"set":null}'

option

To modify how data is serialized, specify option. Each option is an integer constant in orjson. To specify multiple options, mask them together, e.g., option=orjson.OPT_STRICT_INTEGER | orjson.OPT_NAIVE_UTC.

OPT_APPEND_NEWLINE

Append \n to