simsimd

Portable mixed-precision BLAS-like vector math library for x86 and ARM

Rank: #2086 | Downloads: 3,427,246 (30 days) | Stars: 1,658 | Forks: 104

Description

Computing dot-products, similarity measures, and distances between low- and high-dimensional vectors is ubiquitous in Machine Learning, Scientific Computing, Geospatial Analysis, and Information Retrieval. These algorithms generally have linear complexity in time, constant or linear complexity in space, and are data-parallel. In other words, they are easy to parallelize and vectorize, and are often available in packages like BLAS (level 1) and LAPACK, as well as in higher-level Python libraries like numpy and scipy. Ironically, even with decades of evolution in compilers and numerical computing, most libraries can be 3-200x slower than hardware potential, even on the most popular hardware, like 64-bit x86 and Arm CPUs. Moreover, most lack mixed-precision support, which is crucial for modern AI! The rare few that support minimal mixed precision run on only one platform and are vendor-locked by companies like Intel and Nvidia. SimSIMD provides an alternative.

1️⃣ SimSIMD functions are practically as fast as memcpy.
2️⃣ Unlike BLAS, most kernels are designed for mixed-precision and bit-level operations.
3️⃣ SimSIMD often ships more binaries than NumPy, has more backends than most BLAS implementations, and offers more high-level interfaces than most libraries.
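To make the mixed-precision point concrete, here is a minimal pure-Python sketch of why the accumulator width matters for an `int8` dot product. It is not SimSIMD code: the helper names `wrap_int8`, `dot_int8_narrow`, and `dot_int8_wide` are hypothetical, and the wrap-around behaviour is simulated by hand to mimic fixed-width arithmetic.

```python
def wrap_int8(x: int) -> int:
    """Wrap an integer into the signed 8-bit range [-128, 127]."""
    return (x + 128) % 256 - 128


def dot_int8_narrow(a, b):
    """Dot product where every product and partial sum is wrapped back to int8."""
    acc = 0
    for x, y in zip(a, b):
        acc = wrap_int8(acc + wrap_int8(x * y))
    return acc


def dot_int8_wide(a, b):
    """Mixed-precision variant: int8 inputs, wider accumulator (Python ints never overflow)."""
    return sum(x * y for x, y in zip(a, b))


a = [100, 100, 100, 100]
b = [100, 100, 100, 100]
print(dot_int8_narrow(a, b))  # 64, a meaningless wrapped value
print(dot_int8_wide(a, b))    # 40000, the true dot product
```

Keeping the inputs narrow while accumulating in a wider type preserves the memory savings of `int8` without losing the result to overflow.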

Features

SimSIMD (Arabic: "سيمسيم دي") is a mixed-precision math library of over 350 SIMD-optimized kernels extensively used in AI, Search, and DBMS workloads. Named after the iconic "Open Sesame" command that opened doors to treasure in Ali Baba and the Forty Thieves, SimSIMD can help you 10x the cost-efficiency of your computational pipelines. Implemented distance functions include:

  • Euclidean (L2) and Cosine (Angular) spatial distances for Vector Search.
  • Dot-Products for real & complex vectors for DSP & Quantum computing.
  • Hamming (~ Manhattan) and Jaccard (~ Tanimoto) bit-level distances.
  • Set Intersections for Sparse Vectors and Text Analysis.
  • Mahalanobis distance and Quadratic forms for Scientific Computing.
  • Kullback-Leibler and Jensen–Shannon divergences for probability distributions.
  • Fused-Multiply-Add (FMA) and Weighted Sums to replace BLAS level 1 functions.
  • For Levenshtein, Needleman–Wunsch, and Smith-Waterman, check StringZilla.
  • 🔜 Haversine and Vincenty's formulae for Geospatial Analysis.
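For readers who want the definitions behind a few of the measures listed above, the following is a plain-Python reference sketch using only the standard library. These are textbook formulas, not SimSIMD's kernels or its API; the function names are chosen here for illustration, and `cosine_distance` assumes non-zero vectors.

```python
import math


def sqeuclidean(a, b):
    """Squared Euclidean (L2) distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cosine_distance(a, b):
    """Cosine (angular) distance: 1 - cos(angle). Assumes non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def hamming_bits(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def jaccard_bits(a: bytes, b: bytes) -> float:
    """Jaccard distance over set bits: 1 - |intersection| / |union|."""
    inter = sum(bin(x & y).count("1") for x, y in zip(a, b))
    union = sum(bin(x | y).count("1") for x, y in zip(a, b))
    return 1.0 - inter / union if union else 0.0


def kl_divergence(p, q):
    """Kullback-Leibler divergence of p from q; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


print(sqeuclidean([1, 2, 3], [4, 6, 3]))  # 25
print(hamming_bits(b"\x0f", b"\xff"))     # 4
print(jaccard_bits(b"\x0f", b"\xff"))     # 0.5
```

Each function is a single linear pass over the inputs, which is the data-parallel shape that SIMD kernels exploit.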

Moreover, SimSIMD...

  • handles float64, float32, float16, and bfloat16 real & complex vectors.
  • handles int8 integral, int4 sub-byte, and b8 binary vectors.
  • handles sparse uint32 and uint16 sets, and weighted sparse vectors.
  • is a zero-dependency header-only C99 library.
  • has Python, Rust, JS, and Swift bindings.
  • has Arm backends for NEON, Scalable Vector Extensions (SVE), and SVE2.
  • has x86 backends for Haswell, Skylake, Ice Lake, Genoa, and Sapphire Rapids.
  • integrates easily anywhere, with both compile-time and runtime CPU feature detection!
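The idea of runtime CPU feature detection can be sketched as picking the most specialized backend whose required features the current CPU reports. The snippet below is a hypothetical illustration, not SimSIMD's dispatch logic: the backend labels, the simplified feature sets per backend, the `serial` fallback name, and the `pick_backend` helper are all assumptions made for this example.

```python
# Hypothetical table: backend label -> CPU feature flags it requires.
# Ordered from most to least specialized; the flag sets are simplified assumptions.
BACKENDS = [
    ("sapphire", {"avx512f", "avx512_fp16"}),
    ("genoa", {"avx512f", "avx512_bf16"}),
    ("ice", {"avx512f", "avx512_vnni"}),
    ("skylake", {"avx512f"}),
    ("haswell", {"avx2", "fma", "f16c"}),
]


def pick_backend(cpu_flags: set) -> str:
    """Return the first backend whose required flags are all present, else a portable fallback."""
    for name, required in BACKENDS:
        if required <= cpu_flags:  # subset test: every required flag is available
            return name
    return "serial"


print(pick_backend({"avx2", "fma", "f16c"}))
print(pick_backend({"sse2"}))
```

In a real library the flag set would come from the CPU itself (for example via CPUID on x86), and the chosen backend would be cached so the check runs once rather than on every call.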

Due to the high level of fragmentation of SIMD support across x86 CPUs, SimSIMD generally names its backends after select Intel CPU generations. They also work on AMD CPUs, however: Intel Haswell is compatible with AMD Zen 1/2/3, while AMD Genoa Zen 4 covers the AVX-512 instructions added to Intel Skylake and Ice Lake. You can learn more about the technical implementation details in the following blog posts:

Benchmarks

<table style="width: 100%; text-align: center; table-layout: fixed;"> <colgroup> <col style="width: 33%;"> <col style="width: 33%;"> <col style="width: 33%;"> </colgroup> <tr> <th align="center">NumPy</th> <th align="center">C 99</th> <th align="center">SimSIMD</th> </tr> <!-- Cosine distances with different precision levels --> <tr> <td colspan="4" align="center">cosine distances between 1536d vectors in <code>int8</code></td> </tr> <tr> <td align="center"> <!-- scipy.spatial.distance.cosine --> 🚧 overflows<br/> </td> <td align="center"> <!-- serial --> <span style="color:#ABABAB;">x86:</span> <b>10,548,600</b> ops/s<br/> <span style="color:#ABABAB;">arm:</span> <b>11,379,300</b> ops/s </td> <td align="center"> <!-- simsimd --> <span style="color:#ABABAB;">x86:</span> <b>16,151,800</b> ops/s<br/> <span style="color:#ABABAB;">arm:</span> <b>13,524,000</b> ops/s </td> </tr> <tr> <td colspan="4" align="center">cosine distances between 1536d vectors in <code>bfloat16</code></td> </tr> <tr> <td align="center"> <!-- scipy.spatial.distance.cosine --> 🚧 not supported<br/> </td> <td align="center"> <!-- serial --> <span style="color:#ABABAB;">x86:</span> <b>119,835</b> ops/s<br/> <span style="color:#ABABAB;">arm:</span> <b>403,909</b> ops/s </td> <td align="center"> <!-- simsimd --> <span style="color:#ABABAB;">x86:</span> <b>9,738,540</b> ops/s<br/> <span style="color:#ABABAB;">arm:</span> <b>4,881,900</b> ops/s </td> </tr> <tr> <td colspan="4" align="center">cosine distances between 1536d vectors in <code>float16</code></td> </tr> <tr> <td align="center"> <!-- scipy.spatial.distance.cosine --> <span style="color:#ABABAB;">x86:</span> <b>40,481</b> ops/s<br/> <span style="color:#ABABAB;">arm:</span> <b>21,451</b> ops/s </td> <td align="center"> <!-- serial --> <span style="color:#ABABAB;">x86:</span> <b>501,310</b> ops/s<br/> <span style="color:#ABABAB;">arm:</span> <b>871,963</b> ops/s </td> <td align="center"> <!-- simsimd --> <span style="color:#ABABAB;">x86:</span> 
<b>7,627,600</b> ops/s<br/> <span style="color:#ABABAB;">arm:</span> <b>3,316,810</b> ops/s </td> </tr> <tr> <td colspan="4" align="center">cosine distances between 1536d vectors in <code>float32</code></td> </tr> <tr> <td align="center"> <!-- scipy.spatial.distance.cosine --> <span style="color:#ABABAB;">x86:</span> <b>253,902</b> ops/s<br/> <span st