agronholm/SageAttention

Quantized attention that achieves speedups of 2.1-3.1x over FlashAttention2 and 2.7-5.1x over xformers, without losing end-to-end accuracy across various models.

Stars: 0 · Language: Cuda