Code for ALBEF: a new vision-language pre-training method