Extracts the content of PDF files using RapidOCR.