openai/mle-bench
MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering
Stars: 1,332
Language: Python