openai/evals
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
Stars: 17,934
Language: Python