All activity
AutoArena is an open-source tool that automates head-to-head evaluations using LLM judges to rank GenAI systems. Quickly and accurately generate leaderboards comparing different LLMs, RAG setups, or prompt variations—Fine-tune custom judges to fit your needs.
AutoArena
Automated GenAI evaluation that works