Agent Version Performance Overview

Compare evaluation scores across agent versions

How to Use This View

This overview shows evaluation metrics for each agent version. Click any version card to view that version's per-trace evaluation outputs, contradictions, and reasoning.