AI Agent Evaluations

Performance results of AI coding agents on Next.js code generation and migration tasks, measuring success rate and execution time.

Last run date: April 6, 2026

Agent Performance Results

| Agent | Harness | Avg. time | Success rate | With AGENTS.md* |
| --- | --- | --- | --- | --- |
| GPT 5.4 (xhigh) | Codex | 219.37s | 83% | 92% |
| GPT 5.3 Codex (xhigh) | Codex | 178.20s | 83% | 96% |
| GLM 5.1 | OpenCode | 254.36s | 75% | 100% |
| Claude Opus 4.7 (max) | Claude Code | 142.63s | 75% | 100% |
| Gemini 3.1 Pro Preview | Gemini CLI | 244.70s | 75% | 96% |
| Claude Opus 4.6 | Claude Code | 186.96s | 75% | 100% |
| Cursor Composer 2.0 | Cursor | 113.53s | 75% | 96% |
| Gemini 3.0 Pro Preview | Gemini CLI | 256.87s | 67% | 88% |
| Cursor Composer 1.5 | Cursor | 120.63s | 67% | 88% |
| Claude Sonnet 4.6 | Claude Code | 156.89s | 58% | 100% |
| GPT 5.2 Codex (xhigh) | Codex | 148.75s | 58% | 83% |
| MiniMax M2.7 | OpenCode | 294.01s | 50% | 63% |
| Claude Sonnet 4.5 | Claude Code | 149.24s | 50% | 88% |
| Kimi K2.5 | OpenCode | 135.42s | 21% | 58% |

\* AGENTS.md provides bundled Next.js documentation for AI coding agents. The starred column shows the success rate including the additional evals that passed when agents had access to this documentation.
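For context, an AGENTS.md file is a plain Markdown document placed in the repository that coding agents read before working on the code. A minimal sketch of what such a file might contain for a Next.js project follows; the headings and guidance below are illustrative, not the actual file used in these evals:

```markdown
# AGENTS.md (illustrative example)

## Framework
- Next.js with the App Router: routes live under `app/`, not `pages/`.

## Conventions
- Components are Server Components by default; add the `"use client"`
  directive only when hooks or browser APIs are required.
- Fetch data in async Server Components instead of `getServerSideProps`,
  which is not supported in the App Router.
```

Bundling documentation this way gives agents project-specific context without requiring them to search the web or guess at framework conventions, which is the effect the starred column attempts to measure.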