AI Agent Evaluations
Performance results of AI coding agents on Next.js code generation and migration tasks, measuring success rate and execution time.
Last run date: April 6, 2026
Agent Performance Results
| Model | Agent | Execution time | Success rate | With AGENTS.md* |
|---|---|---|---|---|
| GPT 5.4 (xhigh) | Codex | 219.37s | 83% | 92% |
| GPT 5.3 Codex (xhigh) | Codex | 178.20s | 83% | 96% |
| GLM 5.1 | OpenCode | 254.36s | 75% | 100% |
| Claude Opus 4.7 (max) | Claude Code | 142.63s | 75% | 100% |
| Gemini 3.1 Pro Preview | Gemini CLI | 244.70s | 75% | 96% |
| Claude Opus 4.6 | Claude Code | 186.96s | 75% | 100% |
| Cursor Composer 2.0 | Cursor | 113.53s | 75% | 96% |
| Gemini 3.0 Pro Preview | Gemini CLI | 256.87s | 67% | 88% |
| Cursor Composer 1.5 | Cursor | 120.63s | 67% | 88% |
| Claude Sonnet 4.6 | Claude Code | 156.89s | 58% | 100% |
| GPT 5.2 Codex (xhigh) | Codex | 148.75s | 58% | 83% |
| MiniMax M2.7 | OpenCode | 294.01s | 50% | 63% |
| Claude Sonnet 4.5 | Claude Code | 149.24s | 50% | 88% |
| Kimi K2.5 | OpenCode | 135.42s | 21% | 58% |
* AGENTS.md provides bundled Next.js documentation for AI coding agents. The last column reflects the additional evals that passed when agents had access to this documentation.
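
To make the footnote concrete, the gain from AGENTS.md can be read as the difference between the two percentage columns. The sketch below assumes the fourth column is the baseline success rate and the fifth is the success rate with AGENTS.md available. That reading is an inference from the footnote, not something the page states. The helper name `agents_md_gain` is hypothetical, and only four rows from the table are included.

```python
# Rows copied from the results table: (model, baseline success %, success % with AGENTS.md).
# Assumption: column four is the baseline rate and column five is the rate with
# AGENTS.md available. The page's header does not confirm this reading.
ROWS = [
    ("GPT 5.4 (xhigh)", 83, 92),
    ("Claude Opus 4.7 (max)", 75, 100),
    ("Cursor Composer 2.0", 75, 96),
    ("Kimi K2.5", 21, 58),
]


def agents_md_gain(baseline_pct: int, with_docs_pct: int) -> int:
    """Return the gain, in percentage points, attributed to AGENTS.md."""
    return with_docs_pct - baseline_pct


# Map each model to its gain in percentage points.
gains = {model: agents_md_gain(base, docs) for model, base, docs in ROWS}

# Print models from largest to smallest gain.
for model, gain in sorted(gains.items(), key=lambda item: item[1], reverse=True):
    print(f"{model}: +{gain} percentage points")
```

Under that assumption, the lowest-scoring baseline in this subset (Kimi K2.5) shows the largest gain, while models that already pass most evals have less room to improve.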