Vercel Finds Static Docs Outperform Agent Skills in Next.js Code Generation Evals

Vercel researchers found that embedding a compressed 8KB index of Next.js documentation directly into an AGENTS.md file resulted in a 100% success rate during recent agent evaluations. This passive context method significantly outperformed the company's 'skills' abstraction, which only achieved a 79% pass rate even with explicit invocation instructions.

Vercel reported surprising results from internal evaluations of AI coding agents working with Next.js 16 APIs: a static documentation index outperformed dynamic tool invocation. The tests focused on new features such as 'use cache' and 'connection()', which are absent from current large language models' training data.
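For context, the snippet below is a rough sketch of the kind of code these APIs enable. It assumes a Next.js App Router project with Cache Components enabled; the endpoint and helper names are illustrative and not taken from Vercel's evals.

```tsx
// app/products/page.tsx (illustrative; not from Vercel's eval suite)
import { connection } from 'next/server'

// The 'use cache' directive marks this server function's result as cacheable.
async function getProducts() {
  'use cache'
  const res = await fetch('https://example.com/api/products') // hypothetical endpoint
  return res.json()
}

export default async function Page() {
  // connection() resolves once a real request is in flight, opting the page
  // into dynamic, request-time rendering instead of static prerendering.
  await connection()
  const products = await getProducts()
  return <pre>{JSON.stringify(products, null, 2)}</pre>
}
```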

The two tested methods were 'skills,' an open standard for packaging domain knowledge that agents invoke on demand, and AGENTS.md, a markdown file providing persistent context available at every interaction turn.
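A minimal sketch of the AGENTS.md side might look like the excerpt below; the contents are hypothetical, since the source does not reproduce Vercel's actual file.

```md
# AGENTS.md (hypothetical excerpt)

## Next.js guidance
- This project uses Next.js 16 with the App Router.
- Prefer the 'use cache' directive for cacheable server functions.
- Await connection() from 'next/server' before reading request-time data.
```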

When tested, agents failed to invoke the framework-specific Next.js skill in 56% of cases, yielding no improvement over baseline performance. Even after explicit instructions mandating skill use were added to the AGENTS.md file, the pass rate reached only 79%, and the behavior proved brittle, shifting noticeably with subtle changes in prompt wording.

To isolate the issue, Vercel hardened its eval suite to specifically target these novel Next.js 16 APIs, ensuring tests measured observable behavior rather than implementation details.
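The source does not publish the eval harness itself; the sketch below only illustrates the "observable behavior" idea, assuming each agent run produces a project directory with standard build, lint, and test scripts.

```ts
// eval-harness.ts — a minimal sketch, not Vercel's actual eval suite.
// Assumes each agent-generated solution lives in its own directory under
// results/ and defines "build", "lint", and "test" scripts in package.json.
import { execSync } from 'node:child_process'
import { readdirSync } from 'node:fs'

const checks = ['build', 'lint', 'test'] as const

function passes(dir: string, script: string): boolean {
  try {
    // Observable behavior only: the check is whether the command exits 0,
    // not how the generated code is written internally.
    execSync(`npm run ${script}`, { cwd: dir, stdio: 'ignore' })
    return true
  } catch {
    return false
  }
}

const runs = readdirSync('results') // one subdirectory per eval run
let passed = 0
for (const run of runs) {
  const dir = `results/${run}`
  if (checks.every((check) => passes(dir, check))) passed++
}
console.log(`pass rate: ${((passed / runs.length) * 100).toFixed(1)}%`)
```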

When Vercel embedded a compressed, pipe-delimited index of the documentation, reduced to just 8KB, directly into AGENTS.md, agents achieved a perfect 100% pass rate across build, lint, and test metrics. This suggested that eliminating the agent's decision point about whether to invoke a tool was crucial for reliability.
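The exact index layout is not given beyond "compressed and pipe-delimited"; a hypothetical pair of rows, one per API, might look like this:

```
use cache|directive|marks a function or component as cacheable|app/api-reference/directives/use-cache
connection|next/server|defers rendering until a live request exists|app/api-reference/functions/connection
```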

According to Vercel's analysis, this success stems from three factors: AGENTS.md presents no decision point, is consistently available across all turns, and avoids the sequencing issues inherent in tool invocation.

While skills remain valuable for explicit, vertical workflows, the research indicates that for general framework knowledge, passive, readily available context currently yields superior performance for coding agents.

Developers can implement this indexing via the @next/codemod package, which automatically downloads version-matched documentation and injects the necessary index into the AGENTS.md file.
