Anthropic raises concerns over human oversight of AI as self-improvement capabilities advance

by Teddy Cambosa - 2 days ago

United States – Anthropic has published a report outlining the progress its AI systems have made toward what the industry terms recursive self-improvement — the point at which an AI could autonomously design and train its own successor — and flagging the governance challenges that such a development would pose.

The report, from the Anthropic Institute, the company’s research and policy arm, draws on both external benchmark data and internal operational figures to document how AI systems are already taking on a growing share of AI development work.

While the report notes the potential benefits in science and healthcare, it also identifies risks that the company says existing institutions may not yet be equipped to manage.

What the data shows

The report’s concerns are grounded in internal figures that Anthropic describes as previously unreported. As of May 2026, more than 80 percent of code merged into the company’s production systems was authored by Claude. Engineers are shipping eight times as much code per quarter as in the 2021–2024 period. 

On fully open-ended engineering problems — those where the engineer has not specified what a solution should look like — Claude’s success rate reached 76 percent in May 2026, up 50 percentage points over six months.

In research, the company describes a demonstration from April 2026 in which Claude-powered agents were given an open-ended AI safety question and left to work through it independently. Two human researchers, working for about a week, recovered roughly 23 percent of the measurable performance gap on the problem; the agents recovered 97 percent over 800 cumulative hours. 

Anthropic notes that humans still chose the problem and set the scoring criteria, and that the result did not transfer cleanly to production-scale models. It observes, however, that the design of every experiment was left entirely to the agents.

Three scenarios, two of them pressing

Anthropic frames its analysis around three possible futures, stating it considers the first the least likely and is more concerned about the remaining two.

  1. Capability growth plateaus – Progress slows due to architectural limits or compute constraints. Today’s AI remains transformative, but institutions have more time to adapt. Anthropic considers this the least likely outcome based on current trends.
  2. Compounding efficiency gains continue – AI development becomes substantially automated while humans retain strategic direction. Organisations become significantly more efficient, but the same capabilities could also be applied to surveillance or large-scale influence operations.
  3. Full recursive self-improvement – AI systems begin designing their own successors. Development pace becomes determined by available compute. Humans shift primarily to oversight and validation. The report describes this as the scenario it is “least certain about” in terms of alignment outcomes.

The coordination problem

Anthropic’s report does not advocate for unilateral action by any single company. It argues that if slowing frontier AI development were practically achievable, doing so would likely be beneficial — but that a pause by one lab would shift who leads rather than create the broader deliberative process the situation calls for.

A credible slowdown, the report says, would require multiple well-resourced laboratories across multiple countries to agree to stop, with each able to verify that others had genuinely done so. The company describes this verification problem as significantly harder than comparable arms-control challenges: AI training runs are far easier to conceal than missile silos, the inputs are general-purpose, and the incentive to continue quietly is substantial.

Anthropic says it would participate in a verified, multi-party slowdown if other frontier developers agreed to the same conditions, and intends to convene policymakers, researchers, civil society organisations, and other AI companies in the coming months to discuss what workable coordination mechanisms might look like. It has committed to publishing the outcomes of those discussions.
