The widespread corporate adoption of AI agents is producing unexpected side effects. The way these agents consume tokens is driving up AI deployment costs. At the same time, some employees are "token maxing," repeatedly running meaningless tasks to inflate internal AI usage metrics, which is adding to companies' concerns.
A paper released on May 29 by Microsoft Research and Stanford University reveals that agent-based coding tasks consume up to 1,000 times more tokens than standard chatbots.
Even simple AI tools that manipulate external systems use 5 to 30 times as many tokens as regular chat interactions. When the same task is executed repeatedly, token usage can vary by a factor of up to 30, making cost predictions nearly impossible.
Notably, the study found that "higher token consumption does not equate to increased accuracy." Accuracy peaked in a mid-cost range and then reached a saturation point.
The financial impact is evident in actual billing. According to a case study from AI cost consulting firm LinOps, a SaaS startup with 35 engineers saw its monthly AI expenses soar to $87,000 within four months of implementing coding agents like Claude and CodeX, as well as its own bug triage agent. Only after applying optimization measures such as context pruning, lightweight model separation, and daily usage limits was it able to reduce costs to around $24,000 per month.
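As a rough check on the figures in this case study, the short Python sketch below works out the size of the reduction and the spend per engineer. It uses only the numbers quoted above. The $24,000 figure is approximate in the source, so the results are estimates, and the variable names are illustrative.

```python
# Figures quoted in the case study (USD per month).
ENGINEERS = 35
COST_BEFORE = 87_000   # peak monthly AI spend
COST_AFTER = 24_000    # approximate monthly spend after optimization

savings = COST_BEFORE - COST_AFTER
reduction_pct = savings / COST_BEFORE * 100
per_engineer_before = COST_BEFORE / ENGINEERS
per_engineer_after = COST_AFTER / ENGINEERS

print(f"Monthly savings: ${savings:,}")
print(f"Reduction: {reduction_pct:.1f}%")
print(f"Per engineer: ${per_engineer_before:,.0f} -> ${per_engineer_after:,.0f}")
```

On these figures, the optimization measures cut spending by roughly 72%, from about $2,500 to about $700 per engineer per month.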
Beyond token costs, side effects on organizational culture are also surfacing. Amazon mandated that over 80% of its developers use AI weekly and introduced an internal leaderboard for token consumption. Employees responded by token maxing, repeatedly executing unnecessary tasks on the company's AI agent platform to boost their scores.
In response, Amazon has restricted the visibility of the leaderboard. A similar trend has emerged at Meta, where a leaderboard covering approximately 85,000 employees recorded total consumption of more than 60 trillion tokens over a 30-day period.
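To put the Meta figures in perspective, the sketch below divides the reported total by headcount and days. It assumes the quoted round numbers (60 trillion is a lower bound and 85,000 is approximate), so the result is an order-of-magnitude average, not a measured per-person figure.

```python
# Figures quoted in the article; both are rounded in the source.
TOTAL_TOKENS = 60_000_000_000_000  # "exceeding 60 trillion", treated as a lower bound
EMPLOYEES = 85_000                 # "approximately 85,000"
DAYS = 30

tokens_per_employee = TOTAL_TOKENS / EMPLOYEES
tokens_per_employee_per_day = tokens_per_employee / DAYS

print(f"Per employee over {DAYS} days: {tokens_per_employee:,.0f} tokens")
print(f"Per employee per day: {tokens_per_employee_per_day:,.0f} tokens")
```

That works out to an average of roughly 706 million tokens per employee over the period, or about 23.5 million tokens per employee per day.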
The surge in token costs may present a windfall for the Chinese AI industry. API pricing for DeepSeek V3.2 is $0.14 per million input tokens, far below GPT-5's $2.50. For output tokens, the gap widens further: DeepSeek charges $0.28 per million, compared with $15 for GPT-5. As token consumption grows, the price difference could lead more cost-conscious companies to consider adopting Chinese models.
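The effect of this price gap on a bill can be sketched in a few lines of Python. The prices are the per-million-token figures quoted above. The workload (2 billion input tokens and 200 million output tokens per month) and the helper name `monthly_cost` are hypothetical, chosen only for illustration.

```python
# Per-million-token API prices (USD) as quoted in the article.
PRICES = {
    "DeepSeek V3.2": {"input": 0.14, "output": 0.28},
    "GPT-5": {"input": 2.50, "output": 15.00},
}

# Hypothetical monthly workload, not taken from the article.
INPUT_TOKENS = 2_000_000_000
OUTPUT_TOKENS = 200_000_000


def monthly_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the cost in USD for the given token volumes."""
    price = PRICES[model]
    return (input_tokens / 1_000_000 * price["input"]
            + output_tokens / 1_000_000 * price["output"])


input_ratio = PRICES["GPT-5"]["input"] / PRICES["DeepSeek V3.2"]["input"]
output_ratio = PRICES["GPT-5"]["output"] / PRICES["DeepSeek V3.2"]["output"]
costs = {m: monthly_cost(m, INPUT_TOKENS, OUTPUT_TOKENS) for m in PRICES}

print(f"Input price ratio: {input_ratio:.1f}x")
print(f"Output price ratio: {output_ratio:.1f}x")
for model, cost in costs.items():
    print(f"{model}: ${cost:,.2f} per month")
```

At the quoted prices, GPT-5 is about 18 times more expensive for input tokens and about 54 times more expensive for output tokens. For the assumed workload, that means roughly $8,000 per month versus roughly $336.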
The overall increase in token usage is also steep. An analysis of OpenRouter platform data by investment firm Alja revealed that weekly token usage surged by over 3,800% as of December last year, with a sharp acceleration beginning in January 2025. The average prompt length has quadrupled from about 1,500 tokens in early 2024 to approximately 6,000 tokens.
* This article has been translated by AI.
Copyright ⓒ Aju Press All rights reserved.