Rising Costs of AI Tokens Prompt New Management Strategies

Methods for Optimizing AI Token Usage [Graphic=Ajou Economy]

Managing the cost of artificial intelligence (AI) tokens has become a new challenge for companies. Where businesses once rushed to adopt the highest-performing AI models, they are now focused on cutting AI spending for comparable work, and doing so is emerging as a new source of competitive advantage.

According to IT industry sources on June 25, major companies are actively adopting AI FinOps (financial operations) systems that make cloud and AI costs visible and controllable at the department and service level.

These systems monitor internal AI application programming interface (API) calls in real time and adjust agent usage limits and priorities according to each department's consumption and its contribution to AI tasks.
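As a rough illustration of department-level usage limits, the Python sketch below tracks token consumption against a monthly cap. The class name `TokenLedger`, the department names, and the limits are hypothetical and are not taken from any company's actual system.

```python
from collections import defaultdict


class TokenLedger:
    """Tracks token usage per department against a monthly limit (illustrative only)."""

    def __init__(self, monthly_limits: dict[str, int]):
        # Copy the limits so later changes by the caller do not affect the ledger.
        self.monthly_limits = dict(monthly_limits)
        self.used = defaultdict(int)

    def record(self, department: str, tokens: int) -> None:
        """Add tokens consumed by one API call to the department's running total."""
        self.used[department] += tokens

    def remaining(self, department: str) -> int:
        """Tokens left this month; departments without a limit get zero."""
        return self.monthly_limits.get(department, 0) - self.used[department]

    def allow(self, department: str, requested_tokens: int) -> bool:
        """Return True if the request fits inside the remaining budget."""
        return requested_tokens <= self.remaining(department)


# Hypothetical limits for two departments.
ledger = TokenLedger({"game-dev": 1_000_000, "marketing": 200_000})
ledger.record("game-dev", 900_000)
print(ledger.remaining("game-dev"))       # tokens still available
print(ledger.allow("game-dev", 150_000))  # request exceeds the remaining budget
```

A production FinOps system would also need persistence, monthly resets, and priority rules, all of which this sketch leaves out.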

This shift reflects a recognition that managing AI token usage at the individual level is not enough to contain rising costs. Companies also need to manage 'token maxing,' in which employees repeatedly run meaningless tasks on internal AI, as well as unexpected expenses that arise during collaboration.

Nexon is examining the efficiency of AI token usage at the group level and is developing cost prediction models, at both the company-wide and project level, that use historical usage data to estimate the token volume and budget needed for new projects.

Krafton announced on June 18 during the Nexon Developers Conference (NDC) that it has built a dashboard that tracks AI costs across the organization. The company is also considering varying how AI is used according to how mature each department's AI utilization is.

The approach to using AI models at the operational level is also changing. In development settings, a 'hybrid model (routing) approach' is emerging as a cost-saving measure, allowing teams to select or automatically distribute appropriate models based on the difficulty and importance of requests. Complex planning and inference tasks are assigned to premium models, while simpler repetitive tasks or basic coding are handled by relatively cheaper models.
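The routing idea described above can be sketched in a few lines of Python. This is a minimal illustration, not any company's implementation: the model names, the 1-to-5 difficulty score, and the threshold are all assumptions, and in practice the difficulty score would have to come from a classifier or from rules defined by the team.

```python
from dataclasses import dataclass

# Hypothetical model identifiers; real deployments would use their providers' names.
PREMIUM_MODEL = "premium-model"
BUDGET_MODEL = "budget-model"


@dataclass(frozen=True)
class Request:
    prompt: str
    difficulty: int         # assumed scale: 1 (trivial) to 5 (hard), scored upstream
    critical: bool = False  # business-critical requests always go to the premium model


def route(request: Request, difficulty_threshold: int = 4) -> str:
    """Pick a model based on the request's difficulty and importance."""
    if request.critical or request.difficulty >= difficulty_threshold:
        return PREMIUM_MODEL
    return BUDGET_MODEL


requests = [
    Request("Rename this variable across the file", difficulty=1),
    Request("Plan a migration of the billing service", difficulty=5),
    Request("Fix a typo in the release notes", difficulty=1, critical=True),
]
for r in requests:
    print(route(r), "<-", r.prompt)
```

Keeping the routing rule in one small function makes it easy to tighten or loosen the threshold as cost data comes in.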

For instance, input costs $2.50 per 1 million tokens for OpenAI's GPT-5, while DeepSeek V3.2 is priced at approximately $0.14, roughly 94% less. The gap widens further for output costs, so model allocation matters more as the use of AI agents increases.
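Using the two input prices quoted above, the arithmetic can be checked with a short Python sketch. The monthly volume of 500 million input tokens and the 80% share routed to the cheaper model are hypothetical figures chosen only for illustration.

```python
# Input prices in USD per 1 million tokens, as quoted in the article.
GPT5_INPUT_USD_PER_M = 2.50
DEEPSEEK_INPUT_USD_PER_M = 0.14


def input_cost(tokens: int, usd_per_million: float) -> float:
    """Cost in USD of sending `tokens` input tokens at the given price."""
    return tokens / 1_000_000 * usd_per_million


def reduction_pct(expensive: float, cheap: float) -> float:
    """Percentage saved by paying `cheap` instead of `expensive`."""
    return (1 - cheap / expensive) * 100


def blended_cost(tokens: int, cheap_share: float) -> float:
    """Input cost when `cheap_share` of the tokens go to the cheaper model."""
    cheap_tokens = int(tokens * cheap_share)
    premium_tokens = tokens - cheap_tokens
    return (input_cost(premium_tokens, GPT5_INPUT_USD_PER_M)
            + input_cost(cheap_tokens, DEEPSEEK_INPUT_USD_PER_M))


monthly_tokens = 500_000_000  # hypothetical monthly input volume

print(f"Price reduction: {reduction_pct(GPT5_INPUT_USD_PER_M, DEEPSEEK_INPUT_USD_PER_M):.1f}%")
print(f"All premium:     ${input_cost(monthly_tokens, GPT5_INPUT_USD_PER_M):,.2f}")
print(f"80% routed down: ${blended_cost(monthly_tokens, 0.8):,.2f}")
```

Under these assumptions the all-premium bill is $1,250 a month for input alone, against $306 when 80% of the tokens are routed to the cheaper model.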

A representative from a domestic software company using both Claude and Gemini models stated, "It used to be common to use expensive inference models for all tasks, but due to cost issues, we have recently started using different AI models based on task difficulty."

How companies purchase AI model APIs is also seen as a lever for cost management. Instead of contracting and managing multiple generative AI models separately, companies can turn to integrated services that offer various models through a single API. For companies with significant AI usage, the impact of contract structures and operational methods on costs is becoming as critical as model selection.

These services follow a model-as-a-service (MaaS) approach, in which a single provider exposes multiple generative AI models behind one API.

Another technical cost-saving measure is prompt caching. When companies repeatedly input the same instructions, codebase descriptions, data paths, or document structures, input token costs can accumulate unnecessarily. By temporarily storing and reusing recurring contexts and prompt patterns, unnecessary computations can be reduced.
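The effect of prompt caching on billed input tokens can be illustrated with a simplified Python simulation. It does not call any provider's API: the class `PrefixCache`, the five-minute lifetime, and the assumption that cached tokens are billed at 10% of the normal rate are all hypothetical, and actual cache lifetimes and discounts differ by provider.

```python
import hashlib


class PrefixCache:
    """Remembers recently seen prompt prefixes for a limited time (simulation only)."""

    def __init__(self, ttl_seconds: float = 300.0):
        self.ttl_seconds = ttl_seconds
        self._expiry: dict[str, float] = {}  # prefix hash -> expiry timestamp

    def lookup(self, prefix: str, now: float) -> bool:
        """Return True on a cache hit, then refresh the prefix's lifetime."""
        key = hashlib.sha256(prefix.encode("utf-8")).hexdigest()
        hit = self._expiry.get(key, 0.0) > now
        self._expiry[key] = now + self.ttl_seconds
        return hit


def billable_input_tokens(prefix_tokens: int, suffix_tokens: int,
                          cache_hit: bool, cached_price_ratio: float = 0.1) -> float:
    """Tokens billed at full-price equivalent; cached prefix tokens are discounted."""
    prefix_billed = prefix_tokens * cached_price_ratio if cache_hit else prefix_tokens
    return prefix_billed + suffix_tokens


cache = PrefixCache(ttl_seconds=300.0)
shared_prefix = "System instructions plus codebase description..."  # reused on every call

# Two calls ten seconds apart: 10,000 shared prefix tokens, 200 new tokens each.
for now in (0.0, 10.0):
    hit = cache.lookup(shared_prefix, now)
    print(hit, billable_input_tokens(10_000, 200, hit))
```

The sketch also shows why recurring material should sit at the start of a prompt: only an unchanged prefix can be matched and reused.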

The industry views AI cost management as an extension of operational capabilities for companies. As competition to adopt high-performance models intensifies, the ability to manage AI agent usage has emerged as a key issue for AI utilization. An industry insider remarked, "As companies increasingly adopt AI, considerations of model performance, operational costs, and management convenience will become critical decision-making factors."

* This article has been translated by AI.

Shin Hye An doubletap@ajunews.com