Korea's indigenous AI model project faces debate over use of open-source components

By Kim Dong-young Posted: January 14, 2026, 14:17 Updated: January 14, 2026, 14:17
Graphics by AJP Song Ji-yoon
 
SEOUL, January 14 (AJP) - South Korea's government-backed initiative to develop an indigenous artificial intelligence foundation model is facing scrutiny over whether the use of open-source components from Chinese firms aligns with the project's definition of "sovereign AI."

The Ministry of Science and ICT is set to announce Thursday the results of the first evaluation round for bidders in the national AI foundation model project. Of the five contenders — Naver Cloud, NC AI, Upstage, SK Telecom and LG AI Research — one will be eliminated in the initial cut.

The project was launched under the banner of "Leaping into the world's top three AI powers," with the government positioning it as a sovereign AI development effort. 

Eligible models were defined as those developed domestically from design through pre-training, excluding derivative models fine-tuned from foreign systems. 

The government is funding GPU resources, data and engineering costs, aiming to produce a model that reaches 95 percent of the benchmark performance of leading global AI models.

However, questions have emerged over whether some contenders' development approaches meet the project's sovereignty criteria.

The debate began after Upstage was accused of using inference code from Chinese AI firm Zhipu AI, based on similarities identified by industry observers. Scrutiny later extended to Naver Cloud, which acknowledged incorporating Alibaba's Qwen 2.5-VL 32B vision encoder and weights, and to SK Telecom, which has faced claims that it used inference code linked to Chinese firm DeepSeek.

Upstage and SK Telecom have stated that inference code is separate from the AI model itself. They said such code functions as a deployment or distribution layer that improves compatibility and usability, without affecting core model training or capabilities. The inference code used by Upstage is released under the MIT license, while SK Telecom's code falls under the Apache 2.0 license.

Both licenses allow free use, modification and commercial distribution with attribution. The Apache 2.0 license additionally includes an explicit patent grant and requires modified files to carry notices stating that they were changed.
 
Participants view Naver Cloud's booths on the sidelines of a presentation of homegrown AI foundation models at COEX in Seoul, Dec. 30, 2025. Yonhap
 
Naver Cloud's case has drawn closer scrutiny because it involved model weights rather than inference code. The company used Alibaba's Qwen 2.5-VL 32B vision encoder and weights, and an analysis found that its vision encoder weights have a cosine similarity of 99.5 percent with those of the Qwen series.
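
For readers unfamiliar with the metric, cosine similarity measures how closely two lists of numbers point in the same direction: a value near 1 (100 percent) means the two sets of weights are almost identical up to scale. The sketch below is only an illustration of the calculation. The function name and the toy values are hypothetical, and the article does not describe how the cited analysis was actually carried out.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length weight vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Hypothetical toy values, NOT real model weights.
reference = [0.12, -0.40, 0.33, 0.05]   # stand-in for a published encoder
near_copy = [0.12, -0.41, 0.33, 0.06]   # lightly adjusted copy of the reference
unrelated = [-0.30, 0.10, 0.02, 0.45]   # independently chosen values

print(cosine_similarity(reference, near_copy))  # close to 1
print(cosine_similarity(reference, unrelated))  # far from 1
```

In practice a comparison of this kind would be run over flattened weight tensors with millions of entries, but the arithmetic is the same.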

The 32B model is released under the Apache 2.0 license and can be freely used, while the larger 72B version requires a license request to Alibaba if monthly active users exceed 100 million. Naver Cloud has said it possesses comparable in-house technology and selected Qwen for ecosystem compatibility and system optimization, adding that it could switch to proprietary technology if licensing thresholds become applicable.

Some industry participants note that the ministry's project guidelines, announced in July, specify domestic model design and pre-training but do not explicitly require "from-scratch" development without any external components.

An industry expert said that while the use of open-source technology is common and generally uncontroversial in private-sector development, questions arise when government funding is involved, particularly in projects framed around technological independence.
 
 
Meanwhile, NC AI and LG AI Research have not been subject to similar scrutiny. NC AI said its VARCO model was developed independently, covering data collection, pre-training and tuning. LG AI Research's EXAONE was also developed without incorporating Chinese modules.

According to evaluation results from the first round, EXAONE ranked first in 10 of 13 benchmark categories and recorded the highest average score, 72 points. The model ranked seventh globally and first domestically in the Intelligence Index compiled by Artificial Analysis.

The ministry has so far declined to clarify whether the project requires models to be developed entirely from scratch. Industry observers say the criteria may become clearer once the first elimination is announced.

Some AI developers argue that debates over the origins of specific code components are less important than governance and control over data and deployment, noting that many countries developing AI systems rely on open-source technologies.

Copyright ⓒ Aju Press All rights reserved.