S. Korean researchers develop AI that transforms single-observer video into first-person perspective

By Park Sae-jin Posted : February 23, 2026, 15:31 Updated : February 23, 2026, 15:31
This file image shows how EgoX technology converts a third-person perspective into a first-person view. Courtesy of KAIST

SEOUL, February 23 (AJP) - Researchers at the Korea Advanced Institute of Science and Technology have developed an artificial intelligence model capable of converting a standard video of a person into a high-quality video showing exactly what that person was seeing. The technology, named EgoX, reconstructs a first-person (egocentric) perspective using only a single third-person video as its source.

This development addresses a significant hurdle in the fields of augmented reality, virtual reality, and robotics. While first-person data is essential for training AI and creating immersive content, it typically requires subjects to wear expensive action cameras or smart glasses. By eliminating the need for wearable hardware, the new model allows researchers to generate first-person data from the vast amount of existing video footage already recorded by outside observers.

The EgoX system works by analyzing the three-dimensional structure of the environment and the specific movements of the person in the video. Rather than simply rotating the camera angle, the AI calculates the person's exact position and posture to recreate the scene from their eyes. It specifically models the relationship between head movement and visual field, ensuring that when a person in a video turns their head, the generated first-person view shifts naturally.
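The head-pose-to-viewpoint relationship described above can be illustrated with a standard pinhole-camera projection. This is a generic geometric sketch, not KAIST's actual pipeline; the function name, camera convention, and parameter values are all illustrative:

```python
import numpy as np

def world_to_egocentric_pixel(point_w, head_pos, head_R,
                              fx=500.0, fy=500.0, cx=320.0, cy=240.0):
    """Project a 3D world point into a hypothetical first-person camera.

    head_R is the 3x3 rotation from world to head/camera coordinates;
    head_pos is the head position in world coordinates. Camera convention
    assumed here: x right, y down, z forward (a common pinhole setup).
    """
    # Express the point in the head's coordinate frame.
    p_cam = head_R @ (np.asarray(point_w, float) - np.asarray(head_pos, float))
    if p_cam[2] <= 0:
        return None  # the point is behind the viewer, so it is not visible
    # Pinhole projection onto the image plane.
    u = fx * p_cam[0] / p_cam[2] + cx
    v = fy * p_cam[1] / p_cam[2] + cy
    return (u, v)

# A point 2 m straight ahead of a head at the origin, looking down +z,
# lands at the image center; when the head rotates, the same world point
# moves across (or out of) the generated first-person frame.
print(world_to_egocentric_pixel([0.0, 0.0, 2.0], [0.0, 0.0, 0.0], np.eye(3)))
# -> (320.0, 240.0)
```

In this simplified view, estimating the person's head position and orientation from the third-person footage is exactly what determines where every scene point should appear in the reconstructed first-person frame.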

In testing across various daily activities such as cooking and exercising, the model maintained high visual quality without the glitches often found in earlier conversion attempts. Because it understands the geometry of the space, the AI can accurately estimate depth and surroundings even when the lighting or movement in the original video is complex.

The technology is expected to have an immediate impact on immersive media and robotics. In the metaverse and VR industries, it can turn standard broadcasts into experiences where users feel as if they are the protagonist. For robotics, it provides a way for machines to practice imitation learning by seeing human tasks from the correct physical perspective.

Professor Ju Jae-geul, who led the research, said the significance of the work lies in the AI learning to reconstruct human vision and spatial understanding. He noted that an environment is emerging in which anyone can produce immersive content from previously filmed videos.

The research was a collaborative effort involving Kang Tae-woong and Kim Ki-nam, doctoral students at KAIST, and Kim Do-hyun, an undergraduate researcher at Seoul National University. The findings were first shared on the preprint server arXiv on December 9, 2025, and will be officially presented at the IEEE/CVF Conference on Computer Vision and Pattern Recognition in Colorado on June 3, 2026.

The project was supported by the Ministry of Science and ICT through the National Research Foundation of South Korea and the Korea Institute of Science and Technology Information.

(Paper information)
Venue: The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Title: EgoX: Egocentric Video Generation from a Single Exocentric Video
Project page: https://keh0t0.github.io/EgoX/

Copyright ⓒ Aju Press All rights reserved.