Naver develops AI voice synthesizing tech capable of expressing emotions

By Park Sae-jin Posted : November 14, 2019, 14:55 Updated : November 14, 2019, 14:55

[Courtesy of Naver]

SEOUL -- Naver, a top web portal operator in South Korea, has developed an artificial intelligence-based voice synthesis technology that uses a real person's voice to create a synthesized voice capable of expressing emotions. The technology can be adopted by call centers and audiobook makers to give customers more realistic services.

On Thursday, Naver unveiled natural end-to-end speech synthesis (NES) through the website of Clova, an AI voice assistant service. "Everyone can make voice fonts easily and conveniently," Naver Clova Voice research head Kim Jae-min was quoted as saying. Naver plans to add more features such as the voices of popular figures and various emotions to NES.

Naver said that an AI-synthesized voice can be created from about 40 minutes of voice recordings (about 400 sentences). Similar technologies developed by tech companies have so far needed to analyze at least 40 hours of actual voice recordings to create an artificial voice.

NES can control the emotions of the artificial voice to make it sound happy or sad. Synthesized voice technology can be useful in customer service, e-books and other sectors. In November last year, Naver released an audiobook service featuring the synthesized voice of Yoo In-na, a 36-year-old actress who served as a radio DJ. The service was built with Hybrid DNN Text-to-Speech (HDTS), a technology that converts text into synthesized speech.

Copyright ⓒ Aju Press All rights reserved.
