[INTERVIEW] Kakao researcher hopes to develop hyperscale AI models with commercial quality and usability

By Lim Chang-won Posted : February 17, 2022, 16:02 Updated : February 18, 2022, 07:52

[Photo by Yoo Dae-gil dbeorlf123@ajunews.com]

SEOUL -- Hyperscale computing is massive-scale computing that is typically used for big data processing and cloud computing. The approach is well suited to running the infrastructure behind information technology services offered by global companies. AI sovereignty has been a national goal in South Korea, which is striving to catch up with the United States and China in the international race for AI supremacy.

Kakao Brain, the AI research unit of South Korean web service company Kakao, is among the South Korean tech firms that ultimately hope to develop an artificial brain: software and hardware with cognitive abilities similar to those of the human brain. The company's short-term goal, however, is to develop a machine learning inference platform.

Machine learning inference is the process of feeding live data points into a trained machine learning model to calculate an output such as a single numerical score. When a machine learning model is running in production, it is often described as AI because it performs functions similar to human thinking and analysis. In practice, machine learning inference entails deploying a software application into a production environment.
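As a rough illustration of inference producing a single numerical score, here is a minimal Python sketch. It assumes a tiny logistic model whose weights were fixed by earlier training; the weights, the bias and the `infer` function are hypothetical and are not taken from Kakao Brain's platform.

```python
import math

# Hypothetical parameters standing in for a model that was trained earlier.
WEIGHTS = [0.8, -0.4, 0.2]
BIAS = -0.1


def infer(features):
    """Return a single score between 0 and 1 for one live data point."""
    if len(features) != len(WEIGHTS):
        raise ValueError("expected %d features" % len(WEIGHTS))
    # Weighted sum of the inputs, then a logistic squashing function.
    z = BIAS + sum(w * x for w, x in zip(WEIGHTS, features))
    return 1.0 / (1.0 + math.exp(-z))


# One incoming data point is scored without any further training.
score = infer([1.0, 2.0, 0.5])
```

A production inference platform would wrap a call like this in a service that handles many requests quickly, which is the kind of engineering work Kim describes.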

Kakao Brain's chief technology officer Kim Kwang-seob said his company focuses on developing products or platforms that utilize AI source technologies for commercial purposes. "We are conducting research and development to provide people with the better performance of AI models that have become bigger than in the past," he said in an interview with Aju Business Daily. 

The work involves getting any complex model to deliver its performance quickly enough to be used in real services, Kim said, promising to open the machine learning inference platform to outside users. "I think it is important not just to provide technology to the outside world, but to meet the right level for service."

"One axis is to create a platform that guarantees commercial quality and usability with AI models, while the other axis is to directly develop services for end-users," Kim said, suggesting that his company will be able to release an English-related app during the first half of 2022. "This is just the beginning and we will showcase much more AI-based services. If only the purpose can be achieved, it does not have to be based on a super-large model."

In late 2021, Kakao Brain unveiled two hyperscale open-source AI models, KoGPT and minDALL-E. KoGPT is a Korean-language AI model based on GPT-3, an autoregressive language model created by OpenAI, a San Francisco-based AI research lab. GPT-3, which has 175 billion parameters, uses deep learning to produce human-like text.
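"Autoregressive" means the model produces text one token at a time, with each new token chosen based on what has already been written. The Python sketch below illustrates only that loop. The probability table is hypothetical and conditions on just the previous token, whereas a model such as GPT-3 computes its probabilities with a neural network over the whole preceding context.

```python
# Hypothetical next-token probabilities, keyed by the previous token.
NEXT_TOKEN_PROBS = {
    "<start>": {"the": 0.9, "a": 0.1},
    "the": {"model": 0.7, "text": 0.3},
    "a": {"model": 1.0},
    "model": {"writes": 0.8, "<end>": 0.2},
    "writes": {"text": 0.6, "<end>": 0.4},
    "text": {"<end>": 1.0},
}


def generate(max_tokens=10):
    """Greedily generate tokens, feeding each output back in as input."""
    tokens = ["<start>"]
    for _ in range(max_tokens):
        probs = NEXT_TOKEN_PROBS[tokens[-1]]
        # Greedy decoding: pick the most probable next token.
        next_token = max(probs, key=probs.get)
        if next_token == "<end>":
            break
        tokens.append(next_token)
    return tokens[1:]  # drop the start marker
```

Calling `generate()` walks the table until it reaches the end marker or the token limit. Real systems usually sample from the probabilities rather than always taking the top choice.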

KoGPT performs tasks such as summarizing long sentences, predicting conclusions, and understanding context to answer questions. Because it can write automatically based on context, it can handle high-level language tasks and be used in various fields. Meanwhile, minDALL-E is a text-to-image generation model trained on 14 million image-text pairs. The AI model is capable of understanding the context of users’ requests and creating a whole new image.

A third AI model capable of performing more diverse tasks will be released in the first quarter of 2022, Kim said. "Similar to minDALL-E, it will be possible to simultaneously learn language texts and further understand the principles so that image text classification, as well as positive and negative judgments, can be made with this one model."

Because research on hyperscale AI models requires a lot of resources and money, Kim advocated building an independent data center as a cost-effective approach over the long term. Kakao Brain is also working on digital humans that combine artificial characters with virtual human figures and travel around metaverse spaces.

The metaverse world is a collective virtual shared space, created by the convergence of virtually enhanced physical reality and physically persistent virtual space. Various South Korean companies have actively adopted metaverse platforms to bridge the gap between online and offline spaces. "Just as people, in reality, look different, the character that expresses me in the metaverse world must have different characteristics to create a sense of immersion. Beyond the uncanny valley, we are studying high-level digital human technology that makes us feel comfortable," Kim said.

The uncanny valley is a hypothesized relationship between an object's degree of resemblance to a human being and the emotional response to the object. The concept suggests that humanoid objects that imperfectly resemble actual human beings provoke uncanny or strangely familiar feelings of uneasiness and revulsion in observers. 

"People's facial expressions should change to suit music and vocalization when singing to suit the languages of each country. It is extremely difficult to implement this in an existing way," Kim said, adding that a clue to solving such problems may come from "neural rendering," which combines traditional graphics technology and deep learning technology. Neural rendering combines ideas from classical computer graphics and machine learning to create algorithms for synthesizing images from real-world observations.

(This story is based on a Korean-language interview conducted by Aju Business Daily reporter Im Min-cheol.)  