KAIST develops AI that learns across hospitals and banks without sharing private data

By Park Sae-jin Posted : October 15, 2025, 15:54 Updated : October 15, 2025, 15:54
This diagram shows how KAIST’s new "FedLoG" system works in a federated learning setup with two participating clients and three data categories. Each client trains its own model locally while sharing only essential information, allowing the system to learn from all participants without exchanging personal data.

SEOUL, October 15 (AJP) - KAIST researchers have created a new kind of artificial intelligence that can learn from multiple institutions, such as hospitals and banks, without ever sharing personal information. The technology is based on "federated learning," a method in which different organizations train a shared AI model on their own data locally instead of sending it all to one place.
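The basic idea can be sketched in a few lines. This is a generic illustration of one federated-learning round, not KAIST's FedLoG method itself: each "client" fits a simple model on its own private data and shares only the learned parameters, which a server then averages (the weighted averaging step here follows the common FedAvg pattern). All names and the linear-model setup are illustrative assumptions.

```python
import numpy as np

def local_train(X, y):
    """Least-squares fit on a client's private data.
    Only the weight vector ever leaves the client; X and y stay local."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

def federated_average(client_weights, client_sizes):
    """Server-side aggregation: average client weights,
    weighted by each client's dataset size (FedAvg-style)."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])

# Two "institutions" holding private datasets drawn from the same task.
clients = []
for n in (50, 100):
    X = rng.normal(size=(n, 2))
    y = X @ true_w + 0.01 * rng.normal(size=n)
    clients.append((X, y))

# Each institution trains locally; raw records are never exchanged.
weights = [local_train(X, y) for X, y in clients]
global_w = federated_average(weights, [len(y) for _, y in clients])
```

In a real deployment this exchange repeats over many rounds, with each client starting from the latest global model, but the privacy property is the same: only model parameters cross institutional boundaries.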

The team led by Professor Park Chan-young from KAIST's Department of Industrial and Systems Engineering found a way to fix a major weakness in existing federated learning systems. Normally, when each institution adjusts the shared AI model to fit its own environment, the AI becomes too specialized and loses its ability to handle new situations. This problem, called "local overfitting," makes the AI less useful outside of one organization's data.

For example, if several banks develop a shared AI for loan evaluations and one bank fine-tunes it using only large corporate customer data, the AI might work well for those clients but perform poorly when reviewing small business or personal loans.

To solve this, Professor Park's team turned to "synthetic data." Instead of using actual personal data, they generated artificial datasets that imitate key patterns found in the real data but contain no private information. This allows each organization to fine-tune the AI for its own use while keeping privacy intact and preserving the AI's ability to generalize across different data sources.
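The intuition behind synthetic-data fine-tuning can be illustrated in miniature. The sketch below is a loose analogy, not the paper's actual subgraph-based FedLoG procedure: a client whose local data covers only one narrow region (like the bank that only sees large corporate customers) mixes in artificial samples drawn from aggregate statistics, so fine-tuning no longer collapses onto the local distribution. The summary statistics and all variable names here are hypothetical.

```python
import numpy as np

def synthesize(mean, cov, n, rng):
    """Draw artificial samples that imitate aggregate statistics
    (mean and covariance) without exposing any real record."""
    return rng.multivariate_normal(mean, cov, size=n)

rng = np.random.default_rng(1)

# Pretend these summary statistics describe the global data distribution
# learned across all participants (no raw records involved).
global_mean = np.array([0.0, 0.0])
global_cov = np.eye(2)

# A client whose local data sits in one corner of the input space.
local_X = rng.normal(loc=[3.0, 3.0], scale=0.3, size=(40, 2))

# Fine-tuning on local_X alone would overfit to that corner; mixing in
# synthetic samples restores coverage of the broader input space.
synthetic_X = synthesize(global_mean, global_cov, 40, rng)
finetune_X = np.vstack([local_X, synthetic_X])
```

The augmented fine-tuning set sits between the client's narrow local distribution and the global one, which is the generalization effect the article describes.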

Tests showed that this method worked well not only for secure fields like healthcare and finance but also for fast-changing areas such as social media and online shopping. The AI kept its performance stable even when new institutions joined or when the data environment changed quickly.

Professor Park said the research offers a new approach for developing AI that protects privacy without giving up performance. He said it could help fields like medical diagnostics and financial fraud detection, where sharing sensitive data has always been difficult.

The study was led by graduate student Kim Sung-won, with Professor Park as the corresponding author. Their paper, titled "Subgraph Federated Learning for Local Generalization," was presented at the International Conference on Learning Representations (ICLR) 2025 in Singapore, one of the world's leading AI conferences. It was selected as an oral presentation, an honor given to only about 1.8 percent of papers submitted.