Korean-Filipino Parallel Corpus Development

The Korean-Filipino Parallel Corpus Development is part of a project initiated by South Korea’s National Institute of Korean Language (국립국어원) to develop a large-scale Korean-Foreign Languages Parallel Corpus. The project covers eight languages: (1) Vietnamese, (2) Bahasa Indonesia/Malaysia, (3) Thai, (4) Hindi, (5) Khmer, (6) Russian, (7) Uzbek, and (8) Filipino. The corpus will be used to improve AI-based NLP technology, with the goal of enhancing the quality of intercultural communication and machine translation. The project also aims to contribute to increased cultural and trade exchanges between Korea and Southeast Asian and Eurasian countries, in line with South Korea’s New Southern and New Northern Policies.

This long-term project began in October 2021.

The parallel corpus will eventually be made available to the public.

Project Members