A*STAR I²R, IMDA and AISG pioneer Southeast Asia’s first large language model
06 Dec 2023
A*STAR’s Institute for Infocomm Research (I²R) is partnering with the Infocomm Media Development Authority (IMDA) and AI Singapore (AISG) to launch Southeast Asia’s first large language model under Singapore’s National Multimodal LLM Programme (NMLP).
Funded by the National Research Foundation, Singapore (NRF), the new S$70 million initiative will develop Singapore’s research and engineering capabilities in multimodal large language models.
Because the cultures, values and norms of Singapore and the region differ from those of Western countries, where most large language models originate, one of the key areas to be explored under the NMLP is Southeast Asia’s first regional LLM.
The regional LLM will build on the early outcomes of AISG’s SEA-LION (Southeast Asian Languages in One Network) model, an open-source large language model that is more representative of Southeast Asia’s cultural contexts and linguistic nuances.
SEA-LION will draw on the expertise of A*STAR’s Institute for Infocomm Research (I²R) in speech and language research on Southeast Asian languages, which has been applied widely in transcription and translation for various agencies and private-sector companies. I²R’s multimodal speech-text foundation model could help identify non-verbal cues and enable SEA-LION to read user intent more closely.
The development of this LLM will be a strategic cornerstone for Singapore and countries in the region, giving us our own AI model specifically trained to understand the context and values of Southeast Asia’s diverse cultures and languages.
Read more about the National Multimodal LLM Programme (NMLP) here. The Straits Times coverage can be read here: $70m S’pore AI initiative to develop first large language model with South-east Asian context | The Straits Times