The Aural & Language Intelligence (ALI) department of A*STAR’s Institute for Infocomm Research (I2R) and the Translation Department of the Ministry of Communications and Information (MCI) are developing a Machine Translation (MT) engine for translating localised content. Supported by the Translational Research and Development for Application to Smart Nation (TRANS) Grant*, the MT engine will serve government agencies across the whole of government as part of the Smart Nation Digital Government Initiative.
The MT engine produces translations faster and renders Government terms and messages consistently, so that the translations suit the local context. For example, the engine can correctly identify and translate names of organisations such as the Housing and Development Board (HDB) as “建屋发展局” or the Public Utilities Board (PUB) as “公用事业局”. Names and terms that are unique to Singapore often translate poorly with machine translation tools currently available in the market, such as Google Translate, Bing Translator and Babylon.
To help the MT engine recognise local context, A*STAR I2R trained it on a repository of human translations of local terms and sentences drawn from Government public communications materials. The training data set was provided by MCI’s Translation Department and included letters, notices, posters, pamphlets, advertisements and other materials.
Furthermore, A*STAR I2R’s ALI team enriched the embeddings of these local terms and biased the training towards local content so that the engine would be sensitive to the content of Government communications. This ensures that localised terms such as the names of Government organisations, places of interest, landmarks and road names in Singapore are translated relevantly and correctly. For example, the engine translates the abbreviation “MOM” as “人力部” (“Ministry of Manpower”) rather than “妈妈” (“mummy”). It can also translate place names like “武吉士” (“Bugis”) correctly.
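The consistency described above can be spot-checked after translation. The Python sketch below is only an illustration and is not how the I2R engine works internally (the engine relies on enriched embeddings and biased training, as described above). The glossary, the function name `missing_terms` and the sample sentences are assumptions made for this example; the glossary contains only the term pairs quoted in this article.

```python
import re

# Hypothetical glossary built only from the examples quoted in this article.
GLOSSARY = {
    "HDB": "建屋发展局",
    "PUB": "公用事业局",
    "MOM": "人力部",
}


def missing_terms(source_en, translation_zh, glossary=GLOSSARY):
    """Return abbreviations that appear in the English source but whose
    expected Chinese rendering is absent from the translation."""
    missing = []
    for abbr, expected in glossary.items():
        # Match the abbreviation as a whole word, case-sensitively,
        # so that "Mom" or "PUBLIC" are not treated as glossary hits.
        in_source = re.search(rf"\b{re.escape(abbr)}\b", source_en)
        if in_source and expected not in translation_zh:
            missing.append(abbr)
    return missing


if __name__ == "__main__":
    source = "MOM issued a notice."
    print(missing_terms(source, "妈妈发出了通知。"))    # generic rendering of "MOM"
    print(missing_terms(source, "人力部发出了通知。"))  # localised rendering
```

A check like this only flags missing terms; it says nothing about the fluency or accuracy of the rest of the sentence.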
The MT engine is benchmarked periodically against reputable translation engines such as Google, Bing and Baidu to ensure the quality of its translation of local content. Several translation engines were tested on their Chinese-to-English translation of content related to Government communications, and A*STAR I2R’s MT engine excelled across various performance measures (refer to table below).
*Conducted in October 2017. BLEU, NIST and METEOR measure how closely a machine translation matches reference translations (higher scores are better), while TER measures the translation error rate (lower scores are better).
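To make the two kinds of measure concrete, here is a minimal Python sketch of the ideas behind them. It is not the evaluation code used in the benchmark: `ngram_precision` computes only the clipped n-gram precision that BLEU builds on (without BLEU's brevity penalty or the combination of several n-gram orders), and `word_error_rate` is a simplification of TER that counts word insertions, deletions and substitutions but omits TER's block shifts. The function names and sample sentences are assumptions made for this example.

```python
from collections import Counter


def ngram_precision(candidate, reference, n=1):
    """Clipped n-gram precision: the share of candidate n-grams that
    also occur in the reference, counting each reference n-gram at
    most as often as it appears there. Higher is better."""
    cand = [tuple(candidate[i:i + n]) for i in range(len(candidate) - n + 1)]
    if not cand:
        return 0.0
    ref = Counter(
        tuple(reference[i:i + n]) for i in range(len(reference) - n + 1)
    )
    matched = sum(min(count, ref[gram]) for gram, count in Counter(cand).items())
    return matched / len(cand)


def word_error_rate(candidate, reference):
    """Word-level edit distance divided by the reference length.
    Lower is better. Unlike real TER, block shifts are not modelled."""
    prev = list(range(len(reference) + 1))
    for i, cand_word in enumerate(candidate, start=1):
        curr = [i]
        for j, ref_word in enumerate(reference, start=1):
            cost = 0 if cand_word == ref_word else 1
            curr.append(min(prev[j] + 1,          # drop a candidate word
                            curr[j - 1] + 1,      # add a reference word
                            prev[j - 1] + cost))  # keep or substitute
        prev = curr
    return prev[-1] / len(reference)


if __name__ == "__main__":
    reference = "the ministry of manpower issued a notice".split()
    localised = "the ministry of manpower issued a notice".split()
    generic = "the mummy issued a notice".split()
    for name, hyp in [("localised", localised), ("generic", generic)]:
        print(name, ngram_precision(hyp, reference), word_error_rate(hyp, reference))
```

In this toy case the localised output matches the reference exactly, while the generic output loses unigram precision and needs three word edits to reach the seven-word reference.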
The MT engine will continue to be refined and trained on a wider range of localised terms and phrases, while A*STAR I2R’s ALI team progressively expands its capabilities to English-Malay and English-Tamil translation.
*This project is supported by the Smart Nation and Digital Government Group (SNDGG) and the National Research Foundation (NRF), under the Public Sector Translational R&D Grant Funding Initiative (TRANS Grant). The funding initiative aims to tap the research community to solve public sector challenges through the innovative use of digital technologies.
You can see what this translation engine is used for on our I2R Facebook page. Read more:
The Straits Times (4 March 2020): AI-powered engine to lift local translation standards
Also reported in:
Channel News Asia, Lianhe Zaobao, Berita Harian, Channel 8 News, SURIA and CNA 938