The A*STAR Career Development Fund (CDF) aims to support promising early-career researchers in developing their careers in A*STAR by providing project management experience and seed funding.
Congratulations to the following recipients of the A*STAR Career Development Fund (CDF) 2024:
Dr He Xin, Scientist
Algorithm-System Co-Design for Efficient and IP-Protected LLMs: From Model Optimisation to Cluster Deployment
As large language models (LLMs) continue to grow in size, complexity, and commercial value, concerns around computational efficiency and intellectual property (IP) protection have become increasingly prominent. Dr He Xin’s project explores a novel algorithm-system co-design approach to address these issues. The research aims to improve the performance and deployability of LLMs while ensuring their originality and ownership can be reliably protected, even in resource-limited deployment environments. This work seeks to strengthen trust in LLM-based systems and support responsible, secure AI development across both academic and industrial settings.
Dr He Yang, Scientist
More is Less: Leveraging Large Language Models to Compress Vision Models
As vision models become essential in daily life, their expanding use on resource-limited devices creates an urgent need for more efficient solutions. Dr He Yang proposes bridging large language models with small vision models through three strategies: using language models to create more effective training datasets, implementing a streamlined training process that modifies far fewer parameters, and making smarter pruning decisions for faster model inference. This comprehensive approach supports Singapore's Smart Nation initiatives by enabling AI systems that require less storage, learn more efficiently, and respond more quickly across sectors including transportation, healthcare, and manufacturing.
Dr Lyu Yueming, Scientist
Nonparametric Distributional Black-box Optimisation for Diffusion-model Target Generation
Guided diffusion-model generation presents a promising avenue for customising the generation process to satisfy users’ requirements and preferences. However, both the theoretical foundations and practical algorithms for query-efficient black-box target generation remain largely unexplored. Dr Lyu Yueming's project aims to develop a nonparametric distributional black-box optimisation framework to analyse and design diffusion-based target generation in a theoretically rigorous manner. This framework will establish a solid foundation for advancing both target generation and black-box optimisation, potentially benefiting a wide range of downstream tasks.
Dr Qian Hangwei, Scientist
Multi-Modal Composite Material Design: Benchmark, Alignment, and Transfer
Conventional material design remains an extremely challenging and slow process, often taking 20 to 30 years from initial design to final deployment. Artificial Intelligence (AI) methods have the potential to drastically shorten this cycle to a few years, or even months. Dr Qian Hangwei's proposal seeks to expedite research on composite material design by developing a comprehensive multi-modal learning framework. The proposed research brings together interdisciplinary ideas from materials science, machine learning, and explainable AI, and will establish a strong foundation for AI in Scientific Discovery.
Dr Zhang Xin, Scientist
Distilling in Regions: A Novel Multimodal Dataset Distillation Framework
Dr Zhang Xin's proposal aims to address the unique challenges of dataset distillation in multimodal scenarios. While existing techniques predominantly focus on unimodal data such as images, this work proposes a region-aware framework to distill multimodal datasets, particularly those combining images and text. The approach seeks to enhance alignment between visual regions and their corresponding textual descriptions by identifying and aggregating semantically rich feature clusters. By leveraging spatial semantics and cross-modal attention mechanisms, the method aims to generate compact yet highly representative synthetic datasets. The proposal has the potential to significantly reduce training cost while preserving performance, offering a scalable solution for efficient multimodal learning.