Knowledge-Driven Multimodal Representation Learning

[CFAR Rising Star Lecture Series]
Knowledge-Driven Multimodal Representation Learning by Assoc Prof Xie Weidi
31 Mar 2023 | 10.30pm (Singapore Time)

In recent years, foundation models have shown tremendous success. In essence, these models, trained on web data, have been shown to encode a large amount of human knowledge. For example, ChatGPT and GPT-4 are able to freely converse with humans on most topics.

In this talk, Associate Prof Xie Weidi from Shanghai Jiao Tong University will introduce some of his recent works on exploiting the knowledge within foundation models to expand the ability of existing computer vision systems towards open-vocabulary scenarios — for example, action recognition, object detection, segmentation, and audio description for movies. He will also discuss his recent research focus on AI4Science, specifically representation learning in the medical domain, which requires a large amount of human knowledge, particularly in medical image analysis, disease prediction, and clinical decision-making.


SPEAKER
Xie Weidi
Associate Professor
Shanghai Jiao Tong University 
Principal Investigator
Shanghai AI Laboratory

Xie Weidi is an Associate Professor at Shanghai Jiao Tong University. He is also a Principal Investigator (PI) at the Shanghai AI Laboratory and a visiting researcher with the Visual Geometry Group (VGG) at the University of Oxford, where he completed his PhD in 2018 under the supervision of Professor Andrew Zisserman and Professor Alison Noble. Weidi was a recipient of the Oxford-Google DeepMind Graduate Scholarship, the China Oxford Scholarship Fund (COSF), and the Excellence Award from the Department of Engineering Science at the University of Oxford. After completing his PhD, Weidi became a Senior Research Fellow within the same group, working on self-supervised training and multimodal representation learning. He has published over 40 papers, with over 6,500 citations, and serves as an Area Chair for CVPR 2023 and NeurIPS 2023.