Zhang Huayun

I²R Staff Profile

Dr. Zhang Huayun
Principal Scientist (Aural and Language Intelligence)
Email: Zhang_Huayun@i2r.a-star.edu.sg

Research areas:
Audio Signal Processing, Acoustic Modelling, Speech Evaluation, Emotional Speech and Language Processing

Dr. Zhang Huayun is a Principal Scientist at the A*STAR Institute for Infocomm Research (A*STAR I²R). He is the tech lead for the AI in education collaboration with an educational agency in Singapore. Dr. Zhang received his B.E. degree in communications engineering in 1994 and his M.E. degree in communication and electronics in 1997, both from the Nanjing Institute of Communications Engineering, Nanjing, China. He received his PhD degree in Pattern Recognition and Intelligent Systems from the National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China.

Prior to joining A*STAR I²R, Dr. Zhang worked at several companies, including InfoTalk Corp (Singapore), where his work on multilingual speech recognition for embedded platforms won the Best Mobile Music Application Award from Nokia in 2006. His team was later acquired by Creative Technology, where it pioneered Creative's X-Fi audio processing technology, which won multiple awards for Creative at the CES show in Las Vegas. In 2011, he joined Vcyber Technology as a senior research manager, where his team developed commercial-grade speech and language technologies that enable speech navigation, information and entertainment, and other vehicle management services for the automobile industry. Their products and solutions have been adopted by major automakers in mainland China, such as SAIC-GM, SAIC-Volkswagen, and FAW-Mazda.

Patents

“A Method and Apparatus for Accessing an Audio File from A Collection Of Audio Files Using Tonal Matching”, Filed on 22 May 2007.
- US Patent number: US8892565 Granted: 18 November 2014
- CN Patent number: CN101454778 Granted: 07 December 2011
“Method for enlarging a location with optimal three-dimensional audio perception”, Filed on 01 Feb 2010.
- US Patent number: US9247369 Granted: 26 January 2016
- CN Patent number: CN102783187 Granted: 03 August 2016
- TW Patent number: TW100102445 Granted: 01 April 2016

Publications

Jeremy Heng Meng Wong, Huayun Zhang and Nancy Chen, Distilling knowledge from Gaussian Process Teacher to Neural Network Student, Interspeech 2023.
Perry Lam, Huayun Zhang, Nancy Chen and Berrak Sisman, EPIC TTS Models: Empirical Pruning Investigations Characterizing Text-To-Speech Models, Interspeech 2022.
Jeremy Heng Meng Wong, Huayun Zhang and Nancy Chen, Variations of multi-task learning for spoken language assessment, Interspeech 2022.
Huayun Zhang, Ke Shi, Nancy F. Chen, Multilingual Speech Evaluation: Case Studies on English, Malay and Tamil, Conference of the International Speech Communication Association (INTERSPEECH) 2021.
Ke Shi, Kyemin Tan, Huayun Zhang, Nancy F. Chen, WittyKiddy: Multilingual Spoken Language Learning for Kids, Conference of the International Speech Communication Association (INTERSPEECH) 2021.
Huayun Zhang, Jun Xu, “Pattern-Based Dynamic Compensation Towards Robust Speech Recognition in Mobile Environments”, International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2006.
Huayun Zhang, Jun Xu, “Cepstrum Interpolation towards Robust Speech Recognition over the Phone”, Proceedings of the IASTED International Conference on Signal Processing, Pattern Recognition, and Applications (SPPRA) 2006.
Huayun Zhang, Jun Xu, “An Investigation into Subspace Rapid Speaker Adaptation”, International Symposium on Chinese Spoken Language Processing (ISCSLP) 2004.
Huayun Zhang, Zhaobing Han, Bo Xu, “Statistic Model Based Dynamic Channel Compensation for Telephony Speech Recognition”, Chinese Journal of Electronics, Vol. 13, No. 4, 2004.
Huayun Zhang, Zhaobing Han, Bo Xu, "Dynamic Channel Compensation Based on Maximum A Posteriori Estimation", the 8th European Conference on Speech Communication and Technology (EUROSPEECH 2003).
Huayun Zhang, Bo Xu, "Geometric Constrained Maximum Likelihood Linear Regression On Mandarin Dialect Adaptation", the 8th European Conference on Speech Communication and Technology (EUROSPEECH 2003).
Zhaobing Han, Shuwu Zhang, Huayun Zhang, Bo Xu, “A Vector Statistical Piecewise Polynomial Approximation Algorithm for Environment Compensation in Telephone LVCSR”, International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2003.
Huayun Zhang, Zhaobin Han, Bo Xu, "CodeBook Dependent Dynamic Channel Estimation for Mandarin Speech Recognition over Telephone", International Conference on Spoken Language Processing (ICSLP) 2002.
Huayun Zhang, Bo Xu, Taiyi Huang, "Improving Performance of Telephone-Based Mandarin Speech Recognition", International Symposium on Chinese Spoken Language Processing (ISCSLP) 2002.
Yiyan Zhang, Wenju Liu, Bo Xu, Huayun Zhang, “Improving Parametric Trajectory Modeling by Integration of Pitch and Tone Information”, International Conference on Spoken Language Processing (ICSLP) 2002.
Zhaobing Han, Huayun Zhang, Bo Xu, “Structure-Based Compensation Using An Improved Statistical Linear Approximation for Mandarin Speech Recognition Over Telephone”, International Symposium on Chinese Spoken Language Processing (ISCSLP) 2002.
Huayun Zhang, Bo Xu, “Speech Recognition Techniques used in Network Speech Servers”, Proc. of the 7th National Conference of Youth Communications Scientists, Apr. 2001.
Huayun Zhang, Zhaobin Han, Bo Xu, “A Study of Speech Recognition in Telephone-Based Translation System”, Proc. of the 6th National Conference on Man to Machine Speech Communications, Oct. 2001.
Huayun Zhang, Xianzhi Chen, et al., “Perceptually Based ASR in Noisy Environment”, Proc. of the 4th National Conference on Man to Machine Speech Communications, Oct. 1996, Beijing.
