An LC–MS-based lipidomics pre-processing framework underpins rapid hypothesis generation towards CHO systems biotechnology

From left: Dr Ho Ying Swan, Yeo Hock Chuan and Chen Shuwen


Hock Chuan Yeo1,2, Shuwen Chen1, Ying Swan Ho1 and Dong Yup Lee1,3

1 Bioprocessing Technology Institute, Agency for Science, Technology and Research (A*STAR), Singapore
2 Department of Chemical and Biomolecular Engineering, National University of Singapore, Singapore
3 School of Chemical Engineering, Sungkyunkwan University, Republic of Korea

Published in Bioinformatics 2019 14: 98 (Online Version)


Chinese Hamster Ovary (CHO) cells are the most prevalent host for producing biotherapeutic drugs in the biologics industry today. To increase productivity, a holistic understanding of the mechanisms involved is necessary. This is achieved by comprehensively profiling the cell line during bioprocessing using ‘omics’ technologies, in order to elucidate its complete set of genes (genomics), mRNA expressions (transcriptomics), and metabolites (metabolomics and lipidomics), among others. In particular, lipid metabolites are analysed using a combined liquid chromatography–mass spectrometry (LC–MS) platform. However, pre-processing LC–MS data for molecular identification is challenging because of high dimensionality, the presence of elemental isotopes, noise, variability in machine performance, and the need for expert knowledge of metabolite-specific ionization patterns. For lipids, the presence of isomers presents a further challenge, resulting in ambiguity and the need for additional tandem mass spectrometry (MS2) analysis to confirm their actual identities.

In this study, we develop a fully integrated and automatable framework that combines knowledge of common ions and lipid family-specific species to improve lipid identification in terms of coverage and accuracy. Although the use of fragmentation patterns for lipid discovery had been suggested before, we were the first to demonstrate a proof-of-concept based on actual bioprocessing data, without prior knowledge of the underlying lipids. Using the method, we correctly identified 101 species from 18 classes in CHO cells (Fig. 1a), achieving an accuracy of ~80% (Fig. 1b). The resulting inferences could explain the recombinant-producing capability of CHO-SH87 cells, compared to non-producing CHO-K1 cells (Yusufi et al., 2017). For comparison, an independent study of the same dataset, based on freely available software and guided by the user’s ad hoc knowledge, confirmed fewer than 60 species of 12 classes from thousands of possibilities. To conclude, we describe a systematic LC–MS-based framework that identifies lipids for rapid hypothesis generation in CHO cells.
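As a rough illustration of how knowledge of common ions can support identification, the sketch below computes the theoretical m/z of a few widely used adducts from a neutral monoisotopic mass and matches them against observed peaks within a ppm tolerance. It is not the framework described in the paper: the adduct selection, the 10 ppm tolerance, the function names and the example masses are all assumptions made for illustration.

```python
# Minimal sketch (assumed, not the authors' implementation): match observed
# LC-MS peaks against the theoretical m/z of a few common adduct ions.

PROTON = 1.007276  # mass of a proton in Da

# Mass shifts (Da) added to the neutral monoisotopic mass for singly charged
# adducts. The selection of adducts here is illustrative only.
ADDUCT_SHIFTS = {
    "[M+H]+": PROTON,
    "[M+NH4]+": 18.033823,
    "[M+Na]+": 22.989218,
    "[M-H]-": -PROTON,
}


def adduct_mz(neutral_mass):
    """Return the theoretical m/z of each adduct for a neutral mass."""
    return {name: neutral_mass + shift for name, shift in ADDUCT_SHIFTS.items()}


def ppm_error(observed, theoretical):
    """Signed mass error of an observed peak in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def match_adducts(observed_peaks, neutral_mass, tol_ppm=10.0):
    """List (adduct, peak) pairs where a peak lies within tol_ppm of the adduct m/z."""
    hits = []
    for name, mz in adduct_mz(neutral_mass).items():
        for peak in observed_peaks:
            if abs(ppm_error(peak, mz)) <= tol_ppm:
                hits.append((name, peak))
                break  # keep only the first matching peak per adduct
    return hits


# Hypothetical example: a candidate lipid of neutral mass 759.5778 Da and
# three made-up observed peaks.
hits = match_adducts([760.5851, 782.5670, 500.0], 759.5778)
```

Observing several adducts of the same candidate mass is one simple way to raise confidence in an assignment; a real pipeline would also need to account for isotopes, retention time and acquisition mode.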

Yusufi, F. N. K., Lakshmanan, M., Ho, Y. S., Loo, B. L. W., Ariyaratne, P., Yang, Y., Ng, S. K., Tan, T. R. M., Yeo, H. C., Lim, H. L., Ng, S. W., Hiu, A. P., Chow, C. P., Wan, C., Chen, S., Teo, G., Song, G., Chin, J. X., Ruan, X., Sung, K. W. K., Hu, W.-S., Yap, M. G. S., Bardor, M., Nagarajan, N., Lee, D.-Y., 2017. Mammalian Systems Biotechnology Reveals Global Cellular Adaptations in a Recombinant CHO Cell Line. Cell Systems. 4, 530-542.e6.

Figure 1. Effectiveness of framework.

(a) Overall lipid coverage from both producer cells and non-producer cells (positive and negative acquisition modes). For GPL, there is one identified phosphatidylglycerol (PG), lyso-PC and semilysobisphosphatidic acid species. Bottom left panels: number of species identified in each acquisition mode at the predefined confidence level; corresponding numbers with sufficient intensity for verification by either the MS2 technique or spectral signature; final confirmed number of lipids. Four species (fatty alcohols and cholesterols) are not within the scope of our study. (b) Number and proportion of true positive predictions at score thresholds of 6 (two cartoon peaks), 9 (three cartoon peaks) and 12 (four cartoon peaks).
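The quantity shown in panel (b) can be sketched as follows, assuming that a prediction counts towards a threshold when its score is at least that value and that each prediction has been marked as confirmed or not. The function name, the inclusive cut-off and the example data are hypothetical and do not reproduce the study's results.

```python
# Minimal sketch (assumed): number and proportion of true positive
# predictions among those scoring at or above each threshold.

def true_positives_at_thresholds(predictions, thresholds=(6, 9, 12)):
    """predictions: iterable of (score, confirmed) pairs, confirmed being a bool.

    Returns {threshold: (n_true_positives, proportion)}; the proportion is
    None when no prediction reaches the threshold.
    """
    predictions = list(predictions)
    summary = {}
    for threshold in thresholds:
        kept = [confirmed for score, confirmed in predictions if score >= threshold]
        n_true = sum(kept)  # True counts as 1
        proportion = n_true / len(kept) if kept else None
        summary[threshold] = (n_true, proportion)
    return summary


# Made-up scores and verification outcomes, for illustration only.
example = [(6, False), (7, True), (9, True), (10, False), (12, True), (13, True)]
summary = true_positives_at_thresholds(example)
```

Raising the threshold keeps fewer predictions but typically a larger share of confirmed ones, which is the trade-off the panel is meant to convey.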