Protein Sequence Analysis


Our group's expertise is in computational protein sequence and structure analysis. We predict many aspects of molecular and cellular function (enzymatic activities, posttranslational modifications, cleavage, translocation signals, 3D structures, effects of mutations, phylogenetic relationships, cellular pathways, etc.) to discover the molecular mechanisms behind biological and clinical phenotypes, and we validate these predictions experimentally together with collaborators. Our repertoire of computational analysis methods is applicable across multiple research areas, but our main focus currently is on infectious diseases, human mutations, allergy and enzyme function prediction.

Infectious Diseases

Infectious disease research has been one of our traditional strongholds since the swine flu pandemic in 2009. Our FluSurver is the most complete one-stop influenza mutation analysis tool and is used by researchers and surveillance experts globally. We have several published and ongoing projects with the WHO CC in Australia and National Influenza Centres relating to influenza drug resistance, viral fitness, host specificity and antigenic changes. The FluSurver is also a primary analysis tool for GISAID, the most complete influenza database, which is also known for always hosting the latest outbreak sequences.

In one of our influenza research highlights of 2019, we developed a new approach that could reduce the need for animal studies aimed at understanding which influenza virus mutations change host specificity and allow the virus to adapt to replication in mammalian hosts. Such gain-of-function experiments have been in the spotlight of government bans over safety concerns. As a safe, higher-throughput alternative, researchers from BII, Harvard and Amsterdam Medical Centre explored the possibility of using readily available passage bias data from 80,000 seasonal surveillance influenza strains shared via GISAID that were grown either in mammalian cells or in eggs. Using a statistical approach to identify host adaptation sites from this data, we found that passage bias information can recover the known sites and also provide new candidate sites for host specificity changes, aiding risk assessment for emerging strains. In other notable infectious disease work published in 2019, we identified, and verified in vivo, drugs approved for other diseases that can be repurposed against influenza, and we helped characterize intense interseasonal influenza outbreaks in Australia. Led by former PhD student Alvin Han, our group also published the novel methods PhyCLIP and Phydelity for parameter-free phylogenetic clustering and identification of transmission chains, respectively.
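The passage-bias idea described above can be illustrated with a minimal sketch. It is not the published method: it simply assumes that each aligned sequence is labelled with its passage host ("egg" or "cell") and applies a two-sided Fisher exact test per alignment column and residue, with a Bonferroni correction. The function names, the toy sequences and the significance level are hypothetical choices made for illustration only.

```python
from collections import Counter
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum over all tables that are at most as likely as the observed one.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def passage_bias_sites(records, alpha=0.05):
    """Return (site, residue, p) for residues enriched in one passage host.

    records: list of (passage, aligned_sequence), passage is "egg" or "cell".
    Sites are 1-based alignment columns; all sequences must share one length.
    """
    egg = [seq for passage, seq in records if passage == "egg"]
    cell = [seq for passage, seq in records if passage == "cell"]
    tests = []
    for i in range(len(records[0][1])):
        egg_counts = Counter(seq[i] for seq in egg)
        cell_counts = Counter(seq[i] for seq in cell)
        residues = set(egg_counts) | set(cell_counts)
        if len(residues) < 2:
            continue  # invariant column, nothing to test
        for aa in sorted(residues):
            a, c = egg_counts[aa], cell_counts[aa]
            p = fisher_exact_two_sided(a, len(egg) - a, c, len(cell) - c)
            tests.append((i + 1, aa, p))
    threshold = alpha / max(1, len(tests))  # Bonferroni correction
    return [t for t in tests if t[2] <= threshold]


# Toy data (hypothetical): column 2 is mostly K in eggs and mostly R in cells.
toy = ([("egg", "AKT")] * 9 + [("egg", "ART")]
       + [("cell", "ART")] * 9 + [("cell", "AKT")])
for site, residue, p in passage_bias_sites(toy):
    print(site, residue, p)
```

A real analysis would also have to account for phylogenetic relatedness between strains and for uneven sampling, which this sketch ignores.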

Because we can go quickly from genomes to protein structures through computational modelling, often requiring only the new sequences as input, our group offers powerful support in infectious disease surveillance and rapid outbreak investigations, helping to get a quick handle on pathogens here and around the world. Besides influenza, we have also helped characterize MERS, Ebola, HIV, Noro, Adeno, Hepatitis C, West Nile, Dengue and Zika viruses. Through close collaboration with the National Public Health Laboratory at the National Centre for Infectious Diseases of the Ministry of Health, we contribute our knowledge and computational expertise at the national frontline for infectious disease surveillance.

With the arrival of a new pandemic caused by the new coronavirus behind COVID-19, the group has once again shown its value by reacting early and fast in the outbreak, not only helping to share and analyse genomes globally via the GISAID platform but also working with other groups to quickly develop diagnostic tools, repurpose treatment options and track mutations of the virus to understand global and local transmission and monitor phenotypic changes.

Human Mutations, Allergy and Enzyme Function Prediction

We aim to bridge the gap from nucleotide variation to protein structures in order to interpret the effects of human mutations. For example, we have helped clinical collaborators analyze variants found in patients and tried to explain mechanistically their possible role in a range of diseases such as cancer, myopia, leprosy and atopic dermatitis. We are participating in the National Precision Medicine Programme to help map mutations onto 3D protein structures relative to drug binding sites, work that also contributed to the prestigious Cell paper by our colleagues from GIS.
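As a rough illustration of what mapping mutations onto a structure relative to a drug binding site can involve, the sketch below reports the shortest distance between each mutated residue and a bound ligand. It is an assumption-laden toy, not the group's pipeline: the data layout (residue number mapped to atom coordinates), the function names and the 5 Å contact cutoff are all hypothetical.

```python
from math import dist


def min_distance(residue_atoms, ligand_atoms):
    """Shortest atom-to-atom distance between a residue and a ligand."""
    return min(dist(a, b) for a in residue_atoms for b in ligand_atoms)


def mutations_near_ligand(structure, ligand_atoms, mutated_positions, cutoff=5.0):
    """Map each mutated position to (distance, within_cutoff).

    structure: dict of residue number -> list of (x, y, z) atom coordinates.
    cutoff: contact distance in angstroms (5.0 is an illustrative assumption).
    Positions that are not resolved in the structure are skipped.
    """
    report = {}
    for position in mutated_positions:
        atoms = structure.get(position)
        if atoms is None:
            continue  # residue not present in the structural model
        d = min_distance(atoms, ligand_atoms)
        report[position] = (d, d <= cutoff)
    return report


# Toy coordinates (hypothetical): residue 10 sits next to the ligand,
# residue 50 is far away, residue 99 is missing from the model.
toy_structure = {10: [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)], 50: [(20.0, 0.0, 0.0)]}
toy_ligand = [(4.0, 0.0, 0.0)]
print(mutations_near_ligand(toy_structure, toy_ligand, [10, 50, 99]))
```

In practice the coordinates would come from an experimental structure or a homology model, and proximity alone is only a first hint that a variant might affect drug binding.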

In our ongoing flagship industry project, the multinational Procter & Gamble and BII are jointly developing animal-testing-free bioinformatics techniques for assessing the allergy potential of proteins from their amino acid sequence and tertiary structure. Our new method (AllerCatPro, Figure 1) assesses the allergenicity potential of proteins with a 37-fold increase in specificity at 100% sensitivity. The method can be used to support risk assessment of using proteins (e.g. from plant material) in consumer care products. This reduces the risk of failures in product development and increases safety for consumers. Our joint interest is to make the method widely available to facilitate acceptance by the scientific community and regulators. Going forward, our team, together with A*STAR’s Innovations in Food and Chemical Safety (IFCS) Programme and the new Singapore Institute of Food and Biotechnology Innovation (SIFBI), plans to apply AllerCatPro to the safety assessment of proteins found in novel foods, such as those replacing meat with alternative protein sources. By getting regulatory bodies such as the Singapore Food Agency and companies in the food and nutrition sector on board, we hope that AllerCatPro will contribute towards Singapore’s vision of ensuring national food security and safety.

In other notable projects published in 2019, we found that knockout of the non-essential gene SUGCT creates a diet-linked, age-related microbiome imbalance with a diabetes-like metabolic syndrome phenotype, and that human missense variants can affect the function of disease-relevant proteins through loss and gain of peroxisomal targeting motifs. Often in collaboration with industry, we are applying our sequence function and pathway analysis capabilities to support A*STAR’s Natural Product Library and the A*STAR Biotransformation Innovation Platform as well as the Pharma Innovation Programme Singapore. Another direction, in support of A*STAR’s Innovations in Food and Chemical Safety programme, is in silico identification of protein binding targets, pathway analysis and highlighting of common SNPs in the local population that may alter the response to toxic substances.

Figure 1: AllerCatPro workflow, search methods and databases.
(A) Decision workflow of AllerCatPro from the query protein to the results of either strong, weak or no evidence for allergenic potential.
(S1-S5) Search methods utilized at different stages of the workflow.
(D1-D3) Databases created and used for the searches in the workflow.


Executive Director MAURER-STROH Sebastian
Research Scientist LIMVIPHUVADH Vachiranee
Senior Post-Doctoral Research Fellow KENANOV Dimitar
Senior Post-Doctoral Research Fellow HO Wei-Hao Joses
Senior Post-Doctoral Research Fellow MAK Tze Minn Sandy
Post-Doctoral Research Fellow CHONG Cheng Shoong Ken
Research Manager LEE Tze Chuan Raphael
Senior Research Officer XU Yani Angela
Research Officer MIYAJIMA Jhoann
Research Officer CHEW Yi Hong
Research Officer MAKHEJA Meera


Selected Publications