Natural Organism Library

Natural Product Discovery Platform


There is a perpetual demand for new drugs that provide better therapeutic outcomes or overcome emerging drug resistance. While combinatorial chemistry approaches based on pharmacophore fragments have yet to mature enough to meet the high expectations placed on them for lead discovery, the structural diversity of natural products greatly expands the chemical search space beyond that of traditional compound libraries. Furthermore, more than a third of recently approved medicines are natural products or have been derived from lead compounds found in living organisms. Products derived from natural organisms are also more likely to appeal to ecologically conscious customers, especially in the food and personal care sectors, and natural products are often perceived as safer than chemically synthesized compounds. The A*STAR Natural Product Library (NPL), with ~160,000 plant, fungal and bacterial specimens, is one of the largest in the world (Figure 1) and a prime resource for academic, industry and government research organizations looking for natural active ingredients.

The NPL has been developed over the past ~20 years through collection from targeted local habitats, diverse international collaborations and strategic acquisitions. The genetic diversity within the NPL is exceptional. With 57% of all known cultured fungal genera, over 67% of the world’s plant families and 70% of filamentous bacterial genera represented, the collection has been described as “the most diverse and comprehensive collection of plant and microbial samples in the world” (Prof Geoffrey A. Cordell, University of Illinois). In 2014, the Bioinformatics Institute (BII) was entrusted by the Agency for Science, Technology and Research (A*STAR) to house the NPL, which had previously been acquired from MerLion Pharmaceuticals Pte Ltd. We offer more than our collections of organisms, extracts and isolated natural compounds and the accompanying electronic databases. In contrast to NPLs elsewhere, we have an in-house Natural Product Discovery Platform (NPDP) that provides expertise in biological high-throughput screening (HTS), genomics, analytical chemistry, natural product research, synthetic biology and bioinformatics. For mining the collection, we use (i) traditional biological HTS as well as alternative (ii) chemical, (iii) genomic and (iv) in silico screening approaches.

Biological Screening: Extracts are screened against biochemical or cellular assays to identify biological activities of interest. Once active extracts have been found, the active principles are purified by bioassay-guided compound isolation. While HTS is well entrenched in drug lead discovery and, perhaps, the agrochemical industry, it has not yet been as widely used in the food and consumer care space. The HTS team at BII has accumulated experience from around 200 screening campaigns involving biochemical assays as well as microbial and mammalian cellular assays; 10% of these screens were aimed at discovering bioactive ingredients for food and consumer care applications. From these screening efforts we have isolated more than 2,000 bioactive compounds from the NPL (Figure 2A). Screening can be carried out over the whole collection or over selected subsections, such as plants, fungi or microbes from pre-defined habitats, or samples with certain known uses, such as edible plants or herbs and fungi used in Traditional Chinese Medicine. Our plant samples are usage-annotated (traditional and modern uses of the individual plants and of products derived from them), and this information was collated from several web-based plant-related databases. Although scientific evidence for some of the claims is lacking, such information offers an opportunity to target certain plants for further investigation. In two recent screening campaigns that used only edible plants to search for natural functional ingredients for food applications, we identified several active plants for our industry partners, and a clinical trial is planned for one active plant in preparation for product development.
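Selecting a usage-annotated subsection for a focused screen can be pictured as a simple filter over sample records. The sketch below is illustrative only: the `Sample` record, its field names, the tag vocabulary and the sample IDs are assumptions made for this example and do not reflect the actual schema of the NPL database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    sample_id: str          # hypothetical identifier
    kingdom: str            # "plant", "fungus" or "bacterium"
    usage_tags: frozenset   # hypothetical usage annotations, e.g. {"edible"}


def select_subset(samples, kingdom=None, required_tags=()):
    """Return IDs of samples matching the kingdom and carrying all required tags."""
    required = frozenset(required_tags)
    return [
        s.sample_id
        for s in samples
        if (kingdom is None or s.kingdom == kingdom) and required <= s.usage_tags
    ]


# Made-up records for illustration only.
library = [
    Sample("P-0001", "plant", frozenset({"edible", "tcm"})),
    Sample("P-0002", "plant", frozenset({"ornamental"})),
    Sample("F-0001", "fungus", frozenset({"tcm"})),
]

edible_plants = select_subset(library, kingdom="plant", required_tags=["edible"])
tcm_samples = select_subset(library, required_tags=["tcm"])
```

Requiring all tags (set inclusion) rather than any tag keeps a focused campaign narrow; a looser "any tag" filter would be an equally plausible choice for exploratory screens.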

Figure 1: Composition of the A*STAR Natural Product Library

Chemical Screening Option: Approximately 25% of the crude extract library (~76,000 extracts; Figure 2B) has been chemically fingerprinted using an automated low-resolution liquid chromatography-mass spectrometry (UPLC-MS) system. The resulting fingerprinting database (FPD) supports analyses of production yield, similarity and chemical diversity. The FPD can be mined for new producers of a compound by first analyzing the target compound as a standard on the UPLC-MS system and then querying the FPD for samples that contain compounds with similar retention times and MS profiles. Hit extracts are then validated for the presence of the target molecule by HPLC-MS. Using this approach we have identified two new producers of a recently reported antibacterial compound, anthracimycin. Large-scale compound isolation from these two new producers yielded new anthracimycin analogues (manuscript in preparation). We also found new producers of other antimicrobial compounds that we had isolated in our HTS campaigns. Hence, chemical screening can be used to find new analogues, chemical derivatives and more efficient producers of novel bioactive compounds.
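The FPD query described above amounts to matching the peaks of a standard against each extract's fingerprint within tolerance windows. The following is a minimal sketch under stated assumptions: the `Peak` record, the in-memory dictionary standing in for the database, the tolerance values and all numbers are hypothetical and chosen only for illustration; the actual FPD and its matching criteria may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peak:
    retention_time_min: float  # retention time in minutes
    mz: float                  # mass-to-charge ratio from low-resolution MS


def peak_matches(standard, peak, rt_tol, mz_tol):
    """True if a fingerprint peak lies within both tolerance windows of the standard."""
    return (
        abs(peak.retention_time_min - standard.retention_time_min) <= rt_tol
        and abs(peak.mz - standard.mz) <= mz_tol
    )


def query_fingerprints(fingerprints, standard_peaks, rt_tol=0.1, mz_tol=0.5):
    """Return IDs of extracts whose fingerprint contains every peak of the standard."""
    hits = []
    for extract_id, peaks in fingerprints.items():
        if all(
            any(peak_matches(s, p, rt_tol, mz_tol) for p in peaks)
            for s in standard_peaks
        ):
            hits.append(extract_id)
    return sorted(hits)


# Made-up fingerprints and standard, for illustration only.
fingerprints = {
    "EXT-001": [Peak(5.42, 397.2), Peak(7.10, 512.3)],
    "EXT-002": [Peak(5.90, 397.2)],   # right mass, wrong retention time
    "EXT-003": [Peak(5.45, 397.4)],
}
standard = [Peak(5.40, 397.2)]

hits = query_fingerprints(fingerprints, standard)
```

Extracts returned by such a query are only candidates; as in the text, they would still need confirmation of the target molecule by HPLC-MS.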

Figure 2: Isolated compounds and Metabolite Fingerprint Database of NPL

Genomic and in silico Screening Options: Many biosynthetic gene clusters are not readily expressed when the organisms are grown under typical laboratory conditions. Thus, despite decades of fermentation-based screening, many bioactive compounds remain to be discovered. Augmenting the information already available in the collection’s in-house database with genetic sequences, protein annotations, genetic regulatory maps and high-resolution chemical indexing fingerprints (i.e., creating an in silico NPL similar to NPCARE) greatly increases its utility as an application platform. In silico screening will increasingly become the first step in a natural product research project: annotations of genomes, proteins, metabolites and other features collated in the in-house database that characterizes the collection are examined first, and the results guide the planning of follow-up experiments. To this end, we have accumulated genome sequence data for more than 150 microbial strains from the NPL, and the sequencing of at least another 2,000 strains is in progress. From the sequenced genomes, putative biosynthetic gene clusters (BGCs) for secondary metabolites were identified using sequence annotation tools. Some of the identified clusters have been mapped to isolated secondary metabolites. We collaborated with other researchers to validate the identified biosynthetic clusters using various approaches, such as gene knock-in/knock-out experiments and transcriptomics analysis. We maintain a continuous workflow of genome sequencing of prioritized microbes, identification of biosynthetic clusters, isolation of secondary metabolites, and mapping and validation of compound-to-cluster assignments. As a proof of concept, we used the published information on the biosynthetic gene cluster of anthracimycin to mine publicly available bacterial genomes and found a new non-Streptomyces producer of this antibacterial compound. Experimental validation resulted in the discovery of a new analogue of anthracimycin.

Chemogenomic profiling in yeast (Saccharomyces cerevisiae) can be used to determine the biological targets of bioactive compounds, understand their mechanism of action and unravel the functions of poorly understood genes. We use haploinsufficiency profiling (HIP), homozygous deletion profiling (HOP) and multi-copy suppression profiling (MSP) approaches for this purpose. Together with various industrial and academic partners, we have performed dozens of chemogenomic profiling screens and found new protein targets for bioactive natural compounds. Our work on a new antifungal compound, hypoculoside, has recently been published (reference 10).
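In HIP and HOP screens, strains that become depleted in the presence of a compound point to candidate targets or pathways. One generic way to score this, shown here as a sketch and not as the platform's actual pipeline, is a log2 ratio of normalized strain abundance (control over treated) followed by a robust z-score. The strain names, counts and pseudocount below are made up for illustration.

```python
import math
from statistics import median


def fitness_defect_scores(control_counts, treated_counts, pseudocount=1.0):
    """log2(control / treated) of normalized abundance; positive means depleted by the compound."""
    control_total = sum(control_counts.values())
    treated_total = sum(treated_counts.values())
    scores = {}
    for strain, control in control_counts.items():
        treated = treated_counts.get(strain, 0)
        control_frac = (control + pseudocount) / control_total
        treated_frac = (treated + pseudocount) / treated_total
        scores[strain] = math.log2(control_frac / treated_frac)
    return scores


def robust_z(scores):
    """Center on the median and scale by 1.4826 * MAD (median absolute deviation)."""
    values = list(scores.values())
    center = median(values)
    mad = median(abs(v - center) for v in values)
    scale = 1.4826 * mad
    if scale == 0:
        # No spread among strains: nothing stands out.
        return {strain: 0.0 for strain in scores}
    return {strain: (v - center) / scale for strain, v in scores.items()}


# Hypothetical barcode counts per deletion strain, for illustration only.
control = {"strain_A": 1000, "strain_B": 1000, "strain_C": 1000,
           "strain_D": 1000, "strain_E": 1000}
treated = {"strain_A": 1000, "strain_B": 950, "strain_C": 1050,
           "strain_D": 100, "strain_E": 1000}

z_scores = robust_z(fitness_defect_scores(control, treated))
most_sensitive = max(z_scores, key=z_scores.get)
```

The median and MAD are used instead of the mean and standard deviation so that a few strongly depleted strains, which are exactly the ones of interest, do not inflate the scale they are measured against.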

Virtual screening of large chemical libraries for compounds that complement targets of known structure can identify leads that are subsequently tested experimentally in molecular binding and biological function assays. Given the limited supply of purified natural products and the increasing number of protein structures being solved, virtual screening is a useful complement to biological screening in natural product research. We tested the merit of this approach by screening our natural product compound set in silico against the bacterial enzyme sortase A and identified two inhibitors that affected S. aureus biofilm formation (manuscript in preparation).

To conclude, BII’s NPDP offers complementary screening approaches for mining the NPL for useful natural substances with a wide range of applications. As the library and the opportunities for its use are vast, we invite interested academic and industrial parties to explore the BII Natural Organism Library as a tool for their research.

– Adapted from our article published in Nature Biotechnology (2018)


1. Harvey AL, Edrada-Ebel R, Quinn RJ. The re-emergence of natural products for drug discovery in the genomics era. Nat. Rev. Drug Discov. 14, 111-129 (2015).
2. Newman DJ, Cragg GM. Drugs and Drug Candidates from Marine Sources: An Assessment of the Current “State of Play”. Planta Med. 82, 775-789 (2016).
3. Jang KH, Nam SJ, Locke JB, Kauffman CA, Beatty DS, Paul LA, Fenical W. Anthracimycin, a potent anthrax antibiotic from a marine-derived actinomycete. Angew. Chem. Int. Ed. Engl. 52, 7822-7824 (2013).
4. Choi H, Cho SY, Pak HJ, Kim Y, Choi JY, Lee YJ, Gong BH, Kang YS, Han T, Choi G, Cho Y, Lee S, Ryoo D, Park H. NPCARE: database of natural products and fractional extracts for cancer regulation. J. Cheminform. 9, 2 (2017).
5. Blin K, Medema MH, Kottmann R, Lee SY, Weber T. The antiSMASH database, a comprehensive database of microbial secondary metabolite biosynthetic gene clusters. Nucleic Acids Res. 45, D555-D559 (2017).
6. Eisenhaber B, Kuchibhatla D, Sherman W, Sirota FL, Berezovsky IN, Wong WC, Eisenhaber F. The Recipe for Protein Sequence-Based Function Prediction and Its Implementation in the ANNOTATOR Software Environment. Methods Mol. Biol. 1415, 477-506 (2016).
7. Medema MH, Osbourn A. Computational genomic identification and functional reconstitution of plant natural product biosynthetic pathways. Nat. Prod. Rep. 33, 951-962 (2016).
8. O'Brien RV, Davis RW, Khosla C, Hillenmeyer ME. Computational identification and analysis of orphan assembly-line polyketide synthases. J. Antibiot. 67, 89-97 (2014).
9. Sirota FL, Goh F, Low KN, Yang LK, Crasta SC, Eisenhaber B, Eisenhaber F, Kanagasundaram Y, Ng SB. Isolation and Identification of Anthracimycin Analogues from Nocardiopsis kunsanensis, a Halophile from a Saltern, by Genomic Mining Strategy. Journal of Genomics 6, 63 (2018).
10. Alfatah M, Wong JH, Nge CE, Kong KW, Low KN, Leong CY, Crasta S, Munusamy M, Chang AML, Hoon S, Ng SB, Kanagasundaram Y, Arumugam P. Hypoculoside, a sphingoid base-like compound from Acremonium disrupts the membrane integrity of yeast cells. Scientific Reports 9, 710 (2019).