The Computer Vision and Pattern Discovery for BioImages group uses advanced computer vision, machine learning and mathematical models to build better machines for the improvement of health care and the discovery of biological knowledge. The group analyses images of tissues, histological slides, radiology images and 2D/3D live-cell assays. These images are acquired using a wide variety of imaging devices.
Heart disease is the number one cause of death in the world, and early detection is key to treatment. Echocardiography is the most widely used tool for detecting heart disease. However, it has several disadvantages. Firstly, image analysis is manual and takes up to 30 minutes per patient. Secondly, sonographer shortages are common and manual results vary widely. Lastly, the hardware and software needed are expensive, costing approximately $200k. We aim to eliminate the manual processes and the expensive hardware used by doctors by developing intelligent software that can review echo results to determine whether a patient has heart disease, while giving doctors the option to review why our system reached its decisions. We have unique access to proprietary data for training our deep learning models, clinical outcome data, in-house clinical expertise, and proprietary image processing and DICOM workflow techniques.
Coronary angiography is the gold standard imaging technique for visualizing the coronary arteries; it aids in diagnosing coronary artery disease and guiding patient management. Iodine-based contrast is injected into the coronary arteries and multiple moving X-ray images are acquired from different view angles around the patient's torso. Cardiologists are trained to interpret the coronary angiogram, but this takes time and there may be interobserver disagreement. In a new collaboration with the National Heart Centre Singapore, we are exploring artificial intelligence approaches to analyzing X-ray video sequences with the goal of developing a quantitative assessment tool for repeatable and objective angiographic measurements.
According to the World Health Organization, cancer is one of the major causes of death globally, and it is estimated to have been responsible for 9.6 million deaths in 2018. This highly deadly disease starts in one cell or a small group of cells that acquire mutations in their genetic material and become abnormal. These abnormal cells then start to grow in an uncontrolled manner, which is what is called cancer, and come together to form tumors. Because cancer is a reiterative evolutionary process and abnormal cells are susceptible to further mutations during their lifetime, tumors are composed of groups of abnormal cells with different genetic material, and therefore different biological capabilities. This is called intra-tumor heterogeneity. It results in therapeutic failure and drug resistance, and is therefore one of the key difficulties in cancer treatment.
We are developing deep learning models that analyze histopathology images to predict intra-tumor heterogeneity and reveal the histological features behind it. We aim to support medical professionals in diagnosis, treatment planning, medication management and precision medicine for cancer, in order to better address increased healthcare demands in the future.
Whole slide imaging (WSI) has increased the availability of data to scientists and physicians for research, training, and more accurate diagnosis. With the help of this technology, more complex analysis of the tissue volume is now possible, such as 3D analysis of the vascular network in the tissue volume of interest. A 3D analysis, however, requires reconstruction of the tissue volume from the acquired images. This task is not trivial, as the process of cutting the tissue volume into thin slices and mounting them on glass slides may impose different deformations on each individual slice. The majority of proposed WSI registration algorithms register the whole tissue slice across consecutive glass slides. In such algorithms, the registration results are not always satisfactory because the registration is adversely affected by the deformed tissue regions. As a result, regional registration is found to be more effective. In order to perform an accurate regional registration, rough registration of the consecutive tissue slides is crucial. We propose a robust algorithm for rough registration of whole slide images. We show that using our registration algorithm followed by a regional registration provides accurate and more robust registration results.
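To make the idea of a rough registration step concrete, the following is a minimal sketch and not the group's algorithm, which is not described here. It uses phase correlation, a standard technique swapped in for illustration, to estimate a whole-pixel translation between two consecutive slices. It assumes both slices are available as same-sized grayscale arrays (for example, low-resolution thumbnails), and it ignores rotation and tissue deformation, which a real pipeline would have to handle. The function name `estimate_translation` is hypothetical.

```python
import numpy as np


def estimate_translation(fixed, moving):
    """Estimate the integer (dy, dx) shift that aligns `moving` with `fixed`.

    Illustrative stand-in for a rough registration step: phase correlation
    recovers translation only (with wrap-around), not rotation or deformation.
    """
    fixed = np.asarray(fixed, dtype=float)
    moving = np.asarray(moving, dtype=float)
    if fixed.shape != moving.shape or fixed.ndim != 2:
        raise ValueError("both inputs must be 2D arrays of the same shape")

    # Normalised cross-power spectrum keeps only the phase difference.
    cross_power = np.fft.fft2(fixed) * np.conj(np.fft.fft2(moving))
    cross_power /= np.abs(cross_power) + 1e-12

    # The inverse transform peaks at the displacement between the images.
    correlation = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)

    # Map peak indices past the midpoint to negative shifts.
    shift = [p if p <= n // 2 else p - n for p, n in zip(peak, correlation.shape)]
    return int(shift[0]), int(shift[1])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fixed = rng.random((64, 64))                    # synthetic stand-in for a slice
    moving = np.roll(fixed, (-5, 3), axis=(0, 1))   # simulated misalignment
    dy, dx = estimate_translation(fixed, moving)
    aligned = np.roll(moving, (dy, dx), axis=(0, 1))
    print(dy, dx, np.allclose(aligned, fixed))
```

In a workflow like the one described, a coarse estimate of this kind would only provide the starting point; the regional registration would then refine the alignment within each region.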
Acne vulgaris is one of the most common skin diseases afflicting humanity. It is caused by overactive sebaceous glands and is clinically characterized by comedones, papules, pustules, nodules and, in some cases, scarring. Grading is a subjective method for determining the severity of acne, based on observing the dominant lesions, evaluating the presence or absence of inflammation and estimating the extent of involvement. The Investigator Global Assessment (IGA) score is normally used to classify severity on a scale from 0 to 4, where 0 is the lowest and 4 is the highest, according to the types of acne lesions present and their density in the region of interest, i.e., the facial or truncal view.
In the process of acne grading, doctors observe and count the different types of acne lesions to evaluate the presence or absence of inflammation. This screening process is tedious and time-consuming, which can lead to a high number of false positives. Therefore, an automated acne grading system is needed to help dermatologists and skin specialists with screening, both before and after treatment. In this project, we work with dermatologists from the National Skin Center to develop an automated acne grading system that uses deep learning architectures to assign a given image an IGA score from 0 to 4 based on the level of acne severity.
In our current dataset, we have a class imbalance problem; therefore, we convert the five-class problem (IGA score 0-4) into a three-class problem (low, medium and high), where IGA scores 0-1 are considered low, IGA score 2 is considered medium and IGA scores 3-4 are considered high. We then designed and trained a CNN model for this three-class classification problem. Our model achieves a classification accuracy of 70%, 69% and 62% for low (Class A), medium (Class B) and high (Class C), respectively, as shown in the confusion matrix above.
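The label grouping described above can be sketched in a few lines of Python. This is an illustration only: the function names `iga_to_class` and `per_class_accuracy` are hypothetical, the confusion matrix counts are made up for the example and are not the project's results, and per-class accuracy is assumed here to mean the fraction of images of each true class that were predicted correctly.

```python
import numpy as np


def iga_to_class(iga_score):
    """Map an IGA score (0-4) to the three-class label described in the text."""
    if iga_score not in (0, 1, 2, 3, 4):
        raise ValueError("IGA score must be an integer from 0 to 4")
    if iga_score <= 1:
        return "low"       # IGA 0-1
    if iga_score == 2:
        return "medium"    # IGA 2
    return "high"          # IGA 3-4


def per_class_accuracy(confusion):
    """Per-class accuracy from a confusion matrix.

    Assumes rows are true classes and columns are predicted classes, so the
    value for each class is its diagonal count divided by its row total.
    """
    confusion = np.asarray(confusion, dtype=float)
    return np.diag(confusion) / confusion.sum(axis=1)


if __name__ == "__main__":
    print([iga_to_class(s) for s in range(5)])
    # Hypothetical counts, ordered low, medium, high.
    example = [[7, 2, 1],
               [1, 8, 1],
               [2, 2, 6]]
    print(per_class_accuracy(example))
```

Merging neighbouring grades this way trades resolution for more examples per class, which is one common response to class imbalance.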
Deep learning methods have shown superior performance in many machine learning applications due to their ability to model high-level abstractions in the data. However, conventional neural networks are not efficient for regression on distributions. Since each node in a neural network encodes just a real value, the network is unable to encode distributions compactly, resulting in many parameters. To that end, we propose a novel network which generalizes the neural network structure by encoding an entire probability distribution in each node. Our network, called distribution regression network (DRN), exhibits non-linear level sets in the transformation in each node, increasing the degree of non-linearity in each layer. On several real-world datasets for distribution-to-distribution regression, DRN achieves higher accuracies than conventional neural networks while using far fewer parameters.
As an extension, we used DRN for the more complex task of performing forward prediction on time sequences of distributions. We also studied how the test accuracy varies with the size of the training set, since the amount of data may be limited given the seasonality of time series data. We found that DRN requires two to five times less training data than neural networks to achieve similar accuracies. However, DRN is a feedforward network and does not explicitly model time dependencies. Hence, we propose a new recurrent architecture for DRN, named recurrent distribution regression network (RDRN). RDRN and DRN outperform neural network models, with RDRN achieving similar or better accuracies than DRN. Given the importance of distributions in capturing characteristics of a population and the effects of noise, DRN and RDRN have applications in a wide range of fields such as human population studies, bioinformatics and finance.
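The forward-prediction setting can be illustrated with a small data-preparation sketch. It does not implement DRN or RDRN. It assumes, purely for illustration, that each distribution in the sequence has been discretized into a normalized histogram, and it builds (past window, next distribution) training pairs. The function name `make_windows` and the window length are hypothetical choices.

```python
import numpy as np


def make_windows(sequence, window):
    """Build forward-prediction pairs from a time sequence of distributions.

    sequence: array of shape (T, n_bins), one discretized distribution per step.
    window:   number of past steps used to predict the next step.
    Returns inputs of shape (T - window, window, n_bins) and
    targets of shape (T - window, n_bins).
    """
    sequence = np.asarray(sequence, dtype=float)
    if sequence.ndim != 2:
        raise ValueError("sequence must have shape (T, n_bins)")
    if window < 1 or window >= len(sequence):
        raise ValueError("window must be at least 1 and shorter than the sequence")

    n_pairs = len(sequence) - window
    inputs = np.stack([sequence[t:t + window] for t in range(n_pairs)])
    targets = sequence[window:]
    return inputs, targets


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    raw = rng.random((10, 4))
    histograms = raw / raw.sum(axis=1, keepdims=True)  # each row sums to 1
    inputs, targets = make_windows(histograms, window=3)
    print(inputs.shape, targets.shape)
```

A sequence of T steps yields only T - window pairs, which shows why the amount of training data can be small for time series.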
Lee Hwee Kuan is a Senior Principal Investigator in the Imaging Informatics division at the Bioinformatics Institute. His current research involves developing computer vision algorithms for clinical and biological studies. Hwee Kuan obtained his Ph.D. in Theoretical Physics from Carnegie Mellon University in 2001, with a thesis on liquid-liquid phase transitions and quasicrystals. He then held a joint postdoctoral position with Oak Ridge National Laboratory (USA) and the University of Georgia, where he worked on developing advanced Monte Carlo methods and on nano-magnetism. In 2003, with an award from the Japan Society for the Promotion of Science, Hwee Kuan moved to Tokyo Metropolitan University, where he developed solutions to problems with extremely long time scales and a reweighting method for nonequilibrium systems. In 2005 he returned home to join the Data Storage Institute, investigating novel recording methods such as hard disk recording via magnetic resonance. In 2006, he joined the Bioinformatics Institute as a Principal Investigator in the Imaging Informatics Division.
Lee Hwee Kuan's current research focuses on the analysis of tissue, histological and cellular images. These images are obtained from light microscopy, including image data sets from high-throughput screens.