Workshop on Advanced Techniques for Handling Imbalanced and Unlabelled Data for Prognostic Health Management

Date: 13 Oct 2011

Venue: Auditorium

Efficiency and effectiveness in condition-based maintenance are crucial for ensuring reliability, availability, quality and safety in the aerospace industry. Handling imbalanced and unlabelled data for classification is a crucial problem in condition-based maintenance. The objective of this workshop is to share with Aerospace Programme (AP) members advanced data mining and computational intelligence techniques for handling imbalanced and unlabelled data in fault/failure classification, detection, diagnostics and prognostics for aerospace applications. The topics include an overview of prognostic health management and its new trends, and advanced data mining techniques for clustering and classifying imbalanced and unlabelled data.

1.00pm    Registration

1.15pm    Introduction by Project PI, I2R

1.30pm    Operational data Processing and Techniques for Innovative MAINTenance (OPTIMAINT) - Towards Condition-Based Maintenance by Dr David Woon,
                  Head of Operations and Project Leader, Intelligent & Semantic System Team, EADS Innovation Works Singapore

2.00pm    Intelligent Approaches for Prognostic Health Management: Overview and New Trends by Dr Rafael Gouriveau, Associate Professor, École
                  Nationale Supérieure de Mécanique et des Microtechniques (ENSMM)

2.30pm    Data Mining Techniques for Handling Imbalanced and Unlabelled Data 
                  (a) Class-Imbalance & Semi-Supervised Learning for Anomaly Detection by Dr Li Xiang, Research Scientist / Mr Mark Ching, Research Engineer, 
2.50pm   (b) Research on Oversampling for Imbalanced Time Series Classification and Positive Unlabelled data classification by Dr Xiaoli Li, Research
                       Scientist / Dr Cao Hong, Research Scientist, I2R

3.10pm     Coffee Break

3.30pm     Roundtable Discussion

5.00pm     End

Operational data Processing and Techniques for Innovative MAINTenance (OPTIMAINT) - Towards Condition-Based Maintenance
The road to Condition-Based Maintenance (CBM) in the aerospace industry is a long and arduous one. To accelerate the journey, a collaborative research program called OPTIMAINT (Operational data Processing and Techniques for Innovative MAINTenance) was established in Singapore in 2008. The project aims to demonstrate the ability of HUMS (Health and Usage Monitoring Systems) to cover all degradation modes for selected components. This talk covers the key steps of the project and shares some results obtained from historical HUMS data through the use of statistical and data mining techniques.

Intelligent Approaches for Prognostic Health Management: Overview and New Trends
In light of the benefits it can bring, industry is showing a growing interest in Prognostics and Health Management (PHM) methodology, tools and devices, and the topic has become a major research area. The estimation of the remaining useful life (RUL) of a system is now considered key to increasing the reliability, availability and security of equipment while reducing costs and risks. This raises a major challenge: transforming raw data gathered on the system into information that reveals its current and future health. The task is hard to perform because system behaviour is highly nonlinear, useful data and information can be ambiguous or partial, and the underlying uncertainty is hard to model and manage. As a result, intelligent techniques for PHM are increasingly being developed and applied. The aim of this talk is to give an overview of intelligent approaches for PHM and to discuss new research challenges.

Data Mining Techniques for Handling Imbalanced and Unlabelled Data:
(a) Class-Imbalance & Semi-Supervised Learning for Anomaly Detection
Handling imbalanced and unlabelled data for anomaly detection is a crucial problem in many applications in the aerospace industry. This project addresses it through two relatively new research areas in data mining: Class-Imbalanced (CI) learning and Semi-Supervised (SS) learning. Both problems arise in many real-world applications, mainly classification tasks such as medical diagnosis and fraud detection, but are often overlooked by researchers because the underlying problem is poorly understood. In this talk, we will introduce the concepts of class-imbalanced and semi-supervised learning, discuss the challenges in each research area, and give an overview of the current state-of-the-art methodologies.

(b) Research on Oversampling for Imbalanced Time Series Classification and Positive Unlabelled data classification
We will first survey the state of the art in imbalanced data classification. Then, we introduce our proposed structure preserving oversampling (SPO) technique for classifying imbalanced time series data. SPO generates synthetic minority samples from a multivariate Gaussian distribution by estimating the covariance structure of the minority class and regularizing the unreliable eigenspectrum. Our experimental results show that SPO can achieve better performance than existing oversampling methods and state-of-the-art methods in time series classification. The paper based on the SPO technique was recently accepted by the leading data mining conference ICDM (IEEE International Conference on Data Mining). Finally, we will also share our past experience with positive unlabelled learning, which could be useful for the second stage of the underlying project as it can exploit unlabelled data to enhance classification.
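The general idea behind Gaussian oversampling with a regularized eigenspectrum can be sketched as follows. This is only an illustrative sketch and not the SPO algorithm itself: the function name `oversample_minority`, the `n_reliable` cut-off, and the choice to replace the tail eigenvalues with their mean are all assumptions made for this example.

```python
import numpy as np


def oversample_minority(minority, n_new, n_reliable, seed=0):
    """Draw synthetic minority samples from a Gaussian with a regularized covariance.

    minority   : array of shape (n_samples, n_features), with n_features >= 2
    n_new      : number of synthetic samples to generate
    n_reliable : hypothetical cut-off; the n_reliable largest eigenvalues are kept as-is
    """
    minority = np.asarray(minority, dtype=float)
    n_features = minority.shape[1]
    if not 1 <= n_reliable < n_features:
        raise ValueError("n_reliable must satisfy 1 <= n_reliable < n_features")

    # Estimate the mean and covariance structure of the minority class.
    mean = minority.mean(axis=0)
    cov = np.cov(minority, rowvar=False)

    # eigh returns eigenvalues in ascending order; flip to descending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    eigvals = np.clip(eigvals[::-1], 0.0, None)  # remove tiny negative round-off
    eigvecs = eigvecs[:, ::-1]

    # Assumed regularization (not taken from the SPO paper): the trailing
    # eigenvalues are poorly estimated from few samples, so replace them
    # with their average, which keeps the total variance unchanged.
    regularized = eigvals.copy()
    regularized[n_reliable:] = eigvals[n_reliable:].mean()

    # Sample x = mean + V * sqrt(Lambda) * z, with z ~ N(0, I).
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n_new, n_features))
    return mean + (z * np.sqrt(regularized)) @ eigvecs.T


# Toy usage: 10 minority series of length 20, oversampled to 50 synthetic series.
toy_rng = np.random.default_rng(1)
toy_minority = toy_rng.normal(size=(10, 20))
synthetic = oversample_minority(toy_minority, n_new=50, n_reliable=3)
print(synthetic.shape)  # (50, 20)
```

With fewer samples than features, as in the toy usage, the sample covariance is rank-deficient, so some form of eigenvalue regularization is needed before sampling can cover all directions of the feature space.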

About the Speakers
Dr David Woon is Head of Operations of EADS Innovation Works Singapore (IW-SG). He is also the Project Leader of the Data Mining team in IW-SG. Prior to joining EADS, he served as a Senior Officer in the Science and Engineering Research Council (SERC) of the Agency for Science, Technology & Research (A*STAR). He graduated from the Nanyang Technological University (NTU) in Singapore in 2000 with a bachelor's degree in Applied Science with Honours (First Class). He went on to obtain a Ph.D. in Computer Engineering from NTU in 2004. His research interests include association rule mining, cluster analysis and Bayesian networks. He has published several papers in journals and conferences including the IEEE Transactions on Knowledge and Data Engineering and the ACM Conference on Information and Knowledge Management.

Dr Rafael Gouriveau received his engineering degree from the National Engineering School of Tarbes (ENIT) in 1999. He then obtained his MS (2000) and his Ph.D. in Industrial Systems (2003), both from the Toulouse National Polytechnic Institute (INPT). During his Ph.D., he worked in the field of risk management and dependability analysis. In September 2005, he joined the École Nationale Supérieure de Mécanique et des Microtechniques (ENSMM) in Besançon as Associate Professor. His main teaching activities cover production, maintenance, manufacturing and informatics. His current research interests concern the development of industrial prognostics systems using connectionist approaches such as neuro-fuzzy methods, and the investigation of reliability modeling using possibility theory.

Dr Li Xiang is a Research Scientist in the Manufacturing Execution and Control group, Singapore Institute of Manufacturing Technology, A*STAR. She has more than 15 years of experience in research on artificial intelligence, data mining, machine learning and statistical analysis, including neural networks, fuzzy logic systems, unsupervised data clustering and regression modeling. Together with NTU, she has been jointly awarded several grants under the A*STAR SERC. She has led several projects to develop intelligent systems for tool condition monitoring, equipment/process performance prediction, multivariate anomaly detection for product quality control, manufacturing execution & decision support systems, data warehousing, intelligent optimization & simulation, web-based intelligent forecasting, and discovering customer demand for new product design. Her research interests include data mining and knowledge discovery, decision support systems, and prognostic health management. She is a member of the Decision Sciences Institute, USA, and a member of IEEE.

Dr Xiaoli Li is a Principal Investigator in the data mining department, Institute for Infocomm Research, A*STAR. He also holds an adjunct assistant professorship at Nanyang Technological University (School of Computer Engineering). He has published more than 70 papers in the leading data mining and machine learning conferences, such as ICML, IJCAI, AAAI, KDD, ICDM and SDM. One research area he has focused on in particular is Positive Unlabelled based machine learning, and a few of his papers have been cited more than 100 times. Xiaoli has served as a PC member or session chair for all the top data mining conferences. He also serves as Editor-in-Chief or editorial board member for a few international journals. He is a member of IEEE and ACM. In 2005, he won the Best Paper Award of the 16th International Conference on Genome Informatics. In 2011, he won the Best Paper Runner-Up Award of the 16th International Conference on Database Systems for Advanced Applications.

Registration is open to Aerospace Programme (AP) members only. Please register online to reserve a seat for this free event.

Contact Us
For technical enquiries: Li Xiang, Email:
For general enquiries: Melissa Loh, Email: