BioMed DAR aims to serve as a data play space for the long-term hosting and value creation of programmatic research data from multiple health clusters (SingHealth, NUHS, NHG) and national platforms (NCTP, STCC). As such, our unit is set up as an SSSO (Standard Systems Support Office) that provides clinical research data management support and a governance framework to strategic partners. Briefly, DAR manages the data life cycle of the cohort datasets through three functions: data governance, data management operations (cleaning, integration, cataloguing, movement, usage) and software development. DAR builds and uses its own in-house data warehouse, data catalogue, data visualizers and secured data exchange API libraries for secured data transactions. Building software in-house allows us to adapt and integrate data management or governance workflows into our systems easily.
On data security (see Figure 1), it is important to note that DAR does not store identifiable datasets and works with clinical data stakeholders to import only de-identified data into BII. DAR operates within BII's ISO27001-certified environment, which hosts its data warehousing, ETL processes and data sandboxes, and where strict SOPs are in place to track data movement. Physically, the ISO27001 servers are located in designated server rooms where staff movement is logged and captured on CCTV. As an additional measure, the ISO27001 servers are caged to further deter physical access to the disk drives.
Figure 1. DAR – Data architecture
On data governance (see Figure 2), access to each cohort dataset must be approved by the respective cohort's DAC (data access committee). The project proposal must clearly define the purpose, scope and duration of the data usage and is then subjected to a DAC review. Only upon DAC approval is the scoped dataset ETLed (extracted, transformed and loaded) from DAR's data warehouse into a designated VPN sandbox environment for its intended usage. Generally, no export of data is allowed, and analysis results need further approval by a data concierge team. These processes are collectively undergoing ISO38505 data governance certification. Together with ISO27001, ISO38505 gives our health cluster stakeholders compliance assurance of regulated and secure data play with sensitive healthcare data.
Figure 2. Data governance
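As a purely illustrative aid, the approval-gated flow described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not DAR's actual implementation: the names `AccessRequest` and `scoped_extract`, the field names, and the in-memory `warehouse` dictionary are all hypothetical stand-ins for the real data warehouse and sandbox tooling.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AccessRequest:
    """Hypothetical project proposal submitted to a cohort's DAC."""
    cohort: str              # which cohort dataset is requested
    purpose: str             # stated purpose of the data usage
    variables: frozenset     # scope: the variables (columns) requested
    start: date              # start of the approved usage period
    end: date                # end of the approved usage period
    dac_approved: bool = False


def scoped_extract(warehouse, request, today):
    """Return only the approved variables, ready to load into a sandbox.

    `warehouse` is assumed to map a cohort name to a list of row dicts.
    """
    # Gate 1: nothing moves without DAC approval.
    if not request.dac_approved:
        raise PermissionError("DAC approval is required before any data movement")
    # Gate 2: access is only valid within the approved duration.
    if not (request.start <= today <= request.end):
        raise PermissionError("request is outside its approved duration")
    # Scope: keep only the variables named in the approved proposal.
    rows = warehouse[request.cohort]
    return [{k: v for k, v in row.items() if k in request.variables} for row in rows]
```

The point of the sketch is that approval, duration and scope are all checked before any data leaves the warehouse, so the sandbox only ever receives the subset named in the approved proposal.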
Wong Wing Cheong received his undergraduate degree in Computer Engineering from NTU. He joined A*STAR under the joint MSc in Bioinformatics from A*STAR/NUS and subsequently completed his Dr. rer. nat. (summa cum laude) in Computer Science at Leipzig University. He has been a Principal Investigator with BII since 2014 and has been the Head/Principal Investigator of the BioMed DAR group since 2019.
Dr Tan Ming Zhen graduated from the National University of Singapore in 2012 with a B.Eng. (Hons) and B.Soc.Sci (Hons). He then moved on to the NUS Graduate School for Integrative Sciences to pursue his PhD in Engineering, which he obtained in 2017. During his PhD, he worked on several research projects involving mathematical modelling, image processing, and statistics. After completing his PhD, he joined a local FinTech start-up as the Head of Technology, developing the code base for passive investment. In 2021, Dr Tan Ming Zhen joined the Bioinformatics Institute as Assistant Principal Investigator, where he is currently researching data management support and privacy-protection technologies.
Dr Tan has previously worked on the diffeomorphic metric mapping of brain surfaces and volumes, as well as on maintaining a real-time dashboard for manipulating datasets, building visual data exploration tools, and implementing statistical tests. His current area of study is synthetic data and its potential use as a privacy-protection tool. More specifically, he is researching methods for generating high-fidelity synthetic data from real-world datasets, and hopes to improve their potential as surrogates for code development and research.