BioMedical Data Architecture & Repository



BioMed DAR aims to serve as a data play space for the long-term hosting and value creation of programmatic research data from multiple health clusters (SingHealth, NUHS, NHG) and national platforms (NCTP, STCC). To this end, our unit is set up as an SSSO (Standard Systems Support Office) that provides clinical research data management support and a governance framework to strategic partners. Briefly, DAR manages the data life cycle of the cohort datasets through three functions: data governance, data management operations (cleaning, integration, cataloguing, movement, usage) and software development. DAR builds and uses its own in-house data warehouse, data catalogue, data visualizers and secured data exchange API libraries for secured data transactions. Building software in-house allows us to adapt and integrate data management or governance workflows into our systems easily.

On data security (see Figure 1), it is important to note that DAR does not store identifiable datasets and works with clinical data stakeholders to import only de-identified data into BII. DAR operates within the BII ISO27001-certified environment, which hosts its data warehousing, ETL processes and data sandboxes, and where strict SOPs are in place to track data movement. Physically, the ISO27001 servers are located in designated server rooms where staff movement is logged and captured on CCTV. As an additional measure, the ISO27001 servers are caged to further deter physical access to the disk drives.

Figure 1. DAR – Data architecture

On data governance (see Figure 2), access to each cohort dataset must be approved by the respective cohort’s DAC (data access committee). The project proposal must clearly define the purpose, scope and duration of the data usage and is then subjected to a DAC review. Only upon DAC approval is the scoped dataset ETLed (extracted, transformed and loaded) from DAR’s data warehouse to a designated VPN sandbox environment for its intended usage. Generally, no export of data is allowed, and analysis results need further approval by a data concierge team. These processes are collectively undergoing ISO38505 data governance certification. Together with ISO27001, ISO38505 gives our health cluster stakeholders compliance assurance that sensitive healthcare data is used in a regulated and secure manner.


Figure 2. Data governance
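
The access gate described above can be illustrated with a minimal sketch. It is not DAR's actual implementation: the `AccessRequest` class, its fields and the `release_to_sandbox` function are hypothetical names chosen for illustration, and the data warehouse is reduced to a set of column names. The sketch only shows the ordering of checks stated in the text, that is, DAC approval first, then the approved duration, then restriction to the approved scope.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AccessRequest:
    """A project proposal submitted to a cohort's DAC (hypothetical shape)."""
    cohort: str
    purpose: str
    scope: frozenset      # variables requested in the proposal
    start: date           # first day of the approved usage period
    end: date             # last day of the approved usage period
    dac_approved: bool = False


def release_to_sandbox(request, warehouse_columns, today):
    """Return the sorted list of columns that may be loaded into the sandbox.

    Raises PermissionError if the request has no DAC approval or if
    `today` falls outside the approved duration.
    """
    if not request.dac_approved:
        raise PermissionError("DAC approval is required before any ETL")
    if not (request.start <= today <= request.end):
        raise PermissionError("request is outside its approved duration")
    # Only variables that are both requested and present are released.
    return sorted(request.scope & set(warehouse_columns))
```

Under these assumptions, an approved request for `age` and `hba1c` against a warehouse that also holds `bmi` would release only the two requested columns, while an unapproved or expired request would be refused before any data moves.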


 Head of BioMedical Data Architecture & Repository WONG Wing Cheong
 Assistant Principal Investigator TAN Ming Zhen
 Project Manager FUN Max
 Senior Research Officer II LIM Aloysius
 Research Officer KOH Ziying
 Research Officer DONG Jiahui


Selected Publications