Tuesday, July 10, 2018

Knowledge Extraction with Topology-based Clinical Data Mining

Intego Group’s Senior Data Scientist, Andrey Rekalo, spoke on July 10, 2018 at the 29th International Biometric Conference in Barcelona, Spain.

The topic of Andrey’s presentation was Knowledge Extraction with Topology-based Clinical Data Mining.

Clinical data mining refers to the application of data mining methods to clinical data.

Lack of Data Integration and Visualization Tools

While many computational techniques focus on univariate relationships between a specific clinical outcome and a few predictive variables, there is a lack of data integration and visualization tools that can improve our understanding of an entire dataset. Examining clinical data with a focus on a single outcome, in isolation from other factors, may lead to an incomplete or even misleading view of increasingly complex data.

Andrey Rekalo’s Paper

In this paper, we describe a novel topology-based clinical data mining (TCDM) methodology to discover multivariate patterns in clinical trial outcomes. Our approach leverages the benefits of three independent tools:

Multiple Outcomes Analysis

Nonparametric Statistics

Topological Data Analysis

View Presentation from the Conference

TCDM makes it possible to construct comprehensive topological maps of complex data without first having to develop a model or hypothesis.

A topological map provides a compressed, visual representation of a multidimensional set of interrelated clinical outcomes. Such maps help identify and explore subgroups of patients, drawn from a diverse study population, whose responses are similar within each subgroup.
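To make the idea of a topological map more concrete, here is a minimal sketch of a Mapper-style construction, one common technique in topological data analysis. The presentation does not say which algorithm TCDM uses, so the choice of Mapper, the function names (interval_cover, single_linkage, mapper_graph), the parameters and the toy outcome values are all assumptions made for illustration only.

```python
from itertools import combinations
from math import dist


def interval_cover(lo, hi, n_intervals, overlap):
    """Split [lo, hi] into n_intervals equal intervals sharing a fraction `overlap`."""
    length = (hi - lo) / (n_intervals - (n_intervals - 1) * overlap)
    step = length * (1 - overlap)
    return [(lo + i * step, lo + i * step + length) for i in range(n_intervals)]


def single_linkage(indices, points, eps):
    """Cluster the given point indices: points within eps of each other are linked."""
    clusters, unvisited = [], set(indices)
    while unvisited:
        stack = [unvisited.pop()]
        cluster = set(stack)
        while stack:
            i = stack.pop()
            near = {j for j in unvisited if dist(points[i], points[j]) <= eps}
            unvisited -= near
            cluster |= near
            stack.extend(near)
        clusters.append(frozenset(cluster))
    return clusters


def mapper_graph(points, filter_fn, n_intervals=4, overlap=0.5, eps=1.0):
    """Return (nodes, edges): nodes are patient clusters, edges join clusters that share patients."""
    values = [filter_fn(p) for p in points]
    nodes = []
    for a, b in interval_cover(min(values), max(values), n_intervals, overlap):
        # Small tolerance so rounding error does not drop boundary points.
        members = [i for i, v in enumerate(values) if a - 1e-9 <= v <= b + 1e-9]
        nodes.extend(single_linkage(members, points, eps))
    edges = {(u, v) for u, v in combinations(range(len(nodes)), 2) if nodes[u] & nodes[v]}
    return nodes, edges


# Synthetic toy data: each tuple is one patient's pair of outcome measurements.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 5.2), (5.2, 5.1)]
nodes, edges = mapper_graph(points, filter_fn=lambda p: p[0])
for k, node in enumerate(nodes):
    print(f"node {k}: patients {sorted(node)}")
print("edges:", sorted(edges))
```

In this sketch each node is a candidate patient subgroup, and an edge indicates that two subgroups overlap. A real analysis would need a carefully chosen filter function, distance measure and clustering method.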

The well-established techniques of nonparametric statistical analysis are used to find the predictive variables, e.g. patients’ demographic characteristics or medical history, associated with the subgroups.
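As an illustration of how a nonparametric method can link a predictive variable to a subgroup, the sketch below runs a two-sided permutation test on the difference in means. The presentation does not name the specific tests used in TCDM, so the permutation test, the function name permutation_test and the synthetic age values are assumptions chosen only to show the general idea.

```python
import random
from statistics import mean


def permutation_test(subgroup, rest, n_permutations=2000, seed=0):
    """Estimate a two-sided p-value for the difference in means between two groups.

    The group labels are shuffled repeatedly; the p-value is the share of
    shuffles whose absolute mean difference is at least the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(subgroup) - mean(rest))
    pooled = list(subgroup) + list(rest)
    k = len(subgroup)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:k]) - mean(pooled[k:]))
        if diff >= observed - 1e-12:  # tolerance for floating-point rounding
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)


# Synthetic ages (not real trial data): patients inside one subgroup vs. all others.
ages_in_subgroup = [61, 64, 66, 70, 72]
ages_elsewhere = [41, 45, 48, 52, 55]
p_value = permutation_test(ages_in_subgroup, ages_elsewhere)
print(f"estimated p-value: {p_value:.4f}")
```

A small p-value here would suggest that age is associated with membership in the subgroup. When many variables and subgroups are screened this way, a correction for multiple comparisons would normally be needed.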

Results of the TCDM Approach

The TCDM methodology was adopted to develop a prototype of a software platform that provides a computational environment in which researchers can perform data mining experiments on clinical datasets. We successfully applied the TCDM approach to several publicly available clinical studies.


Standard statistical tools are typically used to confirm (or refute) the hypotheses generated by an investigator and, hence, rely on the researcher’s ability to develop solid hypotheses. However, in the case of clinical trial datasets, the number of possible hypotheses to explore is very large, and it can be very difficult to select the most valuable ones. TCDM provides an integrated approach to data analysis and visualization that facilitates the extraction of new knowledge from clinical datasets.
