Intego Clinical was a proud participant in PHUSE US Connect 2021, which took place from June 14th to 18th. We’re excited to share that a few outstanding leaders from our Statistical teams delivered presentations at this year’s virtual conference. The event attracted 749 attendees, who took part through live presentations, on-demand recordings, themed panel discussions and hands-on workshops.
The following are the speakers from Intego Clinical and the presentations they delivered.
Andrii Borzenkov. Senior Statistical Programmer/Analyst
Andrii joined Intego Clinical as a SAS Programmer and moved up to Senior Statistical Programmer/Analyst during his 2 years here at the company.
Presentation: SDTM, ADaM and Database Normal Forms Implementation
Database normal forms are an important part of database theory that is often underutilized by clinical programmers. Implementing normal forms improves the implementation of CDISC standards and helps avoid potential logical errors. They may also improve internal documentation and project setup, increasing the overall efficiency of clinical SAS programming.
This paper provides an overview of database normal forms and their application to the SDTM and ADaM standards. We give examples showing how some implementations of CDISC standards are disadvantageous from a database theory perspective, and recommend ways these standards could be updated to satisfy normal form rules.
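To make the idea of a normal form concrete, here is a minimal Python sketch. It is not taken from the paper. It assumes a small, made-up lab dataset that uses the SDTM variable names USUBJID, LBSEQ, LBTESTCD, LBTEST and LBORRES, and the helper names `dependency_holds` and `decompose` are hypothetical. Because the test name depends only on the test code and not on the record key, storing both in every row is a transitive dependency of the kind third normal form is meant to remove.

```python
from typing import Dict, List, Sequence, Tuple

Row = Dict[str, object]


def dependency_holds(rows: List[Row], lhs: Sequence[str], rhs: Sequence[str]) -> bool:
    """Return True if every combination of lhs values maps to exactly one rhs value."""
    seen: Dict[Tuple, Tuple] = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


def decompose(rows: List[Row], lhs: Sequence[str], rhs: Sequence[str]) -> Tuple[List[Row], List[Row]]:
    """Split rows into a main table without rhs and a lookup table lhs -> rhs."""
    lookup: Dict[Tuple, Row] = {}
    main: List[Row] = []
    for row in rows:
        key = tuple(row[c] for c in lhs)
        lookup.setdefault(key, {c: row[c] for c in (*lhs, *rhs)})
        main.append({c: v for c, v in row.items() if c not in rhs})
    return main, list(lookup.values())


# Made-up records for illustration only; (USUBJID, LBSEQ) identifies a row.
lab_rows: List[Row] = [
    {"USUBJID": "S-001", "LBSEQ": 1, "LBTESTCD": "ALT", "LBTEST": "Alanine Aminotransferase", "LBORRES": "32"},
    {"USUBJID": "S-001", "LBSEQ": 2, "LBTESTCD": "AST", "LBTEST": "Aspartate Aminotransferase", "LBORRES": "28"},
    {"USUBJID": "S-002", "LBSEQ": 1, "LBTESTCD": "ALT", "LBTEST": "Alanine Aminotransferase", "LBORRES": "41"},
]

# LBTEST is determined by LBTESTCD alone, so it can live in its own table.
if dependency_holds(lab_rows, ["LBTESTCD"], ["LBTEST"]):
    results, tests = decompose(lab_rows, ["LBTESTCD"], ["LBTEST"])
    print(results)
    print(tests)
```

In the decomposed form, a test name is stored once per test code, so a mismatch between code and name cannot appear in individual result rows.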
Serhii Kashyn. Statistical Programmer/Analyst
Starting out as a student at Intego Clinical’s Center for Biostatistical Programming, Serhii joined the company as an intern and over the last 3 years moved up to Statistical Programmer/Analyst.
Presentation: Comparative characteristics of logistic regression implementation in SAS and R on the example of cardiovascular diseases
Programmers working on clinical trials often have to cluster patients into different groups. This raises the question of how to assign a person correctly to a particular group based on certain factors (feature selection, that is, choosing the features that are important for the model).
One possible method for answering this question is logistic regression. Cardiovascular data from a number of patients were used as material for studying logistic regression. Based on the Flow Mediated Dilatation method, a clustering variable was created with three groups: sufficient, insufficient and inadequate vascular reactivity.
Our purpose is to determine the relationship between the clustering variable and all the possible cardiovascular factors with the help of SAS and R, while exploring the differences in implementation between the two languages.
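As a rough illustration of fitting logistic regression to a three-level grouping variable, the sketch below trains a multinomial (softmax) model with plain gradient descent. Several things here are assumptions rather than details from the presentation: the choice of a multinomial rather than an ordinal model, the use of Python instead of SAS or R, the single scaled feature, the made-up data, and the function names.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(weights: List[List[float]], x: Sequence[float]) -> List[float]:
    """weights[k] holds [intercept, coefficient_1, ...] for class k."""
    return softmax([w[0] + sum(c * v for c, v in zip(w[1:], x)) for w in weights])


def log_loss(weights: List[List[float]], X: List[List[float]], y: List[int]) -> float:
    """Average negative log-likelihood of the observed classes."""
    return -sum(math.log(predict_proba(weights, x)[label]) for x, label in zip(X, y)) / len(X)


def fit(X: List[List[float]], y: List[int], n_classes: int,
        lr: float = 0.5, epochs: int = 2000) -> List[List[float]]:
    """Fit softmax regression by full-batch gradient descent."""
    n_features = len(X[0])
    weights = [[0.0] * (n_features + 1) for _ in range(n_classes)]
    for _ in range(epochs):
        grads = [[0.0] * (n_features + 1) for _ in range(n_classes)]
        for x, label in zip(X, y):
            probs = predict_proba(weights, x)
            for k in range(n_classes):
                err = probs[k] - (1.0 if k == label else 0.0)
                grads[k][0] += err
                for j, v in enumerate(x):
                    grads[k][j + 1] += err * v
        for k in range(n_classes):
            for j in range(n_features + 1):
                weights[k][j] -= lr * grads[k][j] / len(X)
    return weights


# Made-up, pre-scaled values of one hypothetical predictor.
# Class labels 0, 1 and 2 stand for the three vascular reactivity groups.
X = [[-1.0], [-0.8], [-0.6], [-0.2], [0.0], [0.2], [0.6], [0.8], [1.0]]
y = [0, 0, 0, 1, 1, 1, 2, 2, 2]

weights = fit(X, y, n_classes=3)
for value in ([-1.0], [0.0], [1.0]):
    print(value, [round(p, 3) for p in predict_proba(weights, value)])
```

The fitted coefficients play the role of the parameter estimates that a statistical package would report. A package would normally also supply standard errors and significance tests, which this sketch leaves out.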
Kostiantyn Drach. Consulting Data Scientist.
Kostiantyn joined Intego Clinical in 2016.
Presentation: Exploring categorical data using unsupervised graph-based machine learning
In this paper, we rely on graphs as a fundamental approach to structuring and analyzing clinical data. Combined with modern machine learning techniques, both supervised and unsupervised, graphs can be an effective solution for accelerated insight generation and real-world analytics. Unlike supervised learning, which relies on an existing hypothesis, unsupervised machine learning algorithms are naturally built to discover hidden insights in the data without prior knowledge. We applied unsupervised learning to explore a large publicly available clinical dataset of over 30,000 participants. The dataset, from the National Health Interview Survey, combines multiple questionnaires with categorical data. Unlike continuous data, which provide an accurate and reliable assessment for primary clinical endpoints, categorical data on either a nominal or ordinal scale can be challenging to analyze. The computational experiment revealed hidden patterns in the dataset that would have been difficult to discover using standard statistical methods for categorical data analysis.
Iryna Kotenko. Site Lead.
Iryna brings an exceptional 15 years of experience in clinical trial analysis using SAS and has been a part of Intego Clinical since 2012.
Presentation: COVID-19 outbreak visualization using clinical data science platform
Atlas-TDA is a clinical data science platform that enables researchers to extract topological models from clinical datasets represented in the form of graphs. The platform provides robust solutions for enhanced exploratory analysis, new hypothesis generation, risk-based monitoring and for many other challenges. Real-world data can be essential for understanding clinical data, especially with the emergence of phenomena such as the COVID-19 outbreak. Using Atlas-TDA, we analyze how the pandemic spread has advanced across the US. A topological model was extracted from several publicly available datasets and represented as a graph in which every node corresponds to a single county (over 3000 nodes), whereby two counties are connected with an edge only if they have similar patterns in the advance of the pandemic spread over a specific timeframe. This model helped discover a set of unrelated features that could potentially cause the similarity in epidemic growth across the US.
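The graph construction described above can be sketched in a few lines of Python. This is only an illustration and not how Atlas-TDA works internally: the similarity measure (Pearson correlation of weekly case counts), the 0.9 threshold, the county names, the numbers and the function names are all assumptions made for the example.

```python
import math
from itertools import combinations
from typing import Dict, List, Set


def pearson(a: List[float], b: List[float]) -> float:
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def build_similarity_graph(series_by_county: Dict[str, List[float]],
                           threshold: float = 0.9) -> Dict[str, Set[str]]:
    """One node per county; an edge joins two counties whose series correlate strongly."""
    graph: Dict[str, Set[str]] = {county: set() for county in series_by_county}
    for a, b in combinations(series_by_county, 2):
        if pearson(series_by_county[a], series_by_county[b]) >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


# Made-up weekly new-case counts over the same five-week window.
weekly_cases = {
    "County A": [10, 20, 40, 80, 160],
    "County B": [5, 11, 19, 42, 79],
    "County C": [160, 80, 40, 20, 10],
    "County D": [150, 90, 35, 25, 8],
}

graph = build_similarity_graph(weekly_cases)
for county, neighbours in graph.items():
    print(county, sorted(neighbours))
```

With these invented numbers, the two counties with rising counts are linked to each other, as are the two with falling counts, and no edge joins the two pairs.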
Topics during the event covered areas such as COVID-19, data science, emerging trends, the future of clinical trials, leadership and regulatory & submissions.
PHUSE is an independent, not-for-profit organization run by a worldwide team of volunteers. It is a global community and platform for the discussion of topics encompassing the work of data managers, biostatisticians, statistical programmers, data scientists and eClinical IT professionals. PHUSE has become the industry voice to regulatory agencies and standards organizations such as the FDA, EMA & CDISC.