
Data analytics for high-throughput image-based cohort phenotyping

Scientific big data analytics

Scientific Big Data Analytics (SBDA) has become an important instrument for tackling scientific problems characterized by the greatest data and computational complexity. The Helmholtz-Analytics-Framework (HAF) project generalizes and standardizes data analytics methods for high-performance computation, guided by use cases from five different scientific fields: earth system modeling, structural biology, aeronautics and aerospace research, medical imaging, and neuroscience. The objective is to establish a software platform composed of data analytics methods optimized for HPC systems and to make them available to a broad range of scientific communities. Our group focuses on the neuroimaging use case.

High-throughput image-based cohort phenotyping

Advanced medical research, as carried out by INM-1, faces the challenge of understanding how environmental or genetic influences correlate with and affect the resulting phenotypes (e.g., morphological structures, function, variability) in healthy or pathological tissue. The amount of imaging data to be analyzed has increased over the years and has pushed the storage, processing, and analysis of neuroimaging data to their limits. In particular, understanding the relations between environmental factors, genetic influences, and brain characteristics revealed by imaging techniques such as diffusion tensor imaging (DTI) or functional magnetic resonance imaging (fMRI) requires the analysis of hundreds or even thousands of subjects, which overstrains traditional computational infrastructure and traditional data analytics.


Figure 1: Illustration of neuroimaging pipelines. First step: Deriving brain features from raw MRI data. Second step: Analyzing features by means of data mining techniques, machine learning and inference methods. Image from the Connectivity group of the INM-1.

Our contribution

We generalize image processing pipelines, data mining techniques, uncertainty management, machine learning and inference methods to HPC systems. This requires novel approaches for standardized and automated high-throughput processing of structural, functional and connectivity imaging data as well as new data analytics methods.

To maximize synergies, SBDA methods developed within the neuroimaging use case are made available to other research fields, e.g., earth system modeling or structural biology, and methods generalized in other research areas are evaluated for their applicability to neuroimaging.

Our collaboration partners

This project is being conducted in collaboration with the Connectivity group of the INM-1.