Master's Thesis - Efficient Markov Chain Monte Carlo Techniques for Studying Large-scale Metabolic Models


As a leading research institution for microbial biotechnology, the Institute of Bio- and Geosciences – Biotechnology (IBG-1) focuses on the development of biotechnological processes for the sustainable bio-based production of pharmaceutical and chemical products. We investigate how microorganisms and isolated enzymes can be used to produce a variety of products from renewable raw materials. IBG-1 is a leading institution in process development for industrial biotechnology with increasingly miniaturized and automated experiments. The institute provides an excellent infrastructure for parallelized lab robotic experiments on microtiter plates. Various analytical methods are available for online and at-line measurements. These are combined with advanced digital technologies for data analysis, modeling, experimental design and process optimization. Our Modeling and Simulation Group offers an interdisciplinary and agile research environment within a young and dynamic team. The project is an excellent example of research at the interface of computational systems biology and mathematics/statistics, with a strong commitment to open research software development. For more information visit http://www.fz-juelich.de/ibg/ibg-1/modsim or http://github.com/modsim.

Quantifying the activity of enzymes operating within large-scale biochemical networks is a fundamental challenge in Systems Bio(tech)nology. Here, the unknown parameters must be inferred from models that are incomplete and data that contain errors. For such challenges, Bayesian analysis using Markov Chain Monte Carlo (MCMC) has become the gold standard.

To address high-dimensional parameter inference problems with Bayesian statistics, powerful MCMC methods have been proposed, for example Differential Evolution MCMC and Riemann manifold Langevin Monte Carlo. However, because of the specific structure of the inference problems that arise in metabolic models, these MCMC algorithms cannot be applied directly.

In this project, you will bring MCMC methods into the setting of metabolic flux inference and, drawing inspiration from existing algorithms, develop tailored MCMC algorithms. You will implement the resulting algorithms in an existing C++ framework, and validate and benchmark them on a realistic case study.

The main focus of the project can lean towards the mathematical theory of MCMC or the implementation of code for the Jülich supercomputers (GPU/CPU), or it can be combined with a practical modeling project.

  • You are highly motivated, with an interest in probability theory, mathematics, and data science.
  • Very good practical C++ and Python programming skills allow you to make your ideas happen.
  • You have a strong interest in curiosity-driven multidisciplinary research.

