Ideas for master's theses
Metagenomics
The ‘Metagenomics Toolkit’ project aims to analyse large amounts of data automatically using a cloud-based bioinformatics workflow. The workflow is implemented in Nextflow (https://www.nextflow.io) and can be extended using various programming languages (Python, Bash, R, etc.). It is to be extended with further analysis tools from the field of metagenomics.
Possible topics for theses:
- Evaluation of different strategies for metagenome assembly and binning
- Predicting resource consumption using AI methods
- Automatic evaluation of different workflow configurations
- Development of a ‘minimal workflow’ for high-throughput analysis of thousands of data sets
- Expansion of the toolkit with analysis tools for viral sequence data
- Further development of the TraceFlow tool for benchmarking Nextflow pipelines (insight into the resource consumption of tools over their entire runtime)
Cloud Computing
The German Network for Bioinformatics Infrastructure (de.NBI) provides its own federated cloud for the life sciences. The de.NBI cloud portal (https://cloud.denbi.de) offers SimpleVM as one possible project mode, through which virtual machines (VMs) can be started and managed. To further simplify the use of the cloud, SimpleVM is to be expanded.
Possible topics for theses:
- Provision and exchange of (large) amounts of data in virtual environments
- Monitoring of the resource consumption of individual virtual machines via the portal, and expansion of the SimpleVM system with appropriate visualisations
- Exchange of virtual environments between different de.NBI cloud locations
- Exchange, update and synchronisation of (large) bioinformatics databases across multiple de.NBI cloud locations