# Guest Student Programme 2016

## Abstracts

### Photons with Möbius Boundary

**Giovanni Iannelli, Physics Department, University of Pisa, Italy**

**Adviser: Kálmán Szabó**

In this report I will present an implementation of a pure-gauge 2D U(1) theory on a lattice. The objective is to measure the topological susceptibility, which, in the full QCD theory, is proportional to the mass squared of the axion, a candidate for dark matter. I will show that a Möbius boundary is useful to avoid charge freezing, the main problem that prevents a well-controlled extrapolation to the continuum limit. Furthermore, I will discuss how the topological charge is influenced by space topology, and why a new topological invariant arises.

### Floor Fields in JuPedSim

**Fabian Mack, Faculty of Chemistry and Biosciences, Karlsruhe Institute of Technology, Germany**

**Advisers: Arne Graf and Mohcine Chraibi**

For the JPScore module of JuPedSim, several changes have been made with the aim of speeding up the router and the direction strategy computation when floor fields are used. The implementation ideas are presented in detail in this report. Furthermore, the scaling behavior of the improved program has been investigated.

### A Task-Based Approach to Parallelize the FMM

**Laura Morgenstern, Applied Computer Science, Chemnitz University of Technology, Germany**

**Advisers: David Haensel and Andreas Beckmann**

The Fast Multipole Method (FMM) is a fast summation technique for computing the long- and short-range interactions in particle systems. Based on a modern, sequential C++ implementation of the method, we present and analyze a task-based intra-node parallelization approach by means of std::thread. In doing so, we utilize the concept of work-stealing as a dynamic load-balancing technique.

### Distributed CPU/GPU parallelization of the ChASE library

**Josip Žubrinić, Department of Mathematics, University of Zagreb, Croatia**

**Adviser: Edoardo di Napoli**

We investigate the possibilities of parallelizing the computationally most intensive part of the ChASE eigensolver. This part, called the Chebyshev filter, greatly increases the convergence rate of the algorithm. Parallelization is done on multiple levels, involving both multi-GPU and multi-CPU implementations. We propose an alternating-cycle implementation of the Chebyshev filter in which we reduce the required communication to a minimum.

### Large Eddy Simulations for Turbulence

**Suryanarayana Maddu, Department of Environment and Civil Engineering, Ruhr University Bochum, Germany**

**Adviser: Anne Severt**

Turbulence is one of the most important physical phenomena in flow and heat transport modeling. Accurate modeling and implementation of the physics is essential for comprehensive system analysis. To this end, we investigate Large Eddy Simulation for modeling turbulent flow in simple geometries as a preliminary analysis. Two common models have been implemented: Constant Smagorinsky and Dynamic Smagorinsky. Qualitative benchmarking for the diffusion problem is conducted and the algorithm is ported to GPU using OpenACC.

### Sorting and Administration of Particles in OpenCL

**Utkan Çalişkan, Computational Science and Engineering, Istanbul Technical University, Turkey**

**Advisers: Godehard Sutmann, Rene Halver, Willi Homberg**

Two parallel implementations of neighbor list techniques for particle-based simulations are presented. The first technique is based on a linked-cell approach, which is commonly used in Molecular Dynamics, whereas the second is a container-based approach, which collects particles in chunks of finite size representing cells. Both algorithms were implemented in OpenCL in order to achieve portability between different compute architectures. For this project the implementations were compared on Intel Xeon Phi and NVIDIA GPU.
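As a rough illustration of the linked-cell idea mentioned above, the following is a minimal serial Python sketch, not the OpenCL code of the report. It assumes a 2D system with open boundaries and a cell edge equal to the cutoff radius; the function names `build_cells` and `neighbor_pairs` are hypothetical.

```python
from collections import defaultdict
from itertools import product


def build_cells(positions, cell_size):
    """Map each integer cell index (ix, iy) to the ids of the particles inside it."""
    cells = defaultdict(list)
    for pid, (x, y) in enumerate(positions):
        cells[(int(x // cell_size), int(y // cell_size))].append(pid)
    return cells


def neighbor_pairs(positions, cutoff):
    """Return all pairs (i, j) with i < j whose distance is below `cutoff`."""
    # Assumption: cell edge == cutoff, so interacting partners can only sit
    # in the same cell or in one of the eight adjacent cells.
    cells = build_cells(positions, cutoff)
    pairs = []
    for (cx, cy), members in cells.items():
        for dx, dy in product((-1, 0, 1), repeat=2):
            # .get() avoids creating empty cells while iterating
            for j in cells.get((cx + dx, cy + dy), ()):
                for i in members:
                    if i < j:
                        xi, yi = positions[i]
                        xj, yj = positions[j]
                        if (xi - xj) ** 2 + (yi - yj) ** 2 < cutoff ** 2:
                            pairs.append((i, j))
    return sorted(pairs)
```

Only particles in the same or adjacent cells are tested against each other, which is what reduces the cost compared with checking all pairs.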

### Brain Cortex Segmentation using Deep Learning

**Monika Bajcer, Department of Mathematics, University of Zagreb, Croatia**

**Adviser: Morris Riedel**

Deep learning is a new field in Machine Learning that uses algorithms that are able to create data abstraction models. Among its most important tools are artificial neural networks, which were inspired by the attempt to mimic biological neural networks. Our goal is to train a computer to detect gray matter in a human brain and, to do so, to build a fully convolutional neural network. We have been inspired by GoogLeNet to build an inception model of a neural network to segment brain images.

### Brain simulators on JULIA: Initial performance evaluation

**Patrick Emonts, RWTH Aachen University, Germany**

**Advisers: Dirk Pleiter and Marcus Richter**

In this report, we describe an initial performance analysis of the prototype Cluster JULIA installed in the framework of the Human Brain Project (HBP). Special care is taken to understand the memory hierarchy of the Intel Xeon Phi Knights Landing processor. Finally, the performance of the neural simulator NEST is evaluated.

### Independent Component Analysis on PLI Brain Images

**Fabian Preiß, Fakultät Mathematik und Naturwissenschaften, Universität Wuppertal, Germany**

**Adviser: Oliver Bücker**

3D-Polarized Light Imaging (3D-PLI) is a promising tool for mapping the fiber tracts of the human brain on both small and large scales. In the imaging process the signal is degraded by several sources of noise, whose reduction is important for the quality of the 3D reconstruction. A parallelized method based on Independent Component Analysis (ICA) was successfully adapted to the JURECA supercomputer environment and extended to separate components based on whether their distribution is sub- or super-Gaussian.

### Pixels, Matrices, and Circles on GPU

**Filip Srnec, Department of Mathematics, University of Zagreb, Croatia**

**Adviser: Andreas Kleefeld**

In this paper, a GPU implementation of the approach to color morphology based on Loewner order and Einstein addition, first introduced by B. Burgeth and A. Kleefeld, is presented. The implementation of the basic gray-scale morphology operations erosion and dilation is demonstrated and used as a basis for the suggested approach. The conversion from RGB space to a symmetric matrix field is introduced, and a fast way of comparing matrices using the Loewner ordering by solving the smallest enclosing circle of circles problem is demonstrated. Additionally, a way of implementing higher-order morphological operations such as top-hats and gradients using the Einstein addition and two GPU devices is introduced.

### Gray-Scott simulations with SDC and DUNE

**Mia Jukić, Department of Mathematics, University of Zagreb, Croatia**

**Advisers: Robert Speck and Ruth Schöbel**

Reaction and diffusion of chemical species can produce a variety of patterns. The Gray-Scott equations model such a reaction. For the numerical simulation of the Gray-Scott model in this work, Spectral Deferred Correction (SDC) is used to discretize the time domain, and the Finite Element Method (FEM) is then used to solve the system of PDEs.

### Alternative Communication Methods for Stencil-Based Operations in MPI

**Lukas Mazur, Department of Physics, Bielefeld University, Germany**

**Adviser: Stefan Krieg**

Most simulations in Quantum Chromodynamics (QCD) follow a specific communication pattern. We present an implementation that uses one-sided communication. First, we explain in detail how the communication pattern can be implemented with two-sided communication calls. Then we show how to improve this pattern using one-sided communication and by introducing double buffering. Finally, we present measurements that show the performance differences between both methods.

### A Violin Plot Plug-in for Cube

**Robert Poenaru, University of Bucharest, Romania**

**Adviser: Pavel Saviankou**

Cube is open-source software for displaying performance data of HPC applications. Extending the Cube toolset with a plug-in capable of producing violin plots for numerical data sets will be of great help to the HPC community dealing with performance analysis. In this paper, a step-by-step description of the Violin Plot Plug-in is given, from the mathematical concept of a violin plot to improvements of the algorithm and performance measurements.
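For readers unfamiliar with the concept, a violin plot is commonly drawn as a kernel density estimate of the data, mirrored about an axis. The following minimal Python sketch illustrates that idea only; it is not taken from the plug-in. It assumes a Gaussian kernel and a bandwidth supplied by the caller, and the function names `gaussian_kde` and `violin_outline` are hypothetical.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function x -> estimated density, using a Gaussian kernel."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def violin_outline(samples, bandwidth, points=50):
    """Return (value, half_width) pairs between the sample minimum and maximum.

    Drawing +half_width and -half_width around a central axis gives the
    violin shape. Assumes points >= 2.
    """
    lo, hi = min(samples), max(samples)
    density = gaussian_kde(samples, bandwidth)
    values = [lo + (hi - lo) * k / (points - 1) for k in range(points)]
    return [(v, density(v)) for v in values]
```

The bandwidth controls how smooth the outline is: small values follow individual samples, large values blur them together.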

## Group Photo

Copyright: Forschungszentrum Jülich

*left to right:*

*front row:* Utkan Çalişkan, Filip Srnec, Giovanni Iannelli, Lukas Mazur, Robert Poenaru

*back row:* Fabian Mack, Monika Bajcer, Josip Žubrinić, Mia Jukić, Suryanarayana Maddu, Laura Morgenstern, Fabian Preiß, Patrick Emonts