# Guest Student Programme 2012

## Proceedings

Winkel, Mathias (Ed.) (2012)

Technical Report FZJ-JSC-IB-2012-01, November 2012

## Abstracts

### Embedding Plasticity into a Parallel Green’s Function Molecular Dynamics Code

**Barbara Schlögl, University of Würzburg**

**Adviser: Prof. Dr. Martin Müser (NIC), Mykola Prodanov (NIC)**

Numerical simulation of plastic deformation of solid bodies is still a challenging field in contact mechanics. In this work we describe four different approaches for incorporating plastic deformation into a parallel Green’s function molecular dynamics (GFMD) code originally developed for handling only the elastic case. The first approach, local deformation in real space, suffers from the fact that its optimization is not universal over a range of pressures, as well as from discontinuities in the displacement field. The second approach, pressure-dependent deformation that takes into account the position of nearest neighbours, produces ringing artifacts from the transformation to Fourier space. While the last two approaches seem promising, they need further investigation to confirm or refute their feasibility.

### Multigrid, conjugate gradient solver for Reynolds thin film equation

**Sebastian Rupprecht, University of Augsburg**

**Adviser: Prof. Dr. Martin Müser (NIC), Dr. Wolf Dapp (NIC)**

In this report, we study the flow between two solids with rough surfaces, as described by the Reynolds thin-film equation. After deriving the equation’s weak formulation by variational calculus, we calculate the current with a conjugate gradient method. To improve runtime performance, we implement sparse matrices and compare various iterative solvers and preconditioners. We show that the conjugate gradient method is clearly superior to a local solver. This holds for both approaches to modelling the contact area between the two surfaces: the traditional cutoff model and the elastic model.
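As a rough illustration of the conjugate gradient iteration mentioned above, the following minimal sketch solves a small symmetric positive definite system. It is not the solver from the report: the matrix-free interface and the names `conjugate_gradient` and `laplacian_1d` are assumptions made for this example, and a plain 1D Laplacian stencil stands in for the discretized Reynolds operator.

```python
def conjugate_gradient(matvec, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite A, given only x -> A x."""
    x = [0.0] * len(b)
    r = list(b)                      # residual b - A x for the zero initial guess
    p = list(r)                      # first search direction
    rs_old = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:      # stop once the residual norm is small enough
            break
        Ap = matvec(p)
        alpha = rs_old / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = sum(ri * ri for ri in r)
        # New direction: current residual plus a multiple of the old direction.
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


def laplacian_1d(v):
    """Apply the tridiagonal stencil (-1, 2, -1) with zero boundary values.

    Hypothetical stand-in for a sparse operator; only nonzero entries are used.
    """
    n = len(v)
    out = []
    for i in range(n):
        left = v[i - 1] if i > 0 else 0.0
        right = v[i + 1] if i < n - 1 else 0.0
        out.append(2.0 * v[i] - left - right)
    return out


if __name__ == "__main__":
    print(conjugate_gradient(laplacian_1d, [1.0] * 5))
```

Passing the operator as a function means the matrix is never stored densely, which is the same motivation as using sparse matrices.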

### Visualizing Complex Functions Using GPUs

**Khaldoon Ghanem, RWTH Aachen University**

**Adviser: Prof. Dr. Erik Koch (GRS)**

This document explains some common methods of visualizing complex functions and how to implement them on the GPU. Using the fragment shader, we visualize complex functions in the complex plane with the domain coloring method. Then, using the vertex shader, we visualize complex functions defined on the unit sphere, such as spherical harmonics. Finally, we redesign the marching tetrahedra algorithm to work on GPGPU frameworks and use it for visualizing complex scalar fields in 3D space.
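The per-pixel mapping behind domain coloring can be sketched on the CPU as follows. This is only an illustration of the idea, not the shader code from the report: the function names and the particular color convention (hue from the argument, lightness from the modulus) are assumptions chosen for this example.

```python
import cmath
import colorsys
import math


def domain_color(w):
    """Map a complex value w to an (r, g, b) triple with components in [0, 1]."""
    # Hue encodes the argument of w, wrapped to [0, 1).
    hue = (cmath.phase(w) % (2.0 * math.pi)) / (2.0 * math.pi)
    # Lightness encodes the modulus: 0 at w = 0, approaching 1 as |w| grows.
    lightness = 1.0 - 1.0 / (1.0 + abs(w))
    return colorsys.hls_to_rgb(hue, lightness, 1.0)


def render(f, xs, ys):
    """Evaluate f on the grid xs x ys and return rows of RGB triples."""
    return [[domain_color(f(complex(x, y))) for x in xs] for y in ys]


if __name__ == "__main__":
    grid = [i / 2.0 for i in range(-4, 5)]
    image = render(lambda z: z * z, grid, grid)
    print(len(image), len(image[0]))
```

On a GPU, a fragment shader would evaluate the equivalent of `domain_color(f(z))` independently for every pixel, which is why the method parallelizes so naturally.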

### Porting and optimization of EPOCH (a Particle-In-Cell code) to Blue Gene/Q

**David Martin Rodriguez, University of Salamanca**

**Adviser: Dr. Anupam Karmakar (JSC)**

Laser-plasma interaction can be simulated with Particle-In-Cell (PIC) codes. EPOCH, an MPI-parallelized PIC code, was running on a Blue Gene/P architecture at the Jülich Supercomputing Centre (JSC). Since JSC was upgrading the hardware to a Blue Gene/Q, the code needed to be ported to the new architecture. Here we analyze the parallelization of the code and discuss the possible need for hybrid parallelization to adapt the code to the multicore nature of Blue Gene/Q. To this end, we implement several hybrid parallelization strategies and analyze their impact on code performance.

### Optimization of Lattice QCD kernels for Blue Gene/Q

**Christian Jost, University of Bonn**

**Adviser: Dr. Stefan Krieg (JSC), Prof. Dr. Dirk Pleiter (JSC)**

In this project, the QDP++ library of the USQCD software package was optimized for the Blue Gene/Q supercomputer. The sublibrary libintrin was recoded and works correctly. Due to compiler problems with the templates used, the C++ code was recoded but could not be compiled. While the new library works correctly, its full integration into QDP++ could not be completed within this project. Because the tests were limited, not every aspect of the code could be tested thoroughly, but a problem with the data alignment in memory was found and fixed. The amount of data used was very small, and the performance of the code can be expected to increase with larger problem sizes due to better usage of the resources.

### Graph 500 benchmarking using flash memory cards

**Tommaso Zanca, University of Ferrara**

**Adviser: Prof. Dr. Dirk Pleiter (JSC)**

The performance of supercomputers is measured using benchmarks, in which the machine processes a large amount of data. Graph 500 is a benchmark designed to test the speed of access to data with highly irregular patterns, typical of graph structures. Applications of this kind easily exceed the available volatile memory, so additional memory from external devices is necessary. Our analysis focused on flash memory cards, which offer faster data access than hard disks and capacities of several hundred GBytes. The performance measurements were performed on the JUNIORS computer cluster installed at Forschungszentrum Jülich.

### Generating parallel random numbers: As easy as 1, 2, 3?

**Artur Strebel, University of Wuppertal**

**Adviser: Oliver Bücker (JSC), Dr. Wolfgang Meyer (JSC)**

In this paper, a general introduction to random number generation is given and some of the most common pseudo-random number generators (PRNGs) are described. Lately these PRNGs have encountered some problems with modern computer architectures. I also present a novel approach to random number generation proposed by John K. Salmon et al. in [1]: the Random123 library. It contains three PRNGs: the first two are based on cryptographic standards (AES and Threefish), while the third takes a new approach. All three PRNGs have excellent statistical properties and produce at least 2^{64} × 2^{128} random numbers while maintaining good performance on both multi-core and single-core architectures. These PRNGs have been compared with the SPRNG library, which is currently used at the Forschungszentrum Jülich.
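The counter-based idea behind generators of this kind can be illustrated with a short sketch: each random number is a pure function of a key and a counter, so parallel workers need no shared state. The code below is an assumption-laden stand-in, not the Random123 API or any of its algorithms; it uses BLAKE2b from Python's `hashlib` as the mixing function, and the names `counter_rng` and `stream` are hypothetical.

```python
import hashlib
import struct


def counter_rng(key, counter):
    """Return a float in [0, 1) that depends only on (key, counter).

    Illustrative stand-in: a cryptographic hash plays the role of the
    keyed mixing function; key and counter must fit in 64 bits each.
    """
    block = struct.pack("<QQ", key, counter)
    digest = hashlib.blake2b(block, digest_size=8).digest()
    (value,) = struct.unpack("<Q", digest)
    # Keep the top 53 bits, the precision of a double-precision mantissa.
    return (value >> 11) / float(1 << 53)


def stream(key, start, count):
    """Numbers for counters start, start + 1, ..., start + count - 1."""
    return [counter_rng(key, start + i) for i in range(count)]


if __name__ == "__main__":
    # Two "workers" sharing one key reproduce the serial sequence exactly.
    serial = stream(7, 0, 8)
    parallel = stream(7, 0, 4) + stream(7, 4, 4)
    print(serial == parallel)
```

Because no state is carried from one call to the next, any worker can jump directly to its own range of counters, which is what makes the approach attractive on parallel architectures.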

### Design and Implementation of an Experimental Finite Element Solver

**Daniel Arndt, University of Göttingen**

**Adviser: Dr. Mike Nicolai (JSC)**

In Finite Element applications it is often desired to have a modular design in order to change different parts of the program easily. However, in the classical assembly approach, e.g., the element type and the physical problem are strongly coupled. In this project a Visitor Pattern is used to overcome this coupling. It turns out that this is a promising approach for a modular design. All basic functionalities of a Finite Element code were implemented and no serious problem occurred.
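How a Visitor Pattern can separate element types from physical problems may be sketched as follows. This is a hypothetical miniature, not the solver from the report: all class and function names are invented for illustration, and the local matrices are the standard ones for 1D linear and quadratic Lagrange elements of length `h`.

```python
class LinearSegment:
    """1D element with two nodes and length h."""
    def __init__(self, h):
        self.h = h

    def accept(self, visitor):
        return visitor.visit_linear(self)


class QuadraticSegment:
    """1D element with three nodes (left, middle, right) and length h."""
    def __init__(self, h):
        self.h = h

    def accept(self, visitor):
        return visitor.visit_quadratic(self)


class StiffnessVisitor:
    """Local matrices of the 1D Laplace operator."""
    def visit_linear(self, e):
        c = 1.0 / e.h
        return [[c, -c], [-c, c]]

    def visit_quadratic(self, e):
        c = 1.0 / (3.0 * e.h)
        return [[7 * c, -8 * c, c], [-8 * c, 16 * c, -8 * c], [c, -8 * c, 7 * c]]


class MassVisitor:
    """Local mass matrices (integrals of products of shape functions)."""
    def visit_linear(self, e):
        c = e.h / 6.0
        return [[2 * c, c], [c, 2 * c]]

    def visit_quadratic(self, e):
        c = e.h / 30.0
        return [[4 * c, 2 * c, -c], [2 * c, 16 * c, 2 * c], [-c, 2 * c, 4 * c]]


def local_matrices(elements, visitor):
    """The loop knows neither the element types nor the physical problem."""
    return [element.accept(visitor) for element in elements]


if __name__ == "__main__":
    mesh = [LinearSegment(0.5), QuadraticSegment(0.25)]
    print(local_matrices(mesh, StiffnessVisitor()))
    print(local_matrices(mesh, MassVisitor()))
```

In this arrangement a new physical problem is added by writing one more visitor class, without touching the element classes or the loop over elements.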

### A parallel block iterative eigensolver optimized for sequences of correlated eigenproblems

**Mario Berljafa, University of Zagreb**

**Adviser: Dr. Edoardo Di Napoli (JSC)**

In many materials science applications, simulations are made of dozens of sequences; each sequence groups together eigenproblems with increasing self-consistent cycle outer-iteration index. Successive eigenproblems in a sequence possess a high degree of correlation. In particular, it was demonstrated that eigenvectors of adjacent eigenproblems become progressively more collinear to each other as the outer-iteration index increases. This result suggests one could use eigenvectors computed at a certain outer-iteration as approximate solutions to improve the performance of the eigensolver at the next one. In order to exploit this correlation, we developed a block iterative eigensolver and showed the benefit of using approximate rather than random starting vectors. Moreover, we showed that the algorithm performs substantially better than the corresponding direct eigensolver, even for a significant portion of the sought spectrum.

### Observation of a Universal Boltzmann Distribution in Dynamic Simulation Experiments of the 1D-Heisenberg Spin Model

**Kieran Austin, University of Leipzig**

**Adviser: Dr. Thomas Neuhaus (JSC)**

In this exploratory study, computer simulations of the classical Heisenberg spin model are carried out in the microcanonical ensemble. It is shown that the Boltzmann distribution can be generated in a dynamical simulation for various settings of the system parameters. The temperature was measured with the use of a novel expression and its correctness is verified. A short outlook is given for the use of this new tool.

### Efficient Communication Schemes for Stochastic Thermostats in parallel MD Simulations

**Felix Uhl, Bochum University**

**Adviser: Dr. Viorel Chihaia (JSC), Dr. Godehard Sutmann (JSC)**

A new algorithm for the parallelization of the Lowe-Andersen thermostat is presented. The implemented algorithm allows for better control of the system’s temperature compared to the original implementation in the IBIsCO code. This algorithm is also more efficient than the original one, obtaining a speedup of up to a factor of 14, depending on the chosen processor scheme.

### Identification of Gravity Waves in AIRS Brightness Temperatures

**Anne Springer, University of Bonn**

**Adviser: Dr. Lars Hoffmann (JSC)**

The Atmospheric Infrared Sounder (AIRS) provides infrared radiance data, which are used to calculate brightness temperatures. Stratospheric gravity waves can be found in temperature perturbation data. A toolbox is developed to identify gravity waves in the AIRS data and to analyse their properties. This information can be used to gain a better understanding of gravity wave sources and their propagation in the stratosphere. The output of our toolbox consists of statistics of amplitudes and wavevectors. Case studies give information about the functionality of the toolbox and reveal the dependency of the results on certain control parameters. Gravity waves are successfully detected, and the wavevectors and corresponding amplitudes are determined with good accuracy.

### Information sharing and collaboration between agents in evacuation simulations

**David Haensel, Dresden University of Technology**

**Adviser: Dr. Ulrich Kemloh (JSC)**

This work describes a graph-based navigation algorithm for pedestrian dynamics simulation in case of evacuation. The main goal of every pedestrian is to leave the building via the shortest path. In addition to an implementation of the classical shortest path strategy, an information gathering and sharing scheme is modelled. We introduce a reasoning structure for the agents in the simulation. They are able to notice closed or broken escape routes and share this information with other pedestrians in their vicinity. We qualitatively analyzed the influence of the radius and the information propagation speed in an office building.

## Group Photo

Persons on the photo:

*left to right:* Thomas Neuhaus, Mathias Winkel, Lars Hoffmann, Khaldoon Ghanem, Kieran Austin, Ulrich Kemloh, Daniel Arndt, Erik Koch, Barbara Schlögl, Mike Nicolai, David Haensel, Daniel Martin Rodriguez, Anne Springer, Sebastian Rupprecht, Tommaso Zanca, Christian Jost

*missing:* Artur Strebel, Mario Berljafa, Felix Uhl