Institute for Advanced Simulation (IAS)

Guest Student Programme 2010


Speck, Robert and Winkel, Mathias (Eds.) (2010)
 Proceedings 2010, JSC Guest Student Programme on Scientific Computing (PDF, 7 MB)
Technical Report IB-2010-04, December 2010


Implementation and Evaluation of Integrators for the Fast Multipole Method

Valentina Banciu, University of Bucharest
Adviser: Oliver Bücker, Ivo Kabadshow

Among alternative linear-scaling and/or low-cost methods, the Fast Multipole Method (FMM) has become the method of choice in applications where accurate potentials are required. In this report, we revisit the operators and steps of the method and analyze different particle integrators with respect to accuracy, stability, efficiency, and memory footprint.  Colloquium Talk (PDF, 810 kB)

Hardware and Software Routing on the QPACE parallel computer

Konstantin Boyanov, DESY Zeuthen
Adviser: Willi Homberg

The torus network of QPACE, currently the most energy-efficient parallel computer in the world, has so far allowed only nearest-neighbor communication. This communication pattern is sufficient for numerical simulations in Quantum Chromodynamics, the initial target application of QPACE. Nevertheless, extending the torus network to any-to-any communication is very important for broadening the spectrum of applications that can take advantage of QPACE's high-performance, low-energy parallel architecture. Possible extensions to the custom-designed Torus Network (TNW) are considered and a simple, low-overhead routing algorithm is proposed. Furthermore, the proposed algorithm is implemented and tested in the OMNeT++ event-based simulation environment. We show that the implemented simulation model is a good representation of the real hardware, and we test and verify the algorithm implementation using communication patterns as they occur during matrix transposition.  Colloquium Talk (PDF, 653 kB)
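The abstract does not spell out the proposed routing scheme, but the general idea of multi-hop routing on a torus that it builds on can be illustrated with a textbook example. The sketch below shows dimension-ordered routing on a 3D torus, resolving one dimension at a time and taking the shorter wrap-around direction in each; it is a generic illustration only, not the algorithm implemented for the TNW.

```python
def torus_route(src, dst, dims):
    """Dimension-ordered route on a torus.

    Resolve the coordinates one dimension at a time (x, then y, then z),
    choosing in each dimension whichever direction around the ring is
    shorter.  Returns the list of nodes visited, including src and dst.
    """
    route = [tuple(src)]
    cur = list(src)
    for d, size in enumerate(dims):
        delta = (dst[d] - cur[d]) % size
        if delta > size // 2:
            delta -= size          # wrapping backwards is shorter
        step = 1 if delta > 0 else -1
        for _ in range(abs(delta)):
            cur[d] = (cur[d] + step) % size
            route.append(tuple(cur))
    return route

# Route from corner to corner on a 4x4x4 torus: x wraps backwards
# (one hop), y takes one hop forward, z takes two hops forward.
path = torus_route((0, 0, 0), (3, 1, 2), (4, 4, 4))
```

Each hop changes exactly one coordinate by one (modulo the ring size), so every step in the returned path is a legal nearest-neighbor link of the torus.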

Development of a parallel, tree-based neighbour-search algorithm

Andreas Breslau, Cologne University
Adviser: Paul Gibbon

In astrophysics it is quite common to use a combination of an N-body code and an SPH code for the computation of self-gravitating matter. SPH is a Lagrangian method in which the particles serve as the discrete elements of a fluid description. Thermodynamic properties are computed at the particle positions from averages over neighbouring particles; to do this, the nearest neighbours of each particle must be known. This article describes the implementation of a neighbour-search algorithm using the tree code PEPC. The algorithm is based on the existing routine for the force summation, adapted to return neighbour lists instead of multipoles. The correctness of the parallel neighbour search is verified using both visual and quantitative tests. The scaling of the algorithm with particle and process number is shown to be O(N log N) or better.  Colloquium Talk (PDF, 3 MB)
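The output of such a search is one list per particle holding the indices of its k nearest neighbours. The sketch below computes exactly those lists by brute force, as a minimal reference for what the tree-based algorithm produces; the tree version reaches O(N log N) by pruning distant branches during traversal, machinery that is omitted here.

```python
import heapq
import random

def neighbour_lists(particles, k):
    """For each particle, return the indices of its k nearest neighbours.

    Brute-force O(N^2) reference implementation: compute all squared
    distances and keep the k smallest.  A tree-based search returns the
    same lists while avoiding most of the distance evaluations.
    """
    lists = []
    for i, (xi, yi, zi) in enumerate(particles):
        dist2 = [((xi - x) ** 2 + (yi - y) ** 2 + (zi - z) ** 2, j)
                 for j, (x, y, z) in enumerate(particles) if j != i]
        lists.append([j for _, j in heapq.nsmallest(k, dist2)])
    return lists

random.seed(0)
pts = [(random.random(), random.random(), random.random()) for _ in range(50)]
nbrs = neighbour_lists(pts, k=5)
```

Such a brute-force version is also useful as a ground truth when verifying a parallel tree search quantitatively, as described above.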

Implementation of a Parallel I/O Module for the Particle-in-Cell Code PSC

Axel Hübl, Dresden University of Technology
Adviser: Anupam Karmakar

An efficient parallel I/O module for the particle-in-cell code PSC has been developed using the highly scalable library SIONlib, harnessing a one-file-for-all-tasks strategy. This module enables efficient production runs on large-scale HPC systems. Its performance has been extensively tested and compared with the existing one-file-per-task I/O module. The new implementation substantially reduces the resource requirements of data dumping as well as of post-processing for the code PSC.  Colloquium Talk (PDF, 655 kB)

Pedestrian dynamics: Implementation and analysis of ODE-solvers

Timo Hülsmann, Wuppertal University
Adviser: Ulrich Kemloh, Mohcine Chraibi

In this report two approaches to run-time optimization of the General Centrifugal Force Model (GCFM) are analysed. First we give a short introduction to modeling pedestrian dynamics and introduce the GCFM. The first approach, presented in the second part of this report, orders the pedestrians' data in local memory using space-filling curves to preserve data locality and thereby reduce cache misses; a small decrease in computation time was achieved. In the third part, different ODE solvers with fixed and varying step sizes are investigated: the Velocity Verlet method and Runge-Kutta-Fehlberg methods of different orders. Here an increase in the average step size was achieved.  Colloquium Talk (PDF, 656 kB)
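Of the solvers named above, Velocity Verlet is the simplest: it is second-order accurate, symplectic, and needs only one force evaluation per step. A minimal sketch, demonstrated on a harmonic oscillator rather than on the GCFM force field:

```python
import math

def velocity_verlet(x, v, accel, dt, steps):
    """Integrate x'' = accel(x) with the Velocity Verlet scheme.

    Per step: half-step the position update with the old acceleration,
    then average old and new accelerations for the velocity update.
    Only one call to accel() is needed per step.
    """
    a = accel(x)
    for _ in range(steps):
        x = x + v * dt + 0.5 * a * dt * dt
        a_new = accel(x)
        v = v + 0.5 * (a + a_new) * dt
        a = a_new
    return x, v

# Harmonic oscillator x'' = -x, started at x=1, v=0: exact solution
# x(t) = cos(t).  Integrate to t = 6.28, just short of one period.
x, v = velocity_verlet(1.0, 0.0, lambda x: -x, dt=0.01, steps=628)
```

Because the scheme is symplectic, the energy 0.5*v**2 + 0.5*x**2 stays close to its initial value 0.5 over the whole trajectory instead of drifting, which is why Verlet-type integrators are popular for long force-model simulations.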

Integration of high order compact scheme into Multigrid

Alina Georgiana Istrate, Wuppertal University
Adviser: Godehard Sutmann

A 6th-order compact difference scheme was implemented in a particle-particle particle-mesh code for molecular simulation, in which a multigrid method is used for solving the 3D Poisson equation.  Colloquium Talk (PDF, 889 kB)

Statistical Modelling of Protein Folding

Julie Krainau, Humboldt University Berlin
Adviser: Sandipan Mohanty, Jan H. Meinke

The open-source protein folding and aggregation software ProFASi implements a physics-based approach for studying protein folding and thermodynamics using Monte Carlo simulations. In this report, the main ideas of this approach are outlined, along with a qualitative presentation of the various terms of ProFASi's force field and a newly developed method for simulations with constraints. Simulation results for a simple helical peptide, a 73-residue 3-helix bundle protein, and a 76-residue α/β protein are presented and compared.  Colloquium Talk (PDF, 3 MB)

Domain Distribution for parallel Modeling of Root Water Uptake

Martin Licht, Bonn University
Adviser: Natalie Schröder, Bernd Körfgen

Towards a parallel simulation of water transport in coupled soil-root systems, we analyze several strategies for distributing the soil domain in alignment with root geometries. Our results tentatively point to the potential for a well-scaling simulation when combined with adaptive mesh refinement and multithreaded root simulation. We identify technical and conceptual questions that might emerge along this direction. For our investigations we extended the MPI program parSWMS with a basic root model.  Colloquium Talk (PDF, 6 MB)

Analysis Tools for the Results of Scalasca

Markus Mayr, Vienna University of Technology
Adviser: Brian Wylie, Bernd Mohr

Scalasca is a tool set for the performance analysis of parallel applications. As with any software, Scalasca must be tested to detect errors. One of the main steps of the testing procedure is searching Scalasca's analysis reports for errors, which can be of two kinds: the output can be ill-formed, or the measurement or analysis data can be wrong. Both kinds of errors can be detected automatically. We provide tools that analyze Scalasca's output and report errors. This report gives an overview of this set of testing tools and the library these tools are built upon.  Colloquium Talk (PDF, 297 kB)

Modeling of doubly-connected fields of CPV/T solar collectors

Yosef Meller, Tel Aviv University
Adviser: Bernhard Steffen

 Colloquium Talk (PDF, 723 kB)

Towards Optimized Parallel Tempering Monte Carlo

Marco Müller, Leipzig University
Adviser: Thomas Neuhaus, Michael Bachmann (IFF)

Parallel tempering Monte Carlo methods are an important tool for numerical studies of highly complex models, since interesting questions, such as the origin of phase transitions and structure formation, can only be tackled by means of statistical analysis. This work introduces the method and discusses ways of improving its performance on many-core architectures.  Colloquium Talk (PDF, 814 kB)
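The core of parallel tempering is simple: several replicas of the system run ordinary Metropolis sampling at different inverse temperatures, and adjacent replicas periodically attempt to swap configurations with probability min(1, exp(Δβ·ΔE)). A minimal sketch on a toy double-well potential (the model, step sizes, and schedule here are illustrative choices, not taken from the report):

```python
import math
import random

def metropolis_step(x, beta, energy, rng, step=0.5):
    """One local Metropolis move at inverse temperature beta."""
    x_new = x + rng.uniform(-step, step)
    d_e = energy(x_new) - energy(x)
    if d_e <= 0 or rng.random() < math.exp(-beta * d_e):
        return x_new
    return x

def swap_accepted(beta_i, beta_j, e_i, e_j, rng):
    """Replica-exchange criterion: accept with prob min(1, exp(db*de))."""
    delta = (beta_i - beta_j) * (e_i - e_j)
    return delta >= 0 or rng.random() < math.exp(delta)

def parallel_tempering(energy, betas, sweeps, rng):
    """Run one replica per beta; each sweep does local moves, then
    attempts one swap between a random adjacent pair of replicas."""
    xs = [0.0] * len(betas)
    for _ in range(sweeps):
        xs = [metropolis_step(x, b, energy, rng) for x, b in zip(xs, betas)]
        i = rng.randrange(len(betas) - 1)
        if swap_accepted(betas[i], betas[i + 1],
                         energy(xs[i]), energy(xs[i + 1]), rng):
            xs[i], xs[i + 1] = xs[i + 1], xs[i]
    return xs

rng = random.Random(1)
double_well = lambda x: (x * x - 1.0) ** 2   # minima at x = +/-1
xs = parallel_tempering(double_well, [5.0, 2.0, 0.5], 2000, rng)
```

The hot replicas cross the barrier between the two wells easily, and swaps feed those barrier crossings down to the cold replicas, which is what lets the method escape local minima. Since the replicas evolve independently between swap attempts, they map naturally onto separate cores, which is the starting point for the many-core optimizations discussed in the report.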

Scaling of Linear Algebra Library Routines on the IBM BlueGene/P System JUGENE

Elin Solberg, University of Gothenburg
Adviser: Inge Gutheil

Three different solvers for the dense real symmetric eigenproblem are available in the parallel linear algebra library ScaLAPACK; a fourth one, building on the MR3 algorithm, is planned for inclusion in a future, not yet announced version of the library. This report presents the results of benchmarking the three library solvers as well as a still-experimental version of the MR3 solver. The benchmarking was performed on the IBM BlueGene/P system JUGENE, using up to 8192 cores to solve problems with a maximum matrix size of 122880 × 122880. Two main cases were investigated: (1) eigenvalues randomly spread in a given interval and (2) massively clustered eigenvalues. In the first case the scalability and accuracy of the new MR3 solver did not quite meet expectations, whereas in the second case the new solver proved to be a promising alternative to the existing routines.  Colloquium Talk (PDF, 271 kB)

Group Photo

Guest students 2010 and some of their advisers

Persons on the photo, left to right, front to back:

1. row: Paul Gibbon, Elin Solberg, Julie Krainau, Yosef Meller, Marco Müller, Alina Istrate
2. row: Oliver Bücker, Konstantin Boyanov, Valentina Banciu, Natalie Schröder, Ulrich Kemloh, Robert Speck, Mathias Winkel
3. row: Axel Hübl, Bernhard Steffen, Andreas Breslau, Martin Licht
4. row: Markus Mayr, Brian Wylie, Thomas Neuhaus, Sandipan Mohanty, Timo Hülsmann