
Institute for Advanced Simulation (IAS)


eeClust

Energy-Efficient Cluster-Computing

Synopsis

The goal of the eeClust project (Energy-Efficient Cluster Computing) is to determine relationships between the behaviour of parallel programs and the energy consumption of their execution on a compute cluster. Based on this, strategies to reduce energy consumption without impairing program performance will be developed. In principle, this goal can be achieved by putting as many hardware components as possible into their energy-saving mode during the periods when they are not used. Modern hardware and operating systems already use these mechanisms, but they rely on simple heuristics and have no knowledge of the execution behaviour of the applications currently running. As a result, they are prone to wrong decisions, such as powering down a component shortly before it is needed again.

The project will develop parallel program analysis software based on the successful Vampir (Dresden) and Scalasca (Jülich) tools. In addition to measuring and analyzing program behaviour, the enhanced tools will also record energy-related metrics.

Partners / Grants

  • University of Hamburg (coordinator)
  • Dresden University of Technology (TUD/ZIH)
  • ParTec Cluster Competence Center GmbH
  • Jülich Supercomputing Centre of Forschungszentrum Jülich GmbH

The project was funded by the German Federal Ministry of Education and Research (BMBF) under the call “HPC-Software für skalierbare Parallelrechner”.

The grant period ran from April 2009 to March 2012.

Project homepage: http://www.eeclust.de

Results

During the project, the University of Hamburg procured a small test cluster equipped with high-resolution power meters. The cluster consists of 5 nodes with Intel Nehalem processors and 5 nodes with AMD Opteron processors, so that the capabilities and effects of hardware power mode management could be studied on both dominant x86 architectures. To determine the phases of inactivity, the well-known performance analysis tools Vampir and Scalasca were used. At JSC, the Scalasca toolset was extended during the project to identify the energy-saving potential in wait states of an application.
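To illustrate what "energy-saving potential in wait states" can mean, the following is a minimal sketch of one possible estimate. It is not the model used in the Scalasca extension, which is not described here. The function name, its parameters, and the assumption that a power-state transition costs a fixed time and a fixed amount of energy are all hypothetical.

```python
def wait_state_saving(wait_s, p_idle_w, p_low_w, transition_s, transition_energy_j):
    """Estimate the energy (in joules) saved by entering a low-power state
    during one wait state. All parameters are illustrative assumptions:

    wait_s              -- duration of the wait state in seconds
    p_idle_w            -- power draw while waiting in the normal state (watts)
    p_low_w             -- power draw in the low-power state (watts)
    transition_s        -- total time spent switching down and back up (seconds)
    transition_energy_j -- extra energy consumed by the two transitions (joules)
    """
    # A wait shorter than the transition time cannot be exploited at all.
    if wait_s <= transition_s:
        return 0.0
    # Savings accrue only while the component actually sits in the low-power state.
    saved = (p_idle_w - p_low_w) * (wait_s - transition_s) - transition_energy_j
    # A negative value means switching would cost more than it saves.
    return max(saved, 0.0)
```

Under these assumptions, only wait states that are long compared with the transition overhead contribute, which is one reason why knowledge of the application's behaviour matters.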

An API was developed that lets developers manually instrument an application so that it communicates the hardware resources it requires to a system process. This daemon, also developed in the project, switches unused hardware to a lower power state and back again, and ensures that no component still needed by another process is switched to a lower power state.
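The bookkeeping such a daemon needs can be sketched as follows. This is not the eeClust API, whose actual interface is not shown here. The class `PowerStateManager`, its methods, and the device names are hypothetical, and the sketch only tracks state instead of touching real hardware.

```python
class PowerStateManager:
    """Hypothetical stand-in for the daemon: tracks which processes
    currently need which device and derives each device's power state."""

    def __init__(self, devices):
        # Map each device name to the set of process IDs that need it.
        self._users = {device: set() for device in devices}

    def require(self, pid, device):
        """An instrumented application announces that it needs a device."""
        self._users[device].add(pid)

    def release(self, pid, device):
        """An instrumented application announces it no longer needs a device."""
        self._users[device].discard(pid)

    def state(self, device):
        """A device stays active as long as at least one process needs it."""
        return "active" if self._users[device] else "low_power"


# Usage: two processes share the disk; it may only be powered down
# after both have released it.
manager = PowerStateManager(["disk", "nic"])
manager.require(101, "disk")
manager.require(102, "disk")
manager.release(101, "disk")
state_while_shared = manager.state("disk")   # still needed by process 102
manager.release(102, "disk")
state_when_unused = manager.state("disk")    # no users left
```

Tracking the set of users per device, instead of a single on/off flag, is what lets the sketch guarantee that one process releasing a component does not power it down while another process still depends on it.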

The tools and methods developed in the project are now being used and extended in other JSC projects, specifically in the Exascale Innovation Center (EIC), to efficiently manage large-scale machines.

