Extreme Scale-Out SuperMUC Phase 2, lessons learned

Nicolay J. Hammer, Leibniz-Rechenzentrum

During May and June 2015, LRZ conducted a friendly-user operation phase for the upcoming extension of SuperMUC, called Phase 2, which consists of 86,016 Intel Haswell cores distributed across 6 islands, resulting in a peak performance of 3.2 PFlop/s. Selected user groups had the opportunity to use the system for 28 days of continuous operation as so-called "friendly users" and to run jobs scaling up to the full system size. The talk presents the results obtained during this period and the lessons learned from an operational point of view.

Talk as pdf file

Last Modified: 23.11.2022