IBM Cluster for GPFS Fileserver
The configuration of the Juelich Storage Cluster (JUST) evolves more or less continuously, moving to newly available storage technology and expanding to fulfill the capacity and I/O bandwidth demands of the supercomputer applications. Currently, the 4th generation of JUST consists of 33 GPFS Storage Server (GSS) systems with a gross capacity of 20 PB, including the TSM backup environment.
For details see the following table:
| | TSM Backup | GSS | Total |
|---|---|---|---|
| Capacity | 450 TB gross, 320 TB net | 19.8 PB gross, 15.9 PB net | 20.3 PB gross, 16.2 PB net |
| Server | 8 + 2 Mngt | 66 + 8 Mngt | 84 |
| Disk Enclosures / Drawers | 28 | 680 | 708 |
| Disks (*) | 336 | 7888 + 66 SSD | 8290 |
(*) Across the complete system, the disk failure rate is 2-3 disks per week.
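As a quick sanity check, the Total column in the hardware table above is simply the row-wise sum of the two subsystems (assuming 1 PB = 1000 TB for the capacity rows, with the gross total rounded from 20.25 PB to 20.3 PB):

```python
# Sanity check of the "Total" column; all numbers are taken from the table above.
tsm = {"gross_tb": 450, "net_tb": 320, "servers": 8 + 2, "enclosures": 28, "disks": 336}
gss = {"gross_tb": 19800, "net_tb": 15900, "servers": 66 + 8, "enclosures": 680, "disks": 7888 + 66}

total_gross_pb = (tsm["gross_tb"] + gss["gross_tb"]) / 1000  # 20.25 PB (table rounds to 20.3)
total_net_pb = (tsm["net_tb"] + gss["net_tb"]) / 1000        # 16.22 PB (table rounds to 16.2)
total_servers = tsm["servers"] + gss["servers"]              # 84
total_enclosures = tsm["enclosures"] + gss["enclosures"]     # 708
total_disks = tsm["disks"] + gss["disks"]                    # 8290

print(total_gross_pb, total_net_pb, total_servers, total_enclosures, total_disks)
```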
JUST Hardware Characteristics
JUST History and Roadmap
In 2007, JUST started with classical storage building blocks: IBM Power5 servers running AIX and storage controllers (IBM DS4800, DS4700, and DCS9550) with FC and SATA disks, providing 1 PB gross capacity at a bandwidth of 6-7 GB/s.
The next milestones came in 2009: in March the servers were replaced by Power6 systems, and in December the storage was migrated to a new generation of storage controllers and disks (IBM DS5300). The capacity grew to 5 PB gross and the bandwidth to about 33 GB/s.
In 2012, additional IBM x-Series servers running Linux and IBM DS3512 and DCS3700 storage controllers with SAS and NL-SAS disks were installed, and all data apart from the fast scratch file system were migrated to the new technology. The freed Power6 servers and storage were added to the scratch file system, raising the bandwidth to 66 GB/s and the overall capacity to 10 PB.
In January 2013, the installation and testing of about 9 PB gross of GSS-24 systems running the limited-availability GSS 1.0 software began. In mid-September 2013, a new generally available fast scratch file system ($WORK) was introduced. At the same time, a new special file system ($DATA), dedicated to large projects with big data in collaboration with JSC, was created; its disk space quota is granted only on application to JSC management. The overall JUST storage capacity was 13 PB, and a bandwidth of 160 GB/s could be achieved.
In June 2014, an additional 2.8 PB (gross) of GSS storage was installed and used to migrate the classical $HOME file systems to GNR-based file systems. The JUST storage capacity grew to about 16 PB (gross).
In December 2014, it was decided to convert the remaining classical storage components to GSS-24 systems by reusing the storage infrastructure in combination with new x-Series servers. This was done step by step and finished in March 2015. At the end, the freed storage was added to $WORK and $DATA, which increased the bandwidth to about 200 GB/s. At that time, JUST consisted of 31 GPFS Storage Server (GSS) systems with a capacity of 16 PB gross.
In June 2015, a global I/O reconfiguration took place to support the new HPC system JURECA. On all storage servers, the two 30 GB Ethernet channels were split into three 20 GB Ethernet channels distributed over three I/O switches. This also required recabling and thorough verification of the redundancy of the layout at all levels. In mid-2015, an additional 4 PB (gross) was installed in the form of two capacity-oriented GSS-26 storage servers. They were partially used to migrate the HPC archive file systems ($ARCH). The storage freed thereby was added to the fast scratch file system $WORK and the project-dedicated file system $DATA, which increased their capacity by 25% and the I/O bandwidth to 220 GB/s. The overall capacity is 20 PB gross.