JUPITER Technical Overview
A Deep Dive Into JUPITER's Building Blocks
JUPITER, the “JU Pioneer for Innovative and Transformative Exascale Research”, is the first exascale supercomputer in Europe. The system is provided by the ParTec-Eviden supercomputer consortium and was procured by EuroHPC JU in cooperation with the Jülich Supercomputing Centre (JSC). It is installed on the Forschungszentrum Jülich campus in Germany.
The purpose of this article is to give concise background information and technical details of the chosen architectures of the different components. Until all components of the system are accepted, all information is subject to change and might be modified at any point in time.
Note: Since JUPITER is moving to production, this article will no longer be updated. For further details regarding the machine, please visit the system documentation.
This article was last updated on 26 June 2026.
JUPITER at a Glance
JUPITER is the first European supercomputer of the Exascale class and joins a fleet of other EuroHPC supercomputers, including multiple Petascale and Pre-Exascale systems throughout Europe. The total budget of 500 million Euro includes acquisition and operational costs. Installation of first components started in 2024 and is expected to conclude in 2026.
Following the dynamic Modular System Architecture (dMSA) implemented by JSC and other European partners in the course of the DEEP research projects, the JUPITER system consists of two compute modules, a Booster and a Cluster Module. The Booster Module delivers 1 ExaFLOP/s FP64 performance, measured through the HPL benchmark. It implements a highly scalable system architecture based on the Grace-Hopper superchip by NVIDIA. The general-purpose Cluster Module targets workflows that do not benefit from accelerator-based computing. It utilizes the first European HPC processor, Rhea1 by SiPearl, to provide a uniquely high memory bandwidth supporting mixed workloads, alongside more conventional AMD CPUs. Both modules are deployed independently.
The storage hierarchy comprises three tiers, ExaFLASH, ExaSTORE, and ExaTAPE, which are described later in this article.
All compute nodes of JUPITER as well as storage and service systems are connected to a large NVIDIA Mellanox InfiniBand NDR fabric implementing a DragonFly+ topology.
The system configuration is the result of a public procurement that took prior JSC systems like JUWELS, the Jülich Wizard on European Leadership Science, as a blueprint. The previous userbase played a key role when defining requirements, utilizing a large set of benchmarks and applications for the assessment of the offers.
JUPITER Booster
The Booster module of JUPITER (short: the Booster) features 5884 compute nodes to achieve the compute performance of 1 ExaFLOP/s (FP64, HPL) – and much more in lower precision computing (for example more than 70 ExaFLOP/s of theoretical 8-bit compute performance with sparsity). The driving chip of the system is the NVIDIA Hopper GPU. The GPUs are deployed in the Grace-Hopper superchip form factor (GH200), a tight combination between NVIDIA’s first CPU (Grace) and HPC GPU (Hopper).
Each Booster node features four GH200 superchips, i.e. four GPUs, each closely attached to a partner CPU (via NVLink Chip-to-Chip). With 72 cores per Grace CPU, a node has a total of 288 CPU cores (Arm). In a node, all GPUs are connected via NVLink 4, and all CPUs are connected via CPU NVLink connections.
The Hopper GPU variant installed in the system offers 96 GB of HBM3 memory, accessible with 4 TB/s bandwidth from the multiprocessors of the GPU. Compared to previous NVIDIA GPU generations, the Hopper GPU offers more multiprocessors, larger caches, new core architectures, and further advancements – the documentation released by NVIDIA provides an overview. Using NVLink 4, one GPU can transmit data to any other GPU in a node with 150 GB/s per direction.
Each GPU is attached to a Grace CPU, NVIDIA’s first HPC CPU, utilizing the Arm instruction set. The Grace CPU has 72 Neoverse V2 cores, each SVE2-enabled with four 128-bit functional units. The CPU can access 120 GB of LPDDR5X memory with a bandwidth of 500 GB/s. The key feature of the superchip design is the tight integration between CPU and GPU, offering not only a high bandwidth (450 GB/s per direction) but also more homogeneous programming. Again, details about Grace and the properties that make the combination a superchip can be found in the NVIDIA documentation.
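The per-node figures above can be tallied into system-wide totals. The following Python sketch is an illustration only: it uses the node count and per-superchip values quoted in this section, the variable names are chosen here for readability, and the totals are simple products in decimal units, not officially reported figures.

```python
# Values quoted in the Booster description above.
BOOSTER_NODES = 5884
SUPERCHIPS_PER_NODE = 4       # one Hopper GPU plus one Grace CPU each
CORES_PER_GRACE = 72
HBM3_GB_PER_GPU = 96
LPDDR5X_GB_PER_CPU = 120

# Per-node totals.
cores_per_node = SUPERCHIPS_PER_NODE * CORES_PER_GRACE
hbm_gb_per_node = SUPERCHIPS_PER_NODE * HBM3_GB_PER_GPU
lpddr_gb_per_node = SUPERCHIPS_PER_NODE * LPDDR5X_GB_PER_CPU

# System-wide totals (simple products; 1 PB = 1e6 GB, decimal units).
total_gpus = BOOSTER_NODES * SUPERCHIPS_PER_NODE
total_cores = BOOSTER_NODES * cores_per_node
total_hbm_pb = total_gpus * HBM3_GB_PER_GPU / 1e6

print(f"per node: {cores_per_node} cores, {hbm_gb_per_node} GB HBM3, "
      f"{lpddr_gb_per_node} GB LPDDR5X")
print(f"system:   {total_gpus} GPUs, {total_cores} cores, "
      f"{total_hbm_pb:.2f} PB HBM3")
```

The 288 cores per node match the figure stated in the text; the remaining numbers follow from the same multiplication.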
A CPU is connected to the three neighboring CPUs in a node via dedicated CPU NVLink (cNVLink) connections, offering 100 GB/s bi-directional bandwidth. A further PCIe Gen 5 connection exists per CPU towards its associated InfiniBand adapter (HCA). Four latest-generation InfiniBand NDR HCAs are available in a node, each with 200 Gbit/s bandwidth.
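To compare the data paths of a node, it helps to put all quoted bandwidths into one unit. The sketch below converts the InfiniBand figure from Gbit/s to GB/s and estimates idealized transfer times for a 96 GB buffer (the size of one GPU's HBM3). It assumes peak bandwidth with no latency or protocol overhead, so real transfers will be slower; the function and dictionary names are hypothetical. The cNVLink figure is omitted because it is quoted as bi-directional rather than per direction.

```python
GBIT_PER_GBYTE = 8  # 8 bits per byte

# Peak bandwidths quoted above, in GB/s.
links_gb_per_s = {
    "HBM3 (GPU memory)": 4000.0,                  # 4 TB/s
    "LPDDR5X (CPU memory)": 500.0,
    "NVLink Chip-to-Chip": 450.0,                 # per direction
    "NVLink 4 (GPU to GPU)": 150.0,               # per direction
    "InfiniBand NDR HCA": 200 / GBIT_PER_GBYTE,   # 200 Gbit/s
}

HCAS_PER_NODE = 4
node_injection_gb_per_s = HCAS_PER_NODE * links_gb_per_s["InfiniBand NDR HCA"]


def transfer_seconds(size_gb: float, bandwidth_gb_per_s: float) -> float:
    """Idealized transfer time: size divided by peak bandwidth."""
    return size_gb / bandwidth_gb_per_s


for name, bw in links_gb_per_s.items():
    seconds = transfer_seconds(96, bw)
    print(f"{name:24s} {bw:7.1f} GB/s   96 GB in {seconds:6.3f} s")
print(f"node injection bandwidth: {node_injection_gb_per_s:.0f} GB/s")
```

Under these assumptions, a single HCA provides 25 GB/s and a node's four HCAs together provide 100 GB/s towards the fabric.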
The system is warm-water-cooled, using the BullSequana XH3000 blade and rack design.
JUPITER Cluster
The Cluster module (the Cluster) integrates two types of CPUs: SiPearl Rhea1 and AMD Turin.
The Rhea1 processor, a processor designed in Europe through the EPI projects and commercialized by SiPearl, will be used on parts of the JUPITER Cluster. Rhea – like Grace – utilizes the Arm instruction set architecture (ISA), with the unique feature of providing an extraordinarily high memory bandwidth by using 64 GB of HBM2e memory. Two Rhea1 processors form a node, each containing 80 Arm Neoverse Zeus cores and providing SVE for enhanced performance. Beyond the 2×64 GB of HBM memory, each node provides an additional 512 GB of DDR5 main memory. In total, JUPITER Cluster features 162 Rhea1 nodes. They will be installed in the near future.
In addition to the Rhea1 nodes, the Cluster is equipped with 582 dual-socket AMD Turin 9655 nodes, each Turin CPU featuring 96 cores (SMT-2). Turin standard nodes feature 768 GB of DDR5 memory; 32 nodes are equipped with 1536 GB of memory. This part of the Cluster is currently being installed.
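The Cluster figures above can be summed in the same way. This sketch rests on one labeled assumption: the 32 large-memory Turin nodes are counted within the 582 Turin nodes rather than in addition to them, which the text does not state explicitly. All names are illustrative, and the results are plain products of the quoted values.

```python
# Rhea1 partition (values quoted above).
RHEA_NODES, RHEA_SOCKETS, RHEA_CORES_PER_SOCKET = 162, 2, 80
RHEA_HBM_GB_PER_SOCKET = 64
RHEA_DDR5_GB_PER_NODE = 512

# Turin partition (values quoted above).
TURIN_NODES, TURIN_SOCKETS, TURIN_CORES_PER_SOCKET = 582, 2, 96
SMT_THREADS_PER_CORE = 2
TURIN_LARGE_NODES = 32        # ASSUMPTION: included in the 582 nodes
STANDARD_GB, LARGE_GB = 768, 1536

rhea_cores = RHEA_NODES * RHEA_SOCKETS * RHEA_CORES_PER_SOCKET
rhea_hbm_gb = RHEA_NODES * RHEA_SOCKETS * RHEA_HBM_GB_PER_SOCKET
rhea_ddr5_gb = RHEA_NODES * RHEA_DDR5_GB_PER_NODE

turin_cores = TURIN_NODES * TURIN_SOCKETS * TURIN_CORES_PER_SOCKET
turin_threads = turin_cores * SMT_THREADS_PER_CORE
turin_mem_gb = ((TURIN_NODES - TURIN_LARGE_NODES) * STANDARD_GB
                + TURIN_LARGE_NODES * LARGE_GB)

print(f"Rhea1: {rhea_cores} cores, {rhea_hbm_gb} GB HBM, {rhea_ddr5_gb} GB DDR5")
print(f"Turin: {turin_cores} cores ({turin_threads} hardware threads), "
      f"{turin_mem_gb} GB DDR5")
```

If the 32 large-memory nodes were instead additional to the 582, the Turin totals would be correspondingly higher.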
The nodes are again based on the BullSequana XH3000 architecture and will be integrated into the global NVIDIA Mellanox InfiniBand interconnect with one NDR200 link per node.
JUPITER High-Speed Interconnect
At the core of the system, the InfiniBand NDR network connects 25 DragonFly+ groups in the Booster module, as well as 2 extra groups in total for the Cluster module, storage, and administrative infrastructure. The groups are fully connected: more than 11000 global links, each running at 400 Gb/s, connect every group with every other group.
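A fully connected set of groups implies one bundle of global links per pair of groups. The sketch below counts those pairs and derives an average bundle size. It assumes that the global links are spread evenly over all group pairs, which the text does not state, and it uses 11000 as a lower bound for the "more than 11000" links quoted above.

```python
BOOSTER_GROUPS = 25
EXTRA_GROUPS = 2             # Cluster, storage, administrative infrastructure
GLOBAL_LINKS_MIN = 11000     # lower bound: "more than 11000" in the text

groups = BOOSTER_GROUPS + EXTRA_GROUPS

# In an all-to-all arrangement every unordered pair of groups is connected.
group_pairs = groups * (groups - 1) // 2

# ASSUMPTION: links are distributed evenly across all group pairs.
avg_links_per_pair = GLOBAL_LINKS_MIN / group_pairs

print(f"{groups} groups, {group_pairs} group pairs")
print(f"at least {avg_links_per_pair:.1f} global links per pair on average")
```

With 27 groups there are 351 pairs, so each pair would be joined by roughly 31 links of 400 Gb/s under the even-distribution assumption.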
Inside each group, connectivity is maximized, with a full fat-tree topology. In it, leaf and spine switches use dense 400 Gb/s links; leaf switches rely on split ports to connect to 4 HCAs per node on the Booster module (1 HCA per node on the Cluster module), each with 200 Gb/s.
In total, the network comprises about 50000 links and 102000 logical ports, with 25000 end points and 867 high-radix switches, and still has spare ports for future expansions, for example for further computing modules.
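The end point count can be cross-checked against the node counts given earlier in this article. The sketch below adds up the HCAs of the compute nodes and treats the remainder as storage, service, and login end points. That split is an assumption made here for illustration: the text gives only the rounded total of about 25000.

```python
# Node counts and HCAs per node as quoted in this article.
BOOSTER_NODES, BOOSTER_HCAS_PER_NODE = 5884, 4
RHEA_NODES, TURIN_NODES, CLUSTER_HCAS_PER_NODE = 162, 582, 1

TOTAL_ENDPOINTS_APPROX = 25000   # "about 25000" in the text
LINKS_APPROX = 50000
LOGICAL_PORTS_APPROX = 102000

booster_endpoints = BOOSTER_NODES * BOOSTER_HCAS_PER_NODE
cluster_endpoints = (RHEA_NODES + TURIN_NODES) * CLUSTER_HCAS_PER_NODE
compute_endpoints = booster_endpoints + cluster_endpoints

# ASSUMPTION: everything that is not a compute end point belongs to
# storage, service, or login systems.
other_endpoints_approx = TOTAL_ENDPOINTS_APPROX - compute_endpoints

# Each link joins two ports, so ports per link should be close to 2.
ports_per_link = LOGICAL_PORTS_APPROX / LINKS_APPROX

print(f"compute end points: {compute_endpoints}")
print(f"remaining (approx.): {other_endpoints_approx}")
print(f"logical ports per link: {ports_per_link:.2f}")
```

The compute nodes account for most of the quoted end points, and the ratio of about two logical ports per link is consistent with each link connecting two ports.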
The network has been designed with HPC and AI use-cases in mind. Its adaptive routing and advanced in-network computing capabilities enable a very well-balanced, scalable, and cost-effective fabric for ground-breaking science.
ExaFLASH and ExaSTORE
ExaFLASH is a 29 Petabyte flash module based on the IBM Storage Scale software and a corresponding storage appliance built from 20 IBM SSS 6000 building blocks. It provides a usable capacity of around 20 Petabyte and targets more than 2 TB/s write and 3 TB/s read performance.
A high-capacity spinning-disk storage module (ExaSTORE) provides a raw capacity of 308 Petabyte and is based on 22 IBM SSS 6000 building blocks.
An additional 379 Petabyte backup and archive solution (ExaTAPE), based on IBM Storage Protect and two IBM TS 4500 tape libraries, completes the storage hierarchy. Both ExaSTORE and ExaTAPE can be upgraded during the system's operational lifetime, depending on actual demand.
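The storage figures above allow a few back-of-the-envelope estimates. The sketch below derives the capacity per building block and an idealized time to write the combined GPU memory of the Booster to ExaFLASH. Two things are assumptions made here: that the 29 Petabyte figure for ExaFLASH is raw capacity, and that the hypothetical full-memory checkpoint runs at exactly the 2 TB/s write target with no other overhead.

```python
# Capacities in petabytes (PB) and building-block counts quoted above.
EXAFLASH_RAW_PB = 29          # ASSUMPTION: interpreted as raw capacity
EXAFLASH_USABLE_PB = 20
EXAFLASH_BLOCKS = 20
EXASTORE_RAW_PB = 308
EXASTORE_BLOCKS = 22
EXAFLASH_WRITE_TB_PER_S = 2.0  # target is "more than 2 TB/s"

usable_fraction = EXAFLASH_USABLE_PB / EXAFLASH_RAW_PB
flash_raw_pb_per_block = EXAFLASH_RAW_PB / EXAFLASH_BLOCKS
store_raw_pb_per_block = EXASTORE_RAW_PB / EXASTORE_BLOCKS

# Hypothetical checkpoint of all Booster GPU memory:
# 5884 nodes x 4 GPUs x 96 GB, converted to TB (decimal units).
checkpoint_tb = 5884 * 4 * 96 / 1000
checkpoint_seconds = checkpoint_tb / EXAFLASH_WRITE_TB_PER_S

print(f"ExaFLASH usable fraction: {usable_fraction:.0%}")
print(f"raw PB per block: ExaFLASH {flash_raw_pb_per_block:.2f}, "
      f"ExaSTORE {store_raw_pb_per_block:.1f}")
print(f"full HBM checkpoint: {checkpoint_tb:.0f} TB in about "
      f"{checkpoint_seconds / 60:.0f} minutes")
```

Because the stated write performance is a lower target, the checkpoint time computed here is an idealized estimate, not a measured value.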
Service Partition and System Management
JUPITER is installed and operated with the unique JUPITER Management Stack. This is a combination of xScale (Atos/Eviden), ParaStation Modulo (ParTec), and software components from JSC (xOPS).
Slurm is used for workload and resource management, extended with ParaStation components. The backbone of the background management stack is a Kubernetes environment, relying on highly available Ceph storage. The management stack is used to install and manage all hardware and software components of the system.
More than 20 login nodes provide SSH access to the different modules of the system. In addition, the system is integrated into the Jupyter environment at JSC and made available via UNICORE.

