A new type of "cluster booster architecture" has been designed for the DEEP system.
DEEP takes the concept of compute acceleration to a new level: it combines a standard InfiniBand™ Cluster of Intel® Xeon® nodes (Cluster Nodes) with an innovative, highly scalable Booster built from Intel® Xeon Phi™ co-processors (Booster Nodes) and the EXTOLL high-performance 3D torus network. The two interconnects are coupled by Intel® Core™ Booster Interface Nodes. This combination provides maximum throughput and scalability on the Booster side, matching the requirements of highly scalable code parts, and supports proven HPC programming models such as MPI and OmpSs. Code parts with limited scalability (e.g. due to complex control flow or data dependencies) run with high efficiency on the Cluster side, and the transparent bridging of the two interconnects enables high-speed data transfer between them.
The final DEEP prototype system consists of a 128-node Eurotech Aurora Cluster and two distinct prototypes for the Booster:
- A 384-node system built by Eurotech from custom-engineered dual-node cards in the Aurora blade form factor – the DEEP Booster, with an aggregate performance of around 500 TFlop/s
- A smaller 32-node prototype built by the University of Heidelberg and Megware, based on the latest ASIC implementation of EXTOLL
The DEEP Cluster consists of 128 Intel® Xeon® compute nodes with an InfiniBand interconnect.
The DEEP Aurora Booster consists of 384 Intel® Xeon Phi™ co-processors connected by an EXTOLL 3D torus interconnect network.
The DEEP Cluster and both Booster systems were installed at the Jülich Supercomputing Centre (JSC) as part of a production HPC environment. All three systems are fully integrated, yet the two Booster systems can be operated completely independently.
To ensure a safe and reliable 24×7 production environment, the qualification and installation of critical system software layers were essential. In addition, the infrastructure for hot-water cooling was developed, put into operation and integrated with the DEEP systems at JSC. The cooling infrastructure was designed to enable year-round chiller-less cooling on a newly built cooling loop.
Experiments with active cooling using an existing cold-water supply are also possible. Electrically controlled valves allow rapid reconfiguration, and special filters ensure sufficient water quality. The maximum coolant temperature at Jülich is 40 °C.
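The switching between the chiller-less loop and the active cold-water supply can be pictured as a simple mode-selection rule keyed to the 40 °C coolant limit. The following is a minimal illustrative sketch; the function name, the hysteresis margin, and the mode labels are assumptions for illustration and do not describe the actual JSC valve-control software.

```python
# Illustrative sketch only: selects a cooling mode from the coolant
# temperature. The 5 degC switch margin and all names are assumed,
# not taken from the real control system.
MAX_COOLANT_TEMP_C = 40.0   # maximum coolant temperature at Juelich
SWITCH_MARGIN_C = 5.0       # assumed hysteresis margin (hypothetical)

def select_cooling_mode(coolant_temp_c: float) -> str:
    """Return the cooling mode for the given coolant temperature.

    Prefer the year-round chiller-less loop; fall back to the existing
    cold-water supply (via the electrically controlled valves) as the
    coolant approaches the 40 degC limit.
    """
    if coolant_temp_c > MAX_COOLANT_TEMP_C:
        return "over_limit"           # outside the allowed envelope
    if coolant_temp_c > MAX_COOLANT_TEMP_C - SWITCH_MARGIN_C:
        return "active_cold_water"    # reconfigure valves to cold loop
    return "chiller_less"             # normal, chiller-less operation
```

In this sketch the margin keeps the system from running right at the temperature limit before the valves reconfigure; the real installation presumably uses its own control policy.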
Safe unattended 24×7 operation was a top priority. The additional risk posed by direct liquid cooling was addressed by installing sensors that detect even very minor leaks. An always-on monitoring system registers any critical excursions and mitigates the impact of failures, in the worst case by cutting all power to the affected parts of the systems via web-relays in the 230 V lines.
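The mitigation path described above (leak sensor trips → power cut via web-relay) can be sketched as a small monitoring routine. All interfaces here (`WebRelay`, `check_and_mitigate`, the threshold value) are hypothetical stand-ins for illustration; the real system uses dedicated monitoring hardware acting on web-relays in the 230 V lines.

```python
# Hypothetical sketch of the always-on leak monitoring; class and
# function names, and the threshold, are illustrative assumptions.

class WebRelay:
    """Stand-in for a web-relay that can cut 230 V power to one rack."""
    def __init__(self, name: str):
        self.name = name
        self.powered = True

    def power_off(self) -> None:
        self.powered = False

def check_and_mitigate(sensor_readings: dict, relays: dict,
                       leak_threshold: float = 0.01) -> list:
    """Cut power to any part of the system whose leak sensor trips.

    sensor_readings maps a rack name to its leak-sensor value; even a
    very small reading above the threshold switches off that rack's
    relay. Returns the list of racks that were powered off.
    """
    tripped = []
    for rack, reading in sensor_readings.items():
        if reading > leak_threshold:
            relays[rack].power_off()
            tripped.append(rack)
    return tripped
```

The point of the sketch is the failure-containment shape: the monitor never tries to repair anything, it only isolates the affected parts electrically so a minor leak cannot escalate.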