A new type of "cluster booster architecture" has been designed for the DEEP system.
The final DEEP prototype system will consist of a 128-node Eurotech Aurora Cluster and a 512-node Booster.
DEEP takes the concept of compute acceleration to a new level: it combines a standard InfiniBand(TM) Cluster built from Intel® Xeon® nodes (Cluster Nodes) with an innovative, highly scalable Booster built from Intel® Xeon Phi(TM) co-processors (Booster Nodes) and the EXTOLL high-performance 3D torus network. The two interconnects are coupled by Intel® Core(TM) Booster Interface Nodes. This combination provides maximum throughput and scalability on the Booster side, matching the requirements of highly scalable code parts, and supports proven HPC programming models such as MPI and OmpSs. Code parts with limited scalability (e.g. because of complex control flow or data dependencies) run with high efficiency on the Cluster side, and the transparent bridging of the two interconnects enables high-speed data transfer between the two sides.
The EXTOLL 3D torus network, developed by the University of Heidelberg, matches the communication patterns of important HPC kernels, facilitates system scale-up and delivers leading latency and bandwidth between Booster Nodes. The Intel® Xeon Phi(TM) co-processor is optimized for high performance on vectorizable codes and delivers outstanding energy efficiency. To further increase efficiency and density, DEEP uses advanced liquid cooling technology as introduced in the Eurotech Aurora line of clusters, with custom-designed cold plates for the Booster Nodes and quick-disconnect couplings that allow Booster blades to be swapped in a running system.
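As a rough illustration of why a 3D torus suits nearest-neighbour communication patterns, the following Python sketch computes the six direct neighbours of a node and the minimal hop count between two nodes. The 8 x 8 x 8 layout is an assumption chosen only because it yields 512 nodes; the actual torus dimensions are not stated here. The function names are hypothetical and unrelated to any EXTOLL software.

```python
from typing import List, Tuple

Coord = Tuple[int, int, int]

# Assumed layout: 8 x 8 x 8 = 512 nodes (illustrative, not a documented value).
DIMS: Coord = (8, 8, 8)


def torus_neighbors(node: Coord, dims: Coord = DIMS) -> List[Coord]:
    """Return the direct neighbours of `node`, one step along -/+ of each axis.

    Coordinates wrap around at the edges, which is what makes the
    topology a torus rather than a plain 3D mesh.
    """
    neighbors = []
    for axis in range(3):
        for step in (-1, 1):
            coord = list(node)
            coord[axis] = (coord[axis] + step) % dims[axis]
            neighbors.append(tuple(coord))
    return neighbors


def torus_hops(a: Coord, b: Coord, dims: Coord = DIMS) -> int:
    """Minimal number of hops between `a` and `b`, using wrap-around links."""
    return sum(
        min(abs(x - y), d - abs(x - y))
        for x, y, d in zip(a, b, dims)
    )


if __name__ == "__main__":
    print(torus_neighbors((0, 0, 0)))
    # Opposite corners are close thanks to the wrap-around links.
    print(torus_hops((0, 0, 0), (7, 7, 7)))  # prints 3
    # The farthest node from the origin in an 8 x 8 x 8 torus.
    print(torus_hops((0, 0, 0), (4, 4, 4)))  # prints 12
```

Under this assumed layout, every node has six equidistant neighbours and no pair of nodes is more than 12 hops apart, which hints at why stencil-like kernels with nearest-neighbour exchange map well onto such a network.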
Prototypes of all system components are now available, and their integration into small, self-contained systems for system and application software development is making good progress. The Booster Nodes developed by Eurotech contain two Altera Stratix V FPGAs, a very capable board management controller, various sensors for fine-grained power and temperature measurements, and Gigabit Ethernet and SMBus endpoints for maintenance and debugging. They connect via PCI Express to two Intel® Xeon Phi(TM) cards. Cooling will be provided by a custom-engineered cold plate, with hot-pluggable connectors to the backplane. The Booster Interface Node uses a COM Express® Module with an Intel® Core(TM) i7 CPU, a Mellanox ConnectX®-3 host adapter and an Altera Stratix V FPGA connected by a PCI Express bridge, and it provides SATA connectors for SSD storage. On both kinds of nodes, the EXTOLL NIC is implemented on the FPGAs.
On the network front, the EXTOLL ASIC implementation has made impressive progress, with first samples available in the SC13 timeframe. After testing and validation, DEEP will proceed to design the final hardware components, replacing the FPGAs with the EXTOLL ASICs and introducing compact form factor Intel® Xeon Phi(TM) boards that provide a full suite of sensors and actuators for advanced energy monitoring and management. From these components, the final DEEP Booster prototype will be assembled and made available for application development and optimization. The target size for this prototype is 512 Intel® Xeon Phi(TM) processors, with an expected performance of close to 500 Teraflop/s.
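The target figures above imply a per-processor performance that can be checked with simple arithmetic. The sketch below only divides the two numbers quoted in the text; it makes no claim about the actual peak of any specific Intel® Xeon Phi(TM) model, and the function name is hypothetical.

```python
# Figures quoted in the text for the final Booster prototype.
BOOSTER_PROCESSORS = 512
TARGET_TFLOPS = 500.0  # "close to 500 Teraflop/s", so treat as approximate


def implied_tflops_per_processor(total_tflops: float, processors: int) -> float:
    """Average performance each processor must contribute to reach the total."""
    if processors <= 0:
        raise ValueError("processors must be positive")
    return total_tflops / processors


if __name__ == "__main__":
    per_proc = implied_tflops_per_processor(TARGET_TFLOPS, BOOSTER_PROCESSORS)
    print(f"{per_proc:.3f} Teraflop/s per processor")  # prints 0.977 Teraflop/s per processor
```

In other words, the stated target corresponds to just under 1 Teraflop/s per processor on average.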