The KAUST Supercomputing Lab (KSL) is home to several high-performance computing (HPC) systems, among them two supercomputers: Shaheen and Ibex. A supercomputer is an array of computers acting as one collective machine. It is used to solve very complex problems in areas such as weather forecasting, oil and gas exploration, molecular modeling, physical simulation, and aerodynamics.
More than 65% of all KAUST faculty use KSL resources.
KSL manages a number of HPC systems that process large amounts of data. Moreover, due to advances in computational power and methodologies, the amount of data being processed has grown exponentially over the past decade. One of KSL’s greatest challenges has been to store and manage all of this information efficiently and securely. Indeed, KAUST’s research storage infrastructure has been nearing capacity (around 85% full), and system administrators have been planning its expansion to keep up with growing user demand.
Accordingly, a storage infrastructure upgrade project worth $5M was commissioned. The main goal was to more than double the existing 17 petabytes (PB) of storage by adding a further 20 PB, and to connect the new storage to Shaheen and Ibex over the existing InfiniBand network. The second goal was to upgrade the backend backup infrastructure so that it could handle the increased amount of storage efficiently.
KSL staff has been working with leading HPC storage vendors to design an appropriate solution. As a result of technical optimization and competitive procurement, they managed to acquire a new parallel storage system from Hewlett Packard Enterprise (HPE) with a capacity of 37 PB. This brings the total to 54 PB and, as of December 2021, significantly boosts the storage space available for research data on Shaheen and Ibex.
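For readers who want to check the capacity figures, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the numbers are the ones quoted above, while the constant and function names are made up for this example and do not come from KSL.

```python
# Capacity figures quoted in the article, in petabytes (PB).
# All names below are illustrative, not KSL terminology.
EXISTING_PB = 17          # storage in place before the upgrade
PLANNED_ADDITION_PB = 20  # minimum addition set as the project goal
ACQUIRED_PB = 37          # capacity of the new HPE system


def expansion_factor(existing_pb: float, added_pb: float) -> float:
    """Return how many times larger the total is than the existing capacity."""
    return (existing_pb + added_pb) / existing_pb


planned_total = EXISTING_PB + PLANNED_ADDITION_PB
actual_total = EXISTING_PB + ACQUIRED_PB

print(f"Planned total: {planned_total} PB "
      f"({expansion_factor(EXISTING_PB, PLANNED_ADDITION_PB):.2f}x)")
print(f"Actual total:  {actual_total} PB "
      f"({expansion_factor(EXISTING_PB, ACQUIRED_PB):.2f}x)")
```

The planned addition would already have more than doubled the original capacity. The system that was actually acquired roughly triples it.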
KSL backup infrastructure has also been restructured, and its hardware upgraded. Specifically, the Spectra Logic TFinity tape library has been equipped with the latest generation of tape drives, and more tapes were added for a total storage capacity close to 100 PB. Of course, with that amount of data, a complex and parallel software solution is required to take care of migration and day-to-day operations. KSL staff has chosen Data Management Framework (DMF) version 7, also from HPE, to manage the flow of data between disks and tapes. This addition significantly improves data stability and backup efficiency.
An added benefit of the new storage is more streamlined workflows between Shaheen and Ibex, since the data is presented to both systems as a single parallel Lustre filesystem.
“This new capacity will meet our researchers’ medium-term storage needs and become a bridge for future HPC systems deployed at KAUST,” said Dr. Maciej Olchowik, lead of the KSL systems administrator team.
To learn more about the KSL, please see https://corelabs.kaust.edu.sa/labs/detail/supercomputing-core-lab