
SC2017 - Driving the Future of Supercomputing


The KAUST Supercomputing Core Lab's impact on the international conversation on supercomputing at SC2017

By Marian O'Neill

 

This year's SC conference, SC2017, was held in Denver, Colorado, from November 12th to 17th. 2017 marks the conference's thirtieth anniversary; in the world of supercomputing, that is almost a lifetime.

 

As part of its mandate, the SC conference has devised a series of standards by which to measure the capacity of any given supercomputer. Measured by these metrics, the KAUST Supercomputing Core Lab's Shaheen II, a Cray XC40 system, debuted as the seventh fastest computer in the world in June 2015. As of November 13th, 2017, it is ranked twentieth, yet it remains the number one supercomputer in the Middle East. Maintaining this standing over several ranking cycles is a noteworthy feat.

 

This year, computational scientist Georgios Markomanolis of the KAUST Supercomputing Core Lab participated in the development of a new performance benchmark, in collaboration with fellow benchmark committee members John Bent (Cray), Julian Kunkel (German Climate Computing Centre) and Jay Lofstead (Sandia National Laboratories). The benchmark, titled IO-500, produces a performance rating based on a system's storage capabilities. Tested against this methodology, the KAUST Shaheen Cray XC40 was ranked 2nd and 3rd for its DataWarp and Lustre file systems respectively. As IO-500 is a new benchmarking system, submissions are currently being sought from other supercomputing centers in order to compare storage technologies.
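Broadly, a storage rating of this kind combines bulk-bandwidth phases (IOR-style transfers) and metadata phases (mdtest/find-style operations) into a single figure using geometric means. The sketch below illustrates that style of aggregation with made-up numbers; the phase names and values are purely illustrative and are not actual Shaheen II or IO-500 results.

```python
import math

def geometric_mean(values):
    """Geometric mean of a list of positive measurements."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Illustrative (made-up) phase results, not actual benchmark submissions.
bandwidth_gib_s = {            # bulk-transfer phases, in GiB/s
    "ior-easy-write": 80.0,
    "ior-easy-read": 95.0,
    "ior-hard-write": 1.5,
    "ior-hard-read": 4.2,
}
metadata_kiops = {             # metadata phases, in kIOP/s
    "mdtest-easy-write": 120.0,
    "mdtest-easy-stat": 300.0,
    "mdtest-hard-write": 15.0,
    "find": 500.0,
}

bw_score = geometric_mean(bandwidth_gib_s.values())   # bandwidth score
md_score = geometric_mean(metadata_kiops.values())    # metadata score
total = math.sqrt(bw_score * md_score)                # combined score

print(f"bandwidth {bw_score:.2f} GiB/s, metadata {md_score:.2f} kIOP/s, "
      f"overall score {total:.2f}")
```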

 

Markomanolis further explored IO-500 in two presentations. At the IBM Spectrum Scale User Group he delivered an introduction to the IO-500 benchmark, explaining its importance for the procurement of storage. In the Birds of a Feather (BoF) sessions he spoke about his experience with IO-500 and the results achieved with it.

 

Along with Deborah Bard of Berkeley Lab, Markomanolis also organized a tutorial on the Burst Buffer (an intermediate, high-speed layer of storage) entitled Getting Started with the Burst Buffer: Using DataWarp Technology. The tutorial introduced this new technology and showed how it optimizes the performance of typical applications.
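On DataWarp-enabled systems, a batch job typically requests a burst-buffer allocation through the workload manager and then discovers its mount point from an environment variable such as DW_JOB_STRIPED. The sketch below, which is not taken from the tutorial and uses assumed paths and fallbacks, shows the common pattern of writing intermediate files to the fast layer and staging only the final result back to the parallel file system.

```python
import os
import shutil

# The workload manager usually exports the burst-buffer mount point
# (e.g. DW_JOB_STRIPED); fall back to a local directory so the sketch
# runs anywhere. Paths and fallbacks here are assumptions.
burst_buffer = os.environ.get("DW_JOB_STRIPED", "/tmp/burst_buffer_demo")
parallel_fs = os.environ.get("SCRATCH", "/tmp/lustre_demo")  # stand-in for Lustre

os.makedirs(burst_buffer, exist_ok=True)
os.makedirs(parallel_fs, exist_ok=True)

# Write intermediate (checkpoint-style) data to the fast, intermediate layer...
checkpoint = os.path.join(burst_buffer, "checkpoint_000.dat")
with open(checkpoint, "wb") as f:
    f.write(b"\0" * 1024 * 1024)  # placeholder payload

# ...and stage only the final result out to the parallel file system.
shutil.copy(checkpoint, os.path.join(parallel_fs, "final_result.dat"))
print(f"staged {checkpoint} -> {parallel_fs}")
```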

 

Continuing this training program, KAUST is currently working with the National Energy Research Scientific Computing Center (NERSC) to build a user forum on Burst Buffer and NVMe technologies. Beyond training, the forum is intended to share best practices among HPC centers.

 

As supercomputers grow in complexity, so too must the procedures in place to maintain them. During a BoF session, KAUST Supercomputing Core Lab computational scientist Bilel Hadri and collaborators Guilherme Peretti-Pezzi (Swiss National Supercomputing Centre) and Reuben Budiardja (Oak Ridge National Laboratory) sat on a panel that explored best practices, bringing different strategies to bear on system performance assessments.

 

This BoF concentrated on regression testing: the different strategies involved, lessons learned, and how these can be shared with the HPC community. The feedback was excellent; around 70 participants attended and engaged the speakers with questions, many of them eager to apply new tests and strategies on their own HPC systems.
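As a simple illustration of the kind of check discussed in the BoF (not any specific center's framework), the sketch below runs a small test command and flags a regression when its runtime drifts beyond a tolerance from a stored baseline. The test names, commands, baselines and tolerance are all hypothetical.

```python
import subprocess
import time

# Hypothetical baseline runtimes (seconds) for a few sanity-check benchmarks,
# e.g. recorded after an acceptance run; names and numbers are illustrative.
BASELINES = {"stream-like": 2.0, "ping-pong-like": 1.5}
TOLERANCE = 0.10  # flag anything more than 10% slower than its baseline

def run_case(command):
    """Run one test command and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(command, shell=True, check=True)
    return time.perf_counter() - start

def check(name, command):
    """Compare the measured runtime against the stored baseline."""
    elapsed = run_case(command)
    baseline = BASELINES[name]
    regressed = elapsed > baseline * (1 + TOLERANCE)
    print(f"{name}: {elapsed:.2f}s vs baseline {baseline:.2f}s -> "
          f"{'REGRESSION' if regressed else 'ok'}")
    return not regressed

if __name__ == "__main__":
    # Placeholder commands; on a real system these would launch actual benchmarks.
    results = [check("stream-like", "sleep 1"), check("ping-pong-like", "sleep 1")]
    raise SystemExit(0 if all(results) else 1)
```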

 

As further proof of Shaheen II's standing, researchers using the KAUST Shaheen Cray XC40 won the best paper award at the conference. A collaboration between Ludwig-Maximilians-Universität München (LMU) and the Technical University of Munich (TUM) presented the prize-winning paper on a scalability study and high-resolution simulation of the 2004 Sumatra-Andaman earthquake.

 

As with all research centers, supercomputing centers face the challenge of managing budget restrictions; computational cycles often take precedence over infrastructure costs such as electricity. The KAUST Supercomputing Core Lab, along with eleven other computing centers, shared its experience of power-capping strategies, such as dynamic power scheduling with SLURM, examining their benefits and limitations.
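As a purely conceptual sketch of dynamic power capping (not the SLURM implementation itself, whose plugins and site configuration differ), the example below shares a fixed system power budget across running jobs and scales every job's cap down proportionally when requests exceed the budget. The job names, wattages and the proportional policy are illustrative only.

```python
# Conceptual sketch of a dynamic power-capping policy: share a fixed system
# power budget across jobs, capping each proportionally to its request.
SYSTEM_BUDGET_W = 2_000_000  # total power budget for the machine, in watts

jobs = {  # job id -> requested (uncapped) power draw in watts; made-up values
    "job-101": 900_000,
    "job-102": 700_000,
    "job-103": 600_000,
}

def assign_power_caps(requests, budget):
    """Scale every job's power cap down uniformly when requests exceed the budget."""
    total_request = sum(requests.values())
    if total_request <= budget:
        return dict(requests)            # enough headroom: no capping needed
    scale = budget / total_request       # otherwise cap proportionally
    return {job: int(watts * scale) for job, watts in requests.items()}

caps = assign_power_caps(jobs, SYSTEM_BUDGET_W)
for job, cap in caps.items():
    print(f"{job}: cap {cap} W (requested {jobs[job]} W)")
```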

 

Supercomputers increasingly inform industry, medical research, environmental studies and, indeed, any area of discovery that demands large-scale data processing. How we further understand our world and how we manage progress relies on these processing tools, and how we run them at an optimum level relies on open dialogue within the HPC community. To this end, the SC conference is a vital focal point in the supercomputing calendar.