ECMWF Newsletter #177

ECMWF contributes to exascale computing project

Jenny Wong

In April 2021, the EuroHPC-funded ‘IO – Software for Exascale Architectures’ (IO–SEA) project began. The three-year project aims to implement solutions for scaling applications to exascale high-performance computing (HPC) systems. It does so by designing storage and data access architectures together with a wide range of use cases that have high I/O demands. As the project nears completion, we summarise ECMWF’s contributions, both through a use case and as leader of the development of DASI, a high-level Data Access and Storage Interface.

What is IO–SEA?

The IO–SEA project (https://iosea-project.eu/) is coordinated by the French Alternative Energies and Atomic Energy Commission (CEA) and involves a consortium of ten partners, including Atos and ECMWF. It aims to provide a novel HPC data management and storage platform for exascale computing based on hierarchical storage management (HSM) and on-demand provisioning of storage services. The platform should efficiently make use of storage tiers ranging from non-volatile memory express (NVMe) to tape-based technologies.

IO–SEA meeting. In-person meeting of IO–SEA participants on 10 October 2022 in Paris.

ECMWF’s use case

The project operates under a co‑design principle, where requirements from I/O-intensive use cases drive the development of the IO–SEA architecture. ECMWF’s weather forecasting workflow consumes and produces terabytes of data in a single model run, making it an ideal candidate for evaluating potential solutions to I/O bottlenecks. Solutions developed in the project are benchmarked with a simplified workflow that mimics the competition for I/O resources between the writing of model outputs at each step and the reading of these outputs for product generation.
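The competition between writing and reading described above can be illustrated with a minimal Python sketch. It is not the benchmark used in the project: the function name `run_workflow`, the file naming, and the step and field sizes are all assumptions chosen for illustration. One thread writes a set of fields per step while a second thread reads each step back as soon as it is complete, so both contend for the same storage.

```python
import os
import tempfile
import threading
import time
from queue import Queue


def run_workflow(n_steps=4, fields_per_step=8, field_bytes=1 << 16):
    """Write model-like output step by step while a reader consumes finished steps.

    Hypothetical stand-in for a forecast workflow: the writer plays the model,
    the reader plays product generation.
    """
    ready = Queue()  # steps whose output is complete and safe to read
    stats = {"bytes_written": 0, "bytes_read": 0, "write_s": 0.0, "read_s": 0.0}
    payload = os.urandom(field_bytes)

    with tempfile.TemporaryDirectory() as root:

        def field_path(step, field):
            return os.path.join(root, f"step{step:03d}_field{field:03d}.bin")

        def writer():
            for step in range(n_steps):
                t0 = time.perf_counter()
                for field in range(fields_per_step):
                    with open(field_path(step, field), "wb") as f:
                        f.write(payload)
                        f.flush()
                        os.fsync(f.fileno())  # push data to the storage device
                    stats["bytes_written"] += field_bytes
                stats["write_s"] += time.perf_counter() - t0
                ready.put(step)  # announce the finished step to the reader
            ready.put(None)  # sentinel: no more steps

        def reader():
            while (step := ready.get()) is not None:
                t0 = time.perf_counter()
                for field in range(fields_per_step):
                    with open(field_path(step, field), "rb") as f:
                        stats["bytes_read"] += len(f.read())
                stats["read_s"] += time.perf_counter() - t0

        threads = [threading.Thread(target=writer), threading.Thread(target=reader)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

    return stats


if __name__ == "__main__":
    print(run_workflow())
```

Running the same sketch against different storage locations gives a rough feel for how write and read times shift when the two phases overlap. A realistic benchmark would use many processes and much larger volumes.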

In the project so far, we have been able to test the workflow on different HPC systems, such as the Jülich Supercomputing Centre’s prototype modular supercomputer (DEEP) and the supercomputers operated at IT4Innovations. We have also evaluated the performance implications of using the Smart Burst Buffer (SBB) ephemeral service developed by Atos. The SBB service consists of nodes equipped with NVMe devices to accelerate data access, and initial tests show positive results in some of the benchmark metrics.

DASI

The Data Access and Storage Interface (DASI, https://github.com/ecmwf-projects/dasi) is a layer on top of ECMWF’s existing software, which makes that software more accessible to other scientific domains. It achieves this through semantic data management: all access and control of data is performed using scientifically meaningful indexing relevant for the particular application domain. DASI is inspired by the Fields DataBase (FDB) software in operational use at ECMWF. However, unlike the FDB, it can be configured for any scientific domain. The interface enables users to write, retrieve and query data through metadata, as well as to set data policies relating to data lifetime or access frequency.
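To make the idea of semantic data management concrete, here is a toy in-memory sketch in Python. It does not use or reproduce the DASI API: the class `SemanticStore`, its methods, and the electron-microscopy key names are hypothetical and serve only to show data being written, retrieved and queried through domain metadata instead of file paths.

```python
from dataclasses import dataclass, field


@dataclass
class SemanticStore:
    """Toy store indexed by domain metadata (illustrative only, not the DASI API)."""

    schema: tuple  # key names that every object in this domain must define
    _objects: dict = field(default_factory=dict, init=False, repr=False)

    def _index(self, key):
        # A full key must define exactly the names in the schema.
        if set(key) != set(self.schema):
            raise KeyError(f"key must define exactly {self.schema}")
        return tuple(str(key[name]) for name in self.schema)

    def archive(self, key, data: bytes):
        """Store data under a full metadata key."""
        self._objects[self._index(key)] = data

    def retrieve(self, key) -> bytes:
        """Return the data stored under a full metadata key."""
        return self._objects[self._index(key)]

    def query(self, **partial):
        """Yield full keys whose values match every given constraint."""
        unknown = set(partial) - set(self.schema)
        if unknown:
            raise KeyError(f"unknown key names: {sorted(unknown)}")
        for idx in self._objects:
            full = dict(zip(self.schema, idx))
            if all(full[name] == str(value) for name, value in partial.items()):
                yield full


# Hypothetical schema for an electron-microscopy use case.
store = SemanticStore(schema=("experiment", "sample", "frame"))
store.archive({"experiment": "e1", "sample": "s1", "frame": 1}, b"image-1")
store.archive({"experiment": "e1", "sample": "s1", "frame": 2}, b"image-2")
frames = list(store.query(sample="s1"))
```

Changing the `schema` tuple is all it takes to target another domain in this sketch, which mirrors the configurability described above. Data policies such as lifetime are left out for brevity.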

Increasing the accessibility of ECMWF’s existing software fosters collaboration and enables other organisations to contribute to and strengthen the community around our software. In this project, many of the partners have contributed to developing ECMWF’s software further. Examples include building GekkoFS and CORTX-Motr backends for the FDB in collaboration with the University of Mainz and Seagate, respectively. CEA has also developed a POSIX interface to DASI, and therefore to the FDB. The solutions targeted in this project, such as hierarchical storage management, are complex, and we were able to draw on our experience of developing the Meteorological Archival and Retrieval System (MARS) and the FDB to help guide the scope and direction of the work.

Outlook

This project has been extremely fruitful in extending ECMWF’s FDB object store to support alternative technologies and building a wider community of users and collaborators for our existing software stack. We have also been able to adapt an emulation of the operational forecast to use novel data node architectures developed by Atos and evaluate their initial impact on the I/O demands of our workflow.

DASI has also been adopted by other use cases in the project across a wide range of domains, including lattice quantum chromodynamics and electron microscopy, with developments continuing until the project ends in spring next year.