SPH-EXA: Optimizing Smooth Particle Hydrodynamics for Exascale Computing


PIs: Florina Ciorba (University of Basel)

Lucio Mayer (University of Zurich)


Co-PI: Rubén Cabezón (University of Basel)

Aurélien Cavelan (University of Basel)


Co-Is: Ioana Banicescu (Mississippi State University)

Domingo García-Senz (Universitat Politècnica de Catalunya, Spain)

Thomas Quinn (University of Washington in Seattle, WA, USA)

Danilo Guerrera (University of Basel)

Darren Reed (University of Zurich)


Funding agency: Platform for Advanced Scientific Computing (http://www.pasc-ch.org)

Duration: 01.07.2017-30.06.2020

Project Summary

Understanding how fluids and plasmas behave under complex physical conditions underlies some of the most important questions that researchers try to answer. These range from practical solutions to engineering problems to cosmic structure formation and evolution. In that respect, numerical simulations of fluids in astrophysics and computational fluid dynamics (CFD) are among the most computationally demanding calculations in terms of sustained floating point operations per second (FLOP/s). They are expected to benefit greatly from future Exascale computing infrastructures, which will perform 10^18 FLOP/s. Scenarios of this type push the computational astrophysics and CFD fields well into sustained Exascale computing. Today, they can only be tackled by reducing the scale, the resolution, and/or the dimensionality of the problem, or by using approximate versions of the physics involved. How this affects the outcome of the simulations, and therefore our knowledge of the problem, is still not well understood.

The simulation codes used in numerical astrophysics and CFD (hydrocodes, hereafter) are numerous and varied. Most of them rely on a hydrodynamics solver that calculates the evolution of the system to be studied along with all the coupled physics. Among these hydrodynamics solvers, the Smoothed Particle Hydrodynamics (SPH) technique is a purely Lagrangian method with no underlying mesh, in which the fluid can move freely in a practically boundless domain, which is very convenient for astrophysics and CFD simulations. SPH codes are very important in astrophysics because they couple naturally with the fastest and most efficient gravity solvers, such as tree codes and fast multipole methods. Nevertheless, the parallelization of SPH codes is not straightforward, due to their boundless nature and the lack of a structured grid, which cause the interactions between fluid elements, or between fluid elements and mechanical structures, to change continuously from one time-step to the next. This adds a layer of complexity to parallelizing SPH codes, yet it also makes them a very attractive and challenging application for the computer science community, in view of their parallelization and scalability challenges on the upcoming Exascale computing systems.
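To make the mesh-free, neighbor-driven character of SPH concrete, the following minimal Python sketch computes the density of each particle as a kernel-weighted sum over the particles around it. It is only an illustration under stated assumptions: a single constant smoothing length h, the standard 3D cubic spline kernel, and a brute-force loop over all pairs. The function names are hypothetical and are not taken from SPHYNX, ChaNGa, SPH-flow, or the mini-app.

```python
import math


def cubic_spline_kernel(r, h):
    """Standard 3D cubic spline kernel W(r, h) with compact support 2h."""
    q = r / h
    sigma = 1.0 / (math.pi * h ** 3)  # 3D normalization constant
    if q < 1.0:
        return sigma * (1.0 - 1.5 * q ** 2 + 0.75 * q ** 3)
    if q < 2.0:
        return sigma * 0.25 * (2.0 - q) ** 3
    return 0.0  # particles farther than 2h do not interact


def density(positions, masses, h):
    """Brute-force density: rho_i = sum_j m_j * W(|r_i - r_j|, h).

    Assumption: one constant smoothing length h for all particles.
    """
    rho = []
    for ri in positions:
        total = 0.0
        for rj, mj in zip(positions, masses):
            total += mj * cubic_spline_kernel(math.dist(ri, rj), h)
        rho.append(total)
    return rho


if __name__ == "__main__":
    # Two particles closer than 2h contribute to each other's density.
    particles = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    print(density(particles, [1.0, 2.0], h=1.0))
```

Only particles closer than 2h contribute to a sum, and that set changes as the particles move. Production codes therefore replace the inner loop with a tree-based neighbor search, and this irregular, time-varying interaction pattern is what makes the parallelization non-trivial.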

In this project, we aim to develop a scalable and fault-tolerant SPH kernel in the form of a mini/proxy co-design application. The SPH mini-app will be incorporated into current production codes in the fields of astrophysics (SPHYNX, ChaNGa) and CFD (SPH-flow), producing what we call the SPH-EXA version of those codes.

The SPH-EXA project has the following main objectives:

Design parallelization methods targeted for SPH codes, that can be ported to other codes in the scientific community.

Parallelization of the SPH technique will involve both automatic and manual methods. For automatic parallelization, we will use advanced compilation options as well as stencil compilers. This will require rewriting parts of the SPH codes to enable vectorization and other loop transformations, as well as adapting existing stencil compilers to the SPH codes. For manual parallelization, we will employ shared-memory, accelerator-based, and task-based programming for on-node multi-threaded execution, as well as distributed-memory programming for multi-process execution across computing nodes. The goal is to expose all available parallelism, both within a node and across nodes.

Enable the scalability and dynamic load balancing of the SPH hydrodynamics technique within single compute nodes and across massive numbers of nodes.

Enabling the scalable execution of SPH codes builds on the massive software parallelism exposed and expressed during (automatic and/or manual) parallelization. This will require hierarchical and/or distributed dynamic load balancing techniques to exploit the massive hardware parallelism at run time. We will employ algorithms, techniques, and tools that address the load imbalance factors arising from the (problem and algorithmic) characteristics of the three SPH codes (e.g., individual time-steps per particle) as well as from the software environments (processor speed variations, resource sharing). The goal is to minimize the load imbalance between synchronous parts of the code (e.g., gravity calculations) by dynamically distributing the load to the processors, using methods such as those described in [2,3,4,5].
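As an illustration of why dynamic distribution helps when per-particle cost is uneven, the sketch below compares a static block partition with guided self-scheduling (GSS), a well-known dynamic loop scheduling technique in which each idle worker takes ceil(remaining / P) iterations. It is a toy simulation under stated assumptions, not the project's scheduler: the text does not say which of the cited methods is used, and the cost model and all function names here are hypothetical.

```python
import heapq
import math


def gss_chunks(n_iterations, n_workers):
    """Chunk sizes handed out by guided self-scheduling: ceil(remaining / P)."""
    chunks = []
    remaining = n_iterations
    while remaining > 0:
        size = math.ceil(remaining / n_workers)
        chunks.append(size)
        remaining -= size
    return chunks


def makespan_static(costs, n_workers):
    """Finish time when each worker gets one contiguous block up front."""
    block = math.ceil(len(costs) / n_workers)
    return max(sum(costs[i:i + block]) for i in range(0, len(costs), block))


def makespan_dynamic(costs, n_workers, chunks):
    """Finish time when idle workers grab the next chunk as soon as they can."""
    finish = [0.0] * n_workers  # min-heap of per-worker finish times
    heapq.heapify(finish)
    start = 0
    for size in chunks:
        earliest = heapq.heappop(finish)  # the worker that becomes idle first
        earliest += sum(costs[start:start + size])
        heapq.heappush(finish, earliest)
        start += size
    return max(finish)


if __name__ == "__main__":
    # Assumed cost model: the last quarter of the particles is 10x more
    # expensive, e.g. because they need much smaller individual time-steps.
    costs = [1] * 75 + [10] * 25
    workers = 4
    print("static :", makespan_static(costs, workers))
    print("dynamic:", makespan_dynamic(costs, workers, gss_chunks(len(costs), workers)))
```

With this cost model the static partition leaves three workers idle while the fourth processes all the expensive particles, whereas the self-scheduled run finishes close to the ideal of total cost divided by the number of workers.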

Design fault-tolerance mechanisms to sustain the scalable execution of massively parallel SPH codes.

The fault-tolerance mechanisms will combine dynamic fault-tolerant scheduling algorithms with methods to determine the optimal checkpointing frequency for the SPH technique on given architectures. Most importantly, we will explore the algorithm-based fault tolerance opportunities within the SPH technique to achieve fault tolerance that is portable across architectures and independent of checkpointing mechanisms. We envision providing fault-tolerant and non-fault-tolerant versions of the SPH codes, to maintain flexibility between high performance at smaller scales and fault-tolerant high performance at larger scales.
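To give a sense of what determining a checkpointing frequency involves, the sketch below evaluates Young's classical first-order approximation, in which the time between checkpoints is sqrt(2 * C * MTBF), where C is the cost of writing one checkpoint and MTBF is the mean time between failures of the platform. This is a textbook formula used purely as an illustration; the text does not state which method the project adopts, and the function names and the numbers in the usage example are assumptions.

```python
import math


def platform_mtbf(node_mtbf, n_nodes):
    """Platform MTBF, assuming independent and identical node failures."""
    if node_mtbf <= 0 or n_nodes <= 0:
        raise ValueError("node_mtbf and n_nodes must be positive")
    return node_mtbf / n_nodes


def young_interval(checkpoint_cost, mtbf):
    """Young's first-order checkpoint interval: sqrt(2 * C * MTBF).

    Both arguments must use the same time unit (seconds here).
    """
    if checkpoint_cost <= 0 or mtbf <= 0:
        raise ValueError("checkpoint_cost and mtbf must be positive")
    return math.sqrt(2.0 * checkpoint_cost * mtbf)


if __name__ == "__main__":
    # Assumed numbers for illustration only: a 10-year node MTBF,
    # 10,000 nodes, and 60 seconds to write one checkpoint.
    node_mtbf_s = 10 * 365 * 24 * 3600
    mtbf_s = platform_mtbf(node_mtbf_s, 10_000)
    print("platform MTBF (s):", mtbf_s)
    print("checkpoint every (s):", young_interval(60.0, mtbf_s))
```

The approximation is only reasonable when the checkpoint cost is small compared with the platform MTBF. Because the platform MTBF shrinks as the node count grows, the interval shrinks too, which is one reason checkpoint-independent, algorithm-based approaches become attractive at large scale.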

Build a repository of experiments to enable verification, reproducibility, and portability of the execution and simulation results of SPH-EXA codes.

To enable verification and reproducibility of the SPH simulations, as well as to support parallel performance studies, we will use reproducibility tools to configure and run the codes. Such tools will aid in resolving software dependencies, facilitate environment configuration, automate the software build process, provide support for creating execution and post-processing scripts, and visualize the results. At present, PROVA! [23] (our reproducibility tool of choice) automatically provides graphs for the performance analysis, relying on Gnuplot, and will be extended to support tools such as VisIt and ParaView, since visualization is extremely important when working with a huge number of particles.

SPH codes currently have: (1) no multi-level scheduling approaches (connecting thread- or task-level schedulers with process-level schedulers), (2) no (or limited) algorithm-based fault tolerance, and (3) very limited scalability within and across nodes (existing work addresses one or the other, but not both). As an example, the largest, most recent high-resolution simulations of galaxy formation, such as GigaERIS (Mayer, Quinn, et al., in preparation), which employ more than a billion resolution elements with individual time-steps, do not scale to more than 8,000 compute cores on state-of-the-art architectures, such as the Cray platforms. High-impact work published in the last few years employed SPH simulations, such as ERIS [22], that did not scale even to 1,000 compute cores. This degree of scalability is clearly below what we need to exploit upcoming Exascale supercomputers.

The methodology that we will use to achieve these goals is a unique combination of (1) state-of-the-art parallelization and fault tolerance methods from computer science, (2) state-of-the-art SPH technique and expertise from physics, and (3) expertise in high-performance computing on state-of-the-art computing architectures.

The expected outcome of this project will be in the form of an open-source SPH mini-app (accessible here: https://github.com/unibas-dmi-hpc/SPH-EXA_mini-app), that will enable highly parallelized, scalable, and fault-tolerant production SPH codes in different scientific domains (represented via SPHYNX, ChaNGa, and SPH-flow in its very first application). Addressing the performance and scalability challenges of SPH codes requires a versatile collaboration with and support from supercomputing centers, such as CSCS, such that our results can be taken into account for the design of the next generation HPC infrastructures.

The success of the project will be measured by the improvements achieved, relative to current levels, in the speed-up, fault tolerance, flexibility (in terms of numerical techniques), and portability of the SPH-EXA codes.

In summary, this project addresses the challenge of rendering the SPH technique and the SPH-based simulation codes scalable to future Exascale computing systems. We target the performance, portability, scalability, and fault tolerance of three SPH codes on the next generation supercomputers, such as those at CSCS, that are expected to contain hybrid CPU-MIC-accelerator architectures and a high-end interconnection fabric.


Publications

A. Mohammed, F. M. Ciorba, “SimAS: A Simulation-Assisted Approach for the Algorithm Selection Problem of Scheduling under Perturbations”, Concurrency and Computation: Practice and Experience Journal, Special Issue on the International Workshop on Algorithms, Models and Tools for Parallel Computing on Heterogeneous Platforms (HeteroPar'2018). (Under review)


Mohammed A.; Eleliemy A.; Ciorba F. M.; Kasielke F.; Banicescu I.; “A Methodology for Realistically Simulating the Performance of Scientific Applications on High Performance Computing Systems”, Future Generation Computer Systems Journal (2019) https://doi.org/10.1016/j.future.2019.10.007 (Open Access)


García-Senz, D.; Cabezón, R. M.; Blanco-Iglesias, J.M.; Lorén-Aguilar, P.; “Self-gravitating barotropic equilibrium configurations of rotating bodies with SPH”, Submitted to Astronomy & Astrophysics (2019)


R. M. Cabezón, K.-C. Pan, M. Liebendörfer, T. Kuroda, K. Ebinger, O. Heinimann, A. Perego, and F.-K. Thielemann, “Core-collapse supernovae in the hall of mirrors: A three-dimensional code-comparison project”, Astronomy & Astrophysics, Volume 619, November 2018, Article Number A118, November 13, 2018.


D. Guerrera, R. M. Cabezón, J.-G. Piccinali, A. Cavelan, F. M. Ciorba, D. Imbert, L. Mayer, and D. Reed, “Towards a Mini-App for Smoothed Particle Hydrodynamics at Exascale”, in Proceedings of the 3rd International Workshop on Representative Applications (WRAp 2018) of the 20th IEEE Cluster Conference (Cluster 2018), Belfast, UK, September 10-13, 2018. [C46.bib]


Mohammed A.; Cavelan A.; Ciorba F. M.; Cabezón R. M.; Banicescu I.; “Two-level Dynamic Load Balancing for High Performance Scientific Applications”, In the SIAM Conference on Parallel Processing for Scientific Computing 2020 (SIAM PP20), Feb. 2020, Seattle, WA, USA. (Under review)


Mohammed A.; Cavelan A.; Ciorba F. M.; “rDLB: A Novel Approach for Robust Dynamic Load Balancing of Scientific Applications with Independent Tasks”, In the International Conference on High Performance Computing & Simulation (HPCS), July 2019, Dublin, Ireland.


Cavelan, A.; Cabezón, R. M.; Korndorfer, J. H. M., Ciorba, F.; “Finding Neighbors in a Forest: A b-tree for Smoothed Particle Hydrodynamics Simulations”. In Proceedings of the SPHERIC International Workshop, June 2019, Exeter, UK. 


Guerrera D.; Cavelan A.; Cabezón R. M.; Imbert D.; Piccinali J. G.; Mohammed A.; Mayer L.; Reed D.; Ciorba F. M.; “SPH-EXA: Enhancing the Scalability of SPH codes Via an Exascale-Ready SPH Mini-App” In Proceedings of the SPHERIC International Workshop, June 2019, Exeter, UK.


García-Senz, D.; Cabezón, R. M.; “Integral SPH: Connecting the partition of unit to accurate gradient estimation”, In Proceedings of the SPHERIC International Workshop, June 2019, Exeter, UK.


Blanco-Iglesias, J. M.; García-Senz, D.; Cabezón, R. M.; “Building initial models of rotating white dwarfs with SPH”, In Proceedings of the SPHERIC International Workshop, June 2019, Exeter, UK.


Cavelan, A., Cabezón, R. M., and Ciorba, F. M. “Detection of Silent Data Corruptions in Smoothed Particle Hydrodynamics Simulations”. In Proceedings of the 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2019), Larnaca, May 2019.


Ciorba, F. M.; Iwainsky, C.; Buder, P.; “OpenMP Loop Scheduling Revisited: Making a Case for More Schedules”, In Proceedings of the 2018 International Workshop on OpenMP (iWomp 2018), September 21-23, 2018, Barcelona, Spain.


Mohammed A.; Ciorba F. M.; “SiL: An Approach for Adjusting Applications to Heterogeneous Systems Under Perturbations”, In the European Conference on Parallel Processing Workshops (HeteroPar) August 2018 (pp. 456-468), Turin, Italy. [C45.bib]


Guerrera, D.; Cabezón, R. M.; Piccinali, J.-G.; Cavelan, A.; Ciorba, F. M.; Imbert, D.; Mayer, L.; Reed, D.; "Towards a Mini-App for Smoothed Particle Hydrodynamics at Exascale", 3rd International Workshop on Representative Applications (WRAp 18), July 2018, Belfast, UK.


D. García-Senz, R. M. Cabezón, and I. Domínguez, “Surface and Core Detonations in Rotating White Dwarfs”, The Astrophysical Journal, Volume 862, Number 1, July 19, 2018.


Domínguez, I.; Cabezón, R. M.; García-Senz, D.; “Explosion of fast spinning sub-Chandrasekhar mass white dwarfs”, 15th International Symposium on Nuclei in the Cosmos, June 2018, Assergi, Italy.



Posters

Müller-Korndörfer, J. H.; Ciorba, F. M.; Yilmaz, A.; Iwainsky, C.; Doerfert, J.; Finkel, H.; Kale, V.; Klemm, M.; "A Runtime Approach for Dynamic Load Balancing of OpenMP Parallel Loops in LLVM" Poster at the 30th ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2019), Denver, Colorado, USA, November 17-22, 2019.


Mohammed A.; Ciorba F. M.; Design of Robust Scheduling Methodologies in High Performance Computing, PhD Forum Poster at the 34th International Conference on High Performance Computing (ISC), June 2019, Frankfurt, Germany.


Ciorba F. M.; Mayer, L.; Cabezón R. M.; Imbert, D.; Guerrera, D.; Cavelan A.; Mohammed A.; Reed, D.; Piccinali, J.-G.; Banicescu I.; García-Senz, D.; Quinn, T.; “Design and Implementation of an Exascale-Ready Mini-App for Smoothed Particle Hydrodynamics Simulations”, Poster at the 34th International Conference on High Performance Computing (ISC), June 2019, Frankfurt, Germany.


Mohammed A.; Cavelan A.; Ciorba F. M.; Cabezón R. M.; Banicescu I.; “Identifying Performance Challenges in Smoothed Particle Hydrodynamics Simulations”, Poster at the 16th ACM/CSCS Platform for Advanced Scientific Computing (PASC) Conference, June 2019, Zurich, Switzerland.


Ciorba F. M.; Mayer, L.; Cabezón R. M.; Imbert, D.; Guerrera, D.; Cavelan A.; Mohammed A.; Piccinali, J.-G.; Reed, D.; Quinn, T.; Banicescu I.; García-Senz, D.; “Design and Implementation of an Exascale-Ready Mini-App for Smoothed Particle Hydrodynamics (SPH) Simulations”, Poster at the 16th ACM/CSCS Platform for Advanced Scientific Computing (PASC) Conference, June 2019, Zurich, Switzerland.


Mohammed A.; Ciorba F. M.; “A Study of the Performance of Scientific Applications with Dynamic Loop Scheduling Under Perturbations”, Poster at the 15th ACM/CSCS Platform for Advanced Scientific Computing (PASC) Conference, July 2018, Basel, Switzerland.


Cavelan, A.; F. M. Ciorba, R. M. Cabezón; “Are Smooth Particle Hydrodynamics Applications Inherently Resilient To Faults?”, Poster at the 15th ACM/CSCS Platform for Advanced Scientific Computing (PASC) Conference, July 2018, Basel, Switzerland.


Ciorba F. M.; Mayer, L.; Cabezón R. M.; Imbert, D.; Guerrera, D.; Cavelan A.; Mohammed A.; Reed, D.; Piccinali, J.-G.; Banicescu I.; García-Senz, D.; Quinn, T.; “Towards An Exascale-ready Mini-app For Smooth Particle Hydrodynamics”, Poster at the 15th ACM/CSCS Platform for Advanced Scientific Computing (PASC) Conference, July 2018, Basel, Switzerland.


Domínguez I.; Cabezón R. M.; García-Senz D.; "Explosion of Fast Spinning sub-Chandrasekhar Mass White Dwarfs" Poster at the 15th International Symposium on Nuclei in the Cosmos, Assergi, Italy, June 2018.


Talks

Mohammed A.; Cavelan A.; Ciorba F. M.; "rDLB: A Novel Approach for Robust Dynamic Load Balancing of Scientific Applications with Independent Tasks", Talk at the International Conference on High Performance Computing & Simulation (HPCS), July 2019, Dublin, Ireland.


Cabezón, R. M.; Käppeli, R.; "Breaking the Wall in Computational Astrophysics: Current Bottlenecks and How to Address them towards the Exascale Era", Minisymposium in PASC Conference, 2019 June, Zurich, Switzerland.


Ciorba, F.; Cavelan, A.; "SPH-EXA: A Smoothed Particle Hydrodynamics Mini-app for Exascale Computing", Talk in PASC Conference, 2019 June, Zurich, Switzerland.


García-Senz, D.; “Integral SPH: Connecting the partition of unit to accurate gradient estimation”. Talk in SPHERIC International Workshop, June 2019, Exeter, UK.


García-Senz, D.; “Building initial models of rotating white dwarfs with SPH”. Talk in SPHERIC International Workshop, June 2019, Exeter, UK.


Cavelan, A.; “Finding Neighbors in a Forest: A b-tree for Smoothed Particle Hydrodynamics Simulations”. Talk in SPHERIC International Workshop, June 2019, Exeter, UK.


Cavelan A.; “SPH-EXA: Enhancing the Scalability of SPH codes Via an Exascale-Ready SPH Mini-App”. Talk in SPHERIC International Workshop, June 2019, Exeter, UK.


Ciorba, F.; Cabezón, R. M.; "SPH-EXA: Optimizing Smooth Particle Hydrodynamics for Exascale Computing", Talk in PASC Conference, 2018 July, Basel, Switzerland.


Cavelan, A.; "Addressing Resilience Challenges For Computing At Extreme Scale", Minisymposium in PASC Conference, 2018 July, Basel, Switzerland.


Cavelan, A.; Bautista, L.; Robert, Y.; Engelman, Ch.; "Panel Discussion On Upcoming Challenges At Exascale", PASC Conference, 2018 July, Basel, Switzerland.


Guerrera, D.; "Towards a Mini-App for Smoothed Particle Hydrodynamics at Exascale", Talk in WRAp 18, 2018 July, Belfast, UK.


Performance Analysis Reports

F. Orland, R. Liem, J. Protze, B. Wang, “POP SPH-EXA Performance Assessment Report”, November 2019, 12 pp. (PDF)


M. Wagner, POP SPHYNX Performance Assessment Report, March 2018, 11 pp. (PDF)