Integrated Data Analysis Pipelines for Large-Scale Data Management, HPC, and Machine Learning
Coordinator: Know-Center GmbH, Austria
Partners: 13 partners from 7 European countries
Know-Center GmbH (Austria), AVL List GmbH (Austria), Deutsches Zentrum für Luft- und Raumfahrt EV (Germany), ETHZ (Switzerland), Hasso Plattner Institute (Germany), ICCS (Greece), Infineon Technologies (Austria), Intel (Poland), IT University Copenhagen (Denmark), KAI GmbH (Austria), TU Dresden (Germany), University of Maribor (Slovenia), University of Basel (Switzerland).
Funding agency: European Union via the Horizon 2020 programme; EU project page.
Duration: 01.12.2020-30.11.2024
Central project webpage: daphne-eu.eu
Project Summary
The DAPHNE project aims to define and build an open and extensible system infrastructure for integrated data analysis pipelines, including data management and processing, high-performance computing (HPC), and machine learning (ML) training and scoring. Key observations are that:
(1) systems of these areas share many compilation and runtime techniques,
(2) there is a trend towards complex data analysis pipelines that combine these systems, and
(3) the underlying, increasingly heterogeneous hardware infrastructure is converging as well.
Yet, the programming paradigms, cluster resource management, as well as data formats and representations differ substantially. Therefore, this project, with a joint consortium of experts from the data management, ML systems, and HPC communities, aims to systematically investigate the necessary system infrastructure, language abstractions, and compilation and runtime techniques, as well as the systems and tools needed to increase productivity when building such data analysis pipelines and to eliminate unnecessary performance bottlenecks.
At the University of Basel
Throughout the hierarchy, from integrated pipelines in distributed environments down to specialized storage and accelerator devices, scheduling of tasks (which may represent bundles of operations and data) is crucial for achieving high system utilization, high throughput, and low latency.
The HPC group at the University of Basel will develop scheduling mechanisms for the different system hierarchy levels, including compilation and runtime techniques.
Publications
F. Boito, J. Brandt, V. Cardellini, P. Carns, F. M. Ciorba, H. Egan, A. Eleliemy, A. Gentile, T. Gruber, J. Hanson, U.-U. Haus, K. Huck, T. Ilsche, T. Jakobsche, T. Jones, S. Karlsson, A. Mueen, M. Ott, T. Patki, I. Peng, K. Raghavan, S. Simms, K. Shoga, M. Showerman, D. Tiwari, T. Wilde, K. Yamamoto. “Autonomy Loops for Monitoring, Operational Data Analytics, Feedback, and Response in HPC Operations.” In Proceedings of The 10th Monitoring and Analysis for High Performance Computing Systems Plus Applications (HPCMASPA) at IEEE Cluster, Santa Fe, New Mexico, USA, October, 2023.
J. Brandt, F. M. Ciorba, A. Gentile, M. Ott, and T. Wilde. Driving HPC Operations With Holistic Monitoring and Operational Data Analytics (Dagstuhl Seminar 23171). In Dagstuhl Reports, Volume 13, Issue 4, pp. 98-120, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023) https://doi.org/10.4230/DagRep.13.4.98
J. H. Müller Korndörfer, A. Eleliemy, O. S. Simsek, T. Ilsche, R. Schöne, F. M. Ciorba. “How Do OS and Application Schedulers Interact? An Investigation with Multithreaded Applications”. In Proceedings of the International European Conference on Parallel and Distributed Computing (Euro-Par 2023), Limassol, Cyprus, 28 August-1 September 2023. [bib] (online)
A. Eleliemy and F. M. Ciorba. “DaphneSched: A Scheduler for Integrated Data Analysis Pipelines”. In Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Computing (ISPDC), Bucharest, Romania, July 10-12, 2023. [bib] (online) Best paper award.
A. Mohammed, J. H. Müller Korndörfer, A. Eleliemy, F. M. Ciorba. “Automated Scheduling Algorithm Selection and Chunk Parameter Calculation in OpenMP”. IEEE Transactions on Parallel and Distributed Systems (TPDS), Tier A*, December 2022. https://ieeexplore.ieee.org/document/9825675 (Open Access) [bib]
Video presenting Auto4OMP: OpenMP SC22 Booth Talk, slides.
N. Ihde, P. Marten, A. Eleliemy, G. Poerwawinata, P. Silva, I. Tolovski, F. M. Ciorba, T. Rabl. “A Survey of Big Data, HPC and Machine Learning Benchmarks”. In Proceedings of the 13th Transaction Processing Performance Council Technology Conference on Performance Evaluation & Benchmarking (TPCTC 2021) of the 47th International Conference on Very Large Data Bases, Copenhagen, Denmark, August 2021 online [C62.bib]
P. Damme, et al. “DAPHNE: An Open and Extensible System Infrastructure for Integrated Data Analysis Pipelines”. In Proceedings of the 12th Annual Conference on Innovative Data Systems Research (CIDR ’22), Chaminade, USA, January 2022 online [C65.bib]
J. H. Müller Korndörfer, A. Eleliemy, A. Mohammed, F. M. Ciorba. “LB4OMP: A Dynamic Load Balancing Library for Multithreaded Applications”. IEEE Transactions on Parallel and Distributed Systems (TPDS), Tier A*, 2021. [to appear] Videos about LB4OMP (OpenMP booth video, OpenMP Users Developer Conference video)
Talks
Keynote “Revolutionizing HPC Operations & Research” at HPCMASPA workshop at IEEE Cluster 2023, Speaker: F. M. Ciorba
Multilevel Scheduling Reflection on DAPHNE, Speaker: A. Eleliemy
Invited talk at EVEREST+DAPHNE: Workshop on Design and Programming High-performance, distributed, reconfigurable and heterogeneous platforms for extreme-scale analytics, held at the HiPEAC 2023 Conference, January 18, 2023.