In High-Performance Computing (HPC), workflows are utilized to define and manage a set of interdependent computations which allow the users to extract insights from (scientific) numerical simulations or data analytics. HPC platforms can perform extreme-scale simulations, combining Artificial Intelligence (AI) training and inference and data analytics (we refer to heterogeneous workflows), by providing tools and computing resources which serve a variety of use-cases spanning very diverse application domains (e.g., weather forecasting, quantum mechanics, etc.). Executing such workflows at scale requires to handle dependencies, job submission automation, I/O mechanisms. Despite State-of-the-Art batch schedulers can be configured and integrated with tools accomplishing this automation, a number of cases where resource allocation can lead to inefficiencies still exist. In this paper, to overcome these limitations, we present the WARP (Workflow-aware Advanced Resource Planner), a tool that integrates with workflow management tools and batch schedulers, to reserve in advance resources for an optimal execution of jobs, based on their duration, dependencies and machine load. WARP has been designed to minimize the overall workflow execution, without violating the priority policies for cluster users imposed by the system administrators.

Advanced Resource Allocation in the Context of Heterogeneous Workflows Management / Lubrano, Francesco; Vercellino, Chiara; Vitali, Giacomo; Viviani, Paolo; Scionti, Alberto; Terzo, Olivier. - ELETTRONICO. - (2024), pp. 14-20. (Intervento presentato al convegno WiDE '24: 2nd Workshop on Workflows in Distributed Environments tenutosi a Athens (GR) nel 22 April 2024) [10.1145/3642978.3652835].

Advanced Resource Allocation in the Context of Heterogeneous Workflows Management

Chiara Vercellino;Giacomo Vitali;Paolo Viviani;Alberto Scionti;
2024

Abstract

In High-Performance Computing (HPC), workflows are utilized to define and manage a set of interdependent computations which allow the users to extract insights from (scientific) numerical simulations or data analytics. HPC platforms can perform extreme-scale simulations, combining Artificial Intelligence (AI) training and inference and data analytics (we refer to heterogeneous workflows), by providing tools and computing resources which serve a variety of use-cases spanning very diverse application domains (e.g., weather forecasting, quantum mechanics, etc.). Executing such workflows at scale requires to handle dependencies, job submission automation, I/O mechanisms. Despite State-of-the-Art batch schedulers can be configured and integrated with tools accomplishing this automation, a number of cases where resource allocation can lead to inefficiencies still exist. In this paper, to overcome these limitations, we present the WARP (Workflow-aware Advanced Resource Planner), a tool that integrates with workflow management tools and batch schedulers, to reserve in advance resources for an optimal execution of jobs, based on their duration, dependencies and machine load. WARP has been designed to minimize the overall workflow execution, without violating the priority policies for cluster users imposed by the system administrators.
2024
979-8-4007-0546-5
File in questo prodotto:
File Dimensione Formato  
3642978.3652835.pdf

accesso aperto

Tipologia: 2a Post-print versione editoriale / Version of Record
Licenza: Creative commons
Dimensione 587.61 kB
Formato Adobe PDF
587.61 kB Adobe PDF Visualizza/Apri
Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11583/2988116