
Scheduling Inference Workloads in the Computing Continuum with Reinforcement Learning / Castellano, Gabriele; Nieto, Juan-José; Angi, Antonino; Àlvarez Terribas, Francisco; Luque, Jordi; Diego Andilla, Ferrán; Sacco, Alessio; Esposito, Flavio; Risso, Fulvio. - Electronic. - (2025), pp. 297-302. (IEEE International Conference on Distributed Computing Systems (ICDCS), Glasgow (UK), 20-23 July 2025) [10.1109/ICDCSW63273.2025.00056].

Scheduling Inference Workloads in the Computing Continuum with Reinforcement Learning

Castellano, Gabriele; Angi, Antonino; Sacco, Alessio; Risso, Fulvio
2025

Abstract

As many recent real-time applications (e.g., Augmented/Virtual Reality, cognitive assistance) rely on Deep Neural Networks (DNNs) for inference tasks, edge computing has emerged as a key enabler to deploy such applications as close as possible to the data sources, helping meet stringent latency and throughput demands. However, the limited resources typically available at the edge create significant challenges for efficiently managing inference workloads. Thus, a trade-off between network and processing time must be considered when it comes to end-to-end delay requirements. In this paper, we focus on the problem of scheduling inference jobs of DNN models in such an edge-cloud continuum at short timescales (i.e., a few milliseconds). Through simulations, we analyze several policies under realistic network settings and workloads from a large ISP, highlighting the need for a dynamic scheduling policy that can adapt to varying network conditions and workload demands. To this end, we propose ASET, a Reinforcement Learning (RL)-based scheduling algorithm able to dynamically adapt its decisions according to the system conditions. Our results show that ASET effectively provides the best performance compared to a set of static policies when scheduling over a distributed pool of edge-cloud resources.
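The abstract frames scheduling as a trade-off between network delay and processing time under an end-to-end deadline, with an RL policy adapting its placement decisions to system conditions. The snippet below is a minimal, hypothetical sketch of that idea only, not the ASET algorithm itself: a bandit-style scheduler learns which of two clusters tends to meet a per-job deadline. All names (`CLUSTERS`, `EpsilonGreedyScheduler`) and the delay figures are illustrative assumptions.

```python
import random

# Hypothetical clusters: each has a network round-trip delay and a per-job
# processing time (milliseconds). Values are illustrative only.
CLUSTERS = {
    "edge":  {"net_ms": 2.0,  "proc_ms": 12.0},
    "cloud": {"net_ms": 25.0, "proc_ms": 4.0},
}

class EpsilonGreedyScheduler:
    """Toy bandit-style scheduler: learns which cluster tends to meet
    a job's end-to-end deadline (network + processing time)."""

    def __init__(self, epsilon=0.1):
        self.epsilon = epsilon
        self.value = {c: 0.0 for c in CLUSTERS}  # running average reward
        self.count = {c: 0 for c in CLUSTERS}

    def choose(self):
        # Explore with probability epsilon, otherwise exploit best estimate.
        if random.random() < self.epsilon:
            return random.choice(list(CLUSTERS))
        return max(self.value, key=self.value.get)

    def update(self, cluster, reward):
        # Incremental mean update of the estimated reward for this cluster.
        self.count[cluster] += 1
        self.value[cluster] += (reward - self.value[cluster]) / self.count[cluster]


def run(deadline_ms=20.0, jobs=1000):
    sched = EpsilonGreedyScheduler()
    met = 0
    for _ in range(jobs):
        cluster = sched.choose()
        c = CLUSTERS[cluster]
        # End-to-end delay = network transfer + processing (plus some jitter).
        e2e = c["net_ms"] + c["proc_ms"] + random.uniform(0.0, 5.0)
        reward = 1.0 if e2e <= deadline_ms else -1.0
        met += reward > 0
        sched.update(cluster, reward)
    print(f"deadline met for {met}/{jobs} jobs")


if __name__ == "__main__":
    run()
```

The real scheduler described in the paper operates over a distributed pool of edge-cloud resources and a richer state space; this sketch only illustrates why a learned policy can outperform a static one when network and processing times pull in opposite directions.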
Year: 2025
ISBN: 979-8-3315-1725-0
Files in this item:

36422_Scheduling_Inference_Wor.pdf
Access: open access
Type: 2. Post-print / Author's Accepted Manuscript
License: Public - All rights reserved
Size: 4.29 MB
Format: Adobe PDF

Scheduling_Inference_Workloads_in_the_Computing_Continuum_with_Reinforcement_Learning.pdf
Access: restricted access
Type: 2a Post-print, publisher version / Version of Record
License: Non-public - Private/restricted access
Size: 840.46 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/3001617