Castellano, Gabriele; Nieto, Juan-José; Angi, Antonino; Àlvarez Terribas, Francisco; Luque, Jordi; Diego Andilla, Ferrán; Sacco, Alessio; Esposito, Flavio; Risso, Fulvio. "Scheduling Inference Workloads in the Computing Continuum with Reinforcement Learning." IEEE International Conference on Distributed Computing Systems (ICDCS), Glasgow (UK), 20-23 July 2025, pp. 297-302. DOI: 10.1109/ICDCSW63273.2025.00056.
Scheduling Inference Workloads in the Computing Continuum with Reinforcement Learning
Castellano, Gabriele; Angi, Antonino; Sacco, Alessio; Risso, Fulvio
2025
Abstract
As many recent real-time applications (e.g., Augmented/Virtual Reality, cognitive assistance) rely on Deep Neural Networks (DNNs) for inference tasks, edge computing has emerged as a key enabler for deploying such applications close to the data sources, helping meet stringent latency and throughput demands. However, the limited resources typically available at the edge create significant challenges for efficiently managing inference workloads; a trade-off between network and processing time must therefore be considered when meeting end-to-end delay requirements. In this paper, we focus on the problem of scheduling inference jobs of DNN models in such an edge-cloud continuum at short timescales (i.e., a few milliseconds). Through simulations, we analyze several policies under realistic network settings and workloads from a large ISP, highlighting the need for a dynamic scheduling policy that can adapt to varying network conditions and workload demands. To this end, we propose ASET, a Reinforcement Learning (RL)-based scheduling algorithm that dynamically adapts its decisions to the current system conditions. Our results show that ASET provides the best performance compared to a set of static policies when scheduling over a distributed pool of edge-cloud resources.
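The abstract does not detail ASET's internals. Purely as an illustration of the network-vs-processing trade-off it mentions, the sketch below shows a toy bandit-style scheduler that picks the worker (edge or cloud) with the lowest observed end-to-end delay and explores occasionally. All worker names, delay figures, and the epsilon-greedy strategy are assumptions for illustration only; they are not taken from the paper and do not represent the ASET algorithm.

```python
import random
from collections import defaultdict

# Hypothetical worker pool: name -> (network RTT ms, processing time ms).
# Numbers are illustrative, not from the paper.
WORKERS = {
    "edge-1":  (2.0, 18.0),   # close to the data source, slower accelerator
    "edge-2":  (3.0, 22.0),
    "cloud-1": (25.0, 8.0),   # far away, faster accelerator
}

class EpsilonGreedyScheduler:
    """Toy scheduler: exploit the worker with the lowest running-mean
    end-to-end delay, explore a random worker with probability epsilon."""

    def __init__(self, workers, epsilon=0.1):
        self.workers = list(workers)
        self.epsilon = epsilon
        self.avg_delay = defaultdict(float)  # running mean delay per worker
        self.count = defaultdict(int)        # observations per worker

    def choose(self):
        if random.random() < self.epsilon or not self.count:
            return random.choice(self.workers)
        # Unobserved workers default to +inf, so only known ones are exploited.
        return min(self.workers, key=lambda w: self.avg_delay.get(w, float("inf")))

    def update(self, worker, observed_delay):
        # Incremental mean of observed end-to-end delays.
        self.count[worker] += 1
        n = self.count[worker]
        self.avg_delay[worker] += (observed_delay - self.avg_delay[worker]) / n

def simulate_request(worker):
    """Simulated end-to-end delay: network + processing, plus jitter."""
    rtt, proc = WORKERS[worker]
    return rtt + proc + random.uniform(-2.0, 2.0)

if __name__ == "__main__":
    sched = EpsilonGreedyScheduler(WORKERS)
    for _ in range(1000):
        w = sched.choose()
        sched.update(w, simulate_request(w))
    print({w: round(d, 1) for w, d in sched.avg_delay.items()})
```

A full RL formulation, as described in the paper, would instead condition decisions on richer system state (queue lengths, network conditions, workload mix) rather than a single running average per worker.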
| File | Access | Type | License | Size | Format |
|---|---|---|---|---|---|
| 36422_Scheduling_Inference_Wor.pdf | Open access | 2. Post-print / Author's Accepted Manuscript | Public - All rights reserved | 4.29 MB | Adobe PDF |
| Scheduling_Inference_Workloads_in_the_Computing_Continuum_with_Reinforcement_Learning.pdf | Restricted access | 2a. Post-print editorial version / Version of Record | Non-public - Private/restricted access | 840.46 kB | Adobe PDF |
https://hdl.handle.net/11583/3001617
