Singhal, C.; Wu, Y.; Malandrino, F.; Levorato, M.; Chiasserini, C. F. (2024). "Resource-aware Deployment of Dynamic DNNs over Multi-tiered Interconnected Systems." Paper presented at the IEEE Annual Joint Conference: INFOCOM, IEEE Computer and Communications Societies, Vancouver, Canada, 20-23 May 2024. DOI: 10.1109/INFOCOM52122.2024.10621218.
Resource-aware Deployment of Dynamic DNNs over Multi-tiered Interconnected Systems
C. Singhal, Y. Wu, F. Malandrino, M. Levorato, C. F. Chiasserini
2024
Abstract
The increasing pervasiveness of intelligent mobile applications requires exploiting the full range of resources offered by the mobile-edge-cloud network for the execution of inference tasks. However, due to the heterogeneity of such multi-tiered networks, it is essential to adapt the applications' demands to the available resources while minimizing energy consumption. Modern dynamic deep neural networks (DNNs) achieve this goal through multi-branched architectures in which early exits enable sample-based adaptation of the model depth. In this paper, we tackle the problem of allocating sections of DNNs with early exits to the nodes of the mobile-edge-cloud system. Through a 3-stage graph-modeling approach, we represent the possible options for splitting the DNN and deploying the DNN blocks on the multi-tiered network, embedding both the system constraints and the application requirements in a convenient and efficient way. Our framework, named Feasible Inference Graph (FIN), identifies the solution that minimizes the overall inference energy consumption while enabling distributed inference over the multi-tiered network with the target quality and latency. Our results, obtained for DNNs of different complexity, show that FIN matches the optimum and yields over 65% energy savings relative to a state-of-the-art cost-minimization technique.
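As an illustration of the graph-based idea the abstract outlines, the sketch below pairs each DNN split point with the network tier holding the intermediate data, weights the edges with compute and transmission costs, prunes placements that exceed a latency budget, and returns the minimum-energy feasible placement. This is a minimal sketch under assumed parameters: the tier names, per-block FLOP counts, energy/latency figures, and the function `min_energy_placement` are illustrative, not the paper's actual FIN construction, and early exits and quality targets are omitted for brevity.

```python
# Minimal sketch of the layered-graph idea: states pair the next DNN block
# to run with the tier currently holding the data; edges are "compute the
# block here" or "ship the data one tier up". We search for the
# minimum-energy placement meeting a latency budget. All constants are
# made-up illustrative values, not figures from the paper.
import heapq

# Assumed per-tier compute cost: (energy J/GFLOP, latency ms/GFLOP)
COMPUTE = {"mobile": (0.50, 8.0), "edge": (0.20, 2.0), "cloud": (0.05, 0.5)}
# Assumed uplink cost between tiers: (energy J/MB, latency ms/MB)
LINK = {("mobile", "edge"): (0.10, 4.0),
        ("mobile", "cloud"): (0.12, 10.0),
        ("edge", "cloud"): (0.02, 6.0)}
# Toy DNN split into blocks: (GFLOPs of the block, MB of its output tensor)
BLOCKS = [(1.0, 2.0), (2.0, 1.0), (4.0, 0.5)]
INPUT_MB = 1.0  # assumed size of the raw input sample

def min_energy_placement(blocks, latency_budget_ms):
    """Energy-ordered search with Pareto labels over (block index, tier)."""
    heap = [(0.0, 0.0, 0, "mobile")]  # (energy, latency, next block, tier)
    labels = {}  # (block, tier) -> non-dominated (energy, latency) labels
    while heap:
        energy, latency, i, tier = heapq.heappop(heap)
        if latency > latency_budget_ms:
            continue  # infeasible partial placement
        if i == len(blocks):
            return energy, latency  # first feasible completion = min energy
        key = (i, tier)
        if any(e <= energy and l <= latency for e, l in labels.get(key, [])):
            continue  # dominated by an earlier label at this state
        labels.setdefault(key, []).append((energy, latency))
        gflops, _ = blocks[i]
        # Option 1: run block i on the current tier.
        ce, cl = COMPUTE[tier]
        heapq.heappush(
            heap, (energy + ce * gflops, latency + cl * gflops, i + 1, tier))
        # Option 2: ship the pending data one hop up, then decide again.
        data_mb = blocks[i - 1][1] if i > 0 else INPUT_MB
        for (src, dst), (le, ll) in LINK.items():
            if src == tier:
                heapq.heappush(
                    heap,
                    (energy + le * data_mb, latency + ll * data_mb, i, dst))
    return None  # no placement meets the latency budget

print(min_energy_placement(BLOCKS, latency_budget_ms=60.0))
```

With these toy numbers the search ships the input to the cloud and runs all blocks there (0.47 J, 13.5 ms); tightening the latency budget or raising link costs shifts blocks toward the mobile and edge tiers, which is the trade-off the FIN framework navigates.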
| File | Access | Type | License | Size | Format |
|---|---|---|---|---|---|
| Infocom_2024_split.pdf | Open access | 2. Post-print / Author's Accepted Manuscript | Public - All rights reserved | 4.29 MB | Adobe PDF |
| Chiasserini-Resource-Aware.pdf | Restricted access | 2a. Post-print, publisher's version / Version of Record | Non-public - Private/restricted access | 5.19 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/2984337