Singhal, Chetna; Wu, Yashuo; Malandrino, Francesco; Levorato, Marco; Chiasserini, Carla Fabiana. "Distributing Inference Tasks over Interconnected Systems through Dynamic DNNs." In: IEEE/ACM Transactions on Networking (2025). ISSN 1063-6692. DOI: 10.1109/TON.2025.3543848
Distributing Inference Tasks over Interconnected Systems through Dynamic DNNs
Carla Fabiana Chiasserini
2025
Abstract
An increasing number of mobile applications leverage deep neural networks (DNNs) as an essential component to adapt to the operational context at hand and provide users with an enhanced experience. It is thus of paramount importance that network systems support the execution of DNN inference tasks in an efficient and sustainable way. Matching the diverse resources available at the mobile-edge-cloud network tiers with the applications' requirements and complexity, while minimizing energy consumption, is however challenging. A possible approach to the problem consists in exploiting the emerging concept of dynamic DNNs, characterized by multi-branched architectures with early exits that enable sample-based adaptation of the model depth. We leverage this concept and address the problem of deploying portions of DNNs with early exits across the mobile-edge-cloud system and allocating therein the necessary network, computing, and memory resources. We do so by developing a 3-stage graph-modeling method that allows us to represent the characteristics of the system and the applications, as well as the possible options for splitting the DNN over the multi-tier network nodes. Our solution, called Feasible Inference Graph (FIN), determines the DNN split, deployment, and resource allocation that minimize the inference energy consumption while satisfying the nodes' constraints and the requirements of multiple, co-existing applications. FIN closely matches the optimum and yields over 89% energy savings with respect to state-of-the-art alternatives.
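To give an intuition of the graph-based view the abstract describes, the minimal sketch below casts a simplified version of the problem (splitting a sequential DNN across mobile-edge-cloud tiers to minimize energy) as a shortest-path search. All tier names, energy figures, and the state-space construction are hypothetical assumptions for illustration; this is not the paper's FIN method, which uses a richer 3-stage graph model with early exits, memory constraints, and multiple co-existing applications.

```python
# Illustrative sketch only: min-energy split of a sequential DNN over a
# 3-tier system, modeled as Dijkstra over states (next block to run, tier).
# All parameters below are assumed placeholders, NOT values from the paper.
import heapq

TIERS = ["mobile", "edge", "cloud"]                              # hypothetical tiers
COMPUTE_ENERGY = {"mobile": 5.0, "edge": 2.0, "cloud": 1.0}      # J per block (assumed)
TRANSFER_ENERGY = {("mobile", "edge"): 1.5, ("edge", "cloud"): 1.0}  # J per hop (assumed)
NUM_BLOCKS = 6                                                   # DNN cut into 6 blocks


def min_energy_split(num_blocks=NUM_BLOCKS):
    """Dijkstra over states (b, t): b blocks already executed, current tier t.
    From each state we either run the next block on tier t (compute energy)
    or offload the intermediate features one tier up (transfer energy)."""
    start = (0, 0)
    dist = {start: 0.0}
    parent = {}                       # state -> (previous state, action taken)
    pq = [(0.0, start)]
    while pq:
        energy, (b, t) = heapq.heappop(pq)
        if b == num_blocks:           # all blocks placed: backtrack the plan
            plan, s = [], (b, t)
            while s in parent:
                s, action = parent[s]
                plan.append(action)
            return energy, list(reversed(plan))
        if energy > dist.get((b, t), float("inf")):
            continue                  # stale queue entry
        # Option 1: execute the next block on the current tier.
        nxt, ne = (b + 1, t), energy + COMPUTE_ENERGY[TIERS[t]]
        if ne < dist.get(nxt, float("inf")):
            dist[nxt] = ne
            parent[nxt] = ((b, t), f"run block {b} on {TIERS[t]}")
            heapq.heappush(pq, (ne, nxt))
        # Option 2: offload intermediate features one tier up.
        if t + 1 < len(TIERS):
            nxt, ne = (b, t + 1), energy + TRANSFER_ENERGY[(TIERS[t], TIERS[t + 1])]
            if ne < dist.get(nxt, float("inf")):
                dist[nxt] = ne
                parent[nxt] = ((b, t), f"offload to {TIERS[t + 1]}")
                heapq.heappush(pq, (ne, nxt))
    return float("inf"), []


if __name__ == "__main__":
    total, plan = min_energy_split()
    print(f"total energy: {total} J")   # 8.5 J with the assumed numbers above
    for step in plan:
        print(" -", step)
```

With the assumed numbers, the search offloads immediately to the cloud (1.5 J + 1.0 J of transfers) and runs all six blocks there (6 J), beating all-on-mobile (30 J); changing the per-tier costs shifts the optimal cut point accordingly. FIN additionally accounts for early exits, per-node resource budgets, and contention among applications, which this toy state space omits.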
File: Orchestration_federated_split_early_exit_ToN-2.pdf
Access: Open access
Type: 1. Preprint / submitted version [pre-review]
License: Public - All rights reserved
Size: 4.45 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/2997572