Compressing and Fine-tuning DNNs for Efficient Inference in Mobile Device-Edge Continuum

Carla Fabiana Chiasserini
2024

Abstract

Pruning deep neural networks (DNNs) is a well-known technique that allows for a significant reduction in inference cost. However, it may severely degrade the accuracy achieved by the model unless the model is properly fine-tuned, which may, in turn, increase computational cost and latency. Thus, when deploying a DNN in resource-constrained edge environments, it is critical to find the best trade-off between accuracy (hence, model complexity) on the one hand, and latency and energy consumption on the other. In this work, we explore the different options for deploying a machine learning pipeline, encompassing pruning, fine-tuning, and inference, across a mobile device requesting inference tasks and an edge server, while accounting for privacy constraints on the data used for fine-tuning. Our experimental analysis provides insights into how to efficiently allocate the pipeline tasks between the network edge and the mobile device in terms of energy and network costs, as the target inference latency and accuracy vary. In particular, our results highlight that the higher the edge server load and the number of inference requests, the more convenient it becomes to deploy the entire pipeline at the mobile device using a pruned model, with a cost reduction of up to a factor of two compared to deploying the whole pipeline at the edge.
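
Since the abstract describes a pipeline of pruning, fine-tuning, and inference, the following is a minimal PyTorch sketch of those three stages. It is not the paper's code: the backbone (ResNet-18), the pruning amount, and the training hyper-parameters are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.utils.prune as prune
from torchvision import models

# Assumed example backbone; the model studied in the paper may differ.
model = models.resnet18(weights="IMAGENET1K_V1")

# Stage 1 - Pruning: remove 30% of the output channels of each conv layer
# (L2-norm structured pruning), which reduces inference cost.
for module in model.modules():
    if isinstance(module, nn.Conv2d):
        prune.ln_structured(module, name="weight", amount=0.3, n=2, dim=0)

# Stage 2 - Fine-tuning: a few epochs on (possibly privacy-constrained,
# on-device) data to recover the accuracy lost through pruning.
def fine_tune(model, loader, epochs=2, lr=1e-4):
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            optimizer.zero_grad()
            loss_fn(model(x), y).backward()
            optimizer.step()

# fine_tune(model, train_loader)  # train_loader: application-specific, assumed

# Make the pruning masks permanent before deployment.
for module in model.modules():
    if isinstance(module, nn.Conv2d) and prune.is_pruned(module):
        prune.remove(module, "weight")

# Stage 3 - Inference: run on the mobile device or the edge server,
# depending on how the pipeline tasks are allocated.
def infer(model, x):
    model.eval()
    with torch.no_grad():
        return model(x).argmax(dim=1)

Depending on the deployment option, the pruning and fine-tuning stages can run at the edge server and only inference on the device, or the entire pipeline can run on the mobile device, which is the configuration the abstract reports as most cost-effective under high edge load.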
Files in this record:
ADROIT6G-2.pdf — Adobe PDF, 814.43 kB, open access
Type: Preprint / submitted version (pre-review)
License: Public - All rights reserved

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2988279