The success of deep learning comes at the cost of very high computational complexity. Consequently, Internet of Things (IoT) edge nodes typically offload deep learning tasks to powerful cloud servers, an inherently inefficient solution: transmitting raw data to the cloud over wireless links incurs long latencies and high energy consumption. Moreover, pure cloud offloading does not scale, as it puts pressure on the network, and it raises security concerns related to the transmission of user data. The straightforward solution to these issues is to perform deep learning inference at the edge. However, cost- and power-constrained embedded processors, with their limited processing and memory capabilities, cannot handle complex deep learning models. Even with hardware acceleration, a common approach to coping with such complexity, embedded devices are still unable to directly run models designed for cloud servers. It then becomes necessary to employ appropriate optimization strategies to enable deep learning processing at the edge. In this chapter, we survey the most relevant optimizations for supporting embedded deep learning inference, focusing in particular on optimizations that favor hardware acceleration (such as quantization and big-little architectures). We divide our analysis into two parts. First, we review classic approaches based on static (design-time) optimizations. We then show how these solutions are often suboptimal, as they produce models that are either over-optimized for complex inputs (yielding accuracy losses) or under-optimized for simple inputs (losing energy-saving opportunities). Finally, we review the more recent trend of dynamic (input-dependent) optimizations, which solve this problem by adapting the optimization to the processed input.
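To make the two optimization families named in the abstract concrete, the Python sketches below illustrate them in minimal form. All names and settings (per-tensor symmetric scaling, the 0.9 confidence threshold, the toy random "models") are illustrative assumptions, not the specific methods surveyed in the chapter.

The first sketch shows post-training uniform quantization, a classic static optimization: weights are mapped once, at design time, to low-bitwidth integers.

    import numpy as np

    def quantize_symmetric(weights, num_bits=8):
        # Uniform symmetric per-tensor quantization (an assumed scheme):
        # map float weights to signed num_bits integers with one scale factor.
        qmax = 2 ** (num_bits - 1) - 1                 # e.g. 127 for 8 bits
        scale = np.abs(weights).max() / qmax           # single per-tensor scale
        q = np.clip(np.round(weights / scale), -qmax - 1, qmax).astype(np.int8)
        return q, scale

    def dequantize(q, scale):
        # Recover an approximation of the original floats.
        return q.astype(np.float32) * scale

    w = np.random.default_rng(0).standard_normal((64, 64)).astype(np.float32)
    q, s = quantize_symmetric(w)
    print("max abs quantization error:", np.abs(w - dequantize(q, s)).max())

The second sketch illustrates the big-little idea behind dynamic, input-dependent optimization: a cheap model handles easy inputs, and the expensive model runs only when the cheap model is not confident.

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    rng = np.random.default_rng(0)
    W_little = rng.standard_normal((10, 64)) * 0.1    # toy "small" classifier
    W_big = rng.standard_normal((10, 64)) * 0.1       # toy "large" classifier

    def big_little_inference(x, threshold=0.9):
        # Run the cheap model first; invoke the expensive one only when the
        # cheap model's top-class confidence falls below the threshold.
        probs = softmax(W_little @ x)
        if probs.max() >= threshold:
            return int(probs.argmax())                # easy input: little model suffices
        return int(softmax(W_big @ x).argmax())       # hard input: fall back to big model

    x = rng.standard_normal(64)
    print("predicted class:", big_little_inference(x))

In such a scheme the threshold trades accuracy for energy: raising it routes more inputs to the big model, lowering it saves energy at the risk of misclassifying harder inputs.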
Energy-efficient deep learning inference on edge devices / Daghero, F.; Jahier Pagliari, D.; Poncino, M. - ELECTRONIC. - (In press).
Title: | Energy-efficient deep learning inference on edge devices |
Authors: | Daghero, F.; Jahier Pagliari, D.; Poncino, M. |
Publication date: | In press |
Book title: | Advances in Computers |
Series: | |
Appears in the categories: | 2.1 Contribution in a volume (Chapter or Essay) |
http://hdl.handle.net/11583/2851984