Sequence-To-Sequence Neural Networks Inference on Embedded Processors Using Dynamic Beam Search / Jahier Pagliari, Daniele; Daghero, Francesco; Poncino, Massimo. - In: ELECTRONICS. - ISSN 2079-9292. - Electronic. - 9:2 (2020), pp. 1-21. [10.3390/electronics9020337]
Sequence-To-Sequence Neural Networks Inference on Embedded Processors Using Dynamic Beam Search
Jahier Pagliari, Daniele; Daghero, Francesco; Poncino, Massimo
2020
Abstract
Sequence-to-sequence deep neural networks have become the state of the art for a variety of machine learning applications, ranging from neural machine translation (NMT) to speech recognition. Many mobile and Internet of Things (IoT) applications would benefit from the ability to perform sequence-to-sequence inference directly on embedded devices, thereby reducing the amount of raw data transmitted to the cloud and obtaining benefits in terms of response latency, energy consumption and security. However, due to the high computational complexity of these models, specific optimization techniques are needed to achieve acceptable performance and energy consumption on single-core embedded processors. In this paper, we present a new optimization technique called dynamic beam search, in which the inference complexity is tuned to the difficulty of the processed input sequence at runtime. Results based on measurements on a real embedded device, and on three state-of-the-art deep learning models, show that our method is able to reduce the inference time and energy by up to 25% without loss of accuracy.
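The record itself does not detail the method; as a rough illustration of the idea described in the abstract, the following is a minimal, self-contained Python sketch of beam-search decoding in which the beam width is chosen per input from a confidence heuristic (here, the top log-probability of the first decoding step). The decoder stub, the confidence test and the threshold are assumptions made for illustration only, not the paper's actual difficulty metric or policy.

```python
import math

# Toy vocabulary: token 0 is <eos>, tokens 1..4 are ordinary symbols.
VOCAB_SIZE = 5
EOS = 0

def decoder_step(prefix, encoded):
    """Stub standing in for one decoder step of a seq2seq model.
    Returns log-probabilities over the vocabulary; a deterministic toy
    function of the prefix and a toy input id, so the sketch runs
    without a trained network."""
    raw = [((encoded + 3 * len(prefix) + 7 * t) % 11) / 3.0 for t in range(VOCAB_SIZE)]
    norm = math.log(sum(math.exp(r) for r in raw))
    return [r - norm for r in raw]

def beam_search(encoded, beam_width, max_len=10):
    """Standard beam search: keep the `beam_width` best partial
    hypotheses (token prefix, cumulative log-probability) at each step."""
    beams = [([], 0.0)]
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            if prefix and prefix[-1] == EOS:
                candidates.append((prefix, score))  # finished hypothesis, keep as-is
                continue
            log_probs = decoder_step(prefix, encoded)
            for tok, lp in enumerate(log_probs):
                candidates.append((prefix + [tok], score + lp))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
        if all(p and p[-1] == EOS for p, _ in beams):
            break
    return beams[0]

def dynamic_beam_search(encoded, small_beam=1, large_beam=4, conf_threshold=-0.7):
    """Hypothetical dynamic policy: peek at the first decoding step and use
    the top token's log-probability as a confidence score. Confident ("easy")
    inputs are decoded with a narrow beam, uncertain ones with a wide beam.
    The threshold is an illustrative value, not taken from the paper."""
    confidence = max(decoder_step([], encoded))
    width = small_beam if confidence > conf_threshold else large_beam
    return beam_search(encoded, beam_width=width), width

if __name__ == "__main__":
    for sentence_id in (7, 2):  # stand-ins for two encoded input sequences
        (tokens, score), width = dynamic_beam_search(sentence_id)
        print(f"input {sentence_id}: beam width {width}, output {tokens}, log-prob {score:.2f}")
```

The intended point of the sketch is only the control flow: the expensive part (beam width, hence the number of decoder evaluations per step) is decided at runtime from a cheap per-input difficulty estimate, so easy inputs cost roughly as much as greedy decoding while hard ones keep a wide beam.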
| File | Description | Type | License | Size | Format | Access |
|---|---|---|---|---|---|---|
| electronics-09-00337-v2(1).pdf | Main article | 2a Post-print editorial version / Version of Record | Creative Commons | 6.13 MB | Adobe PDF | Open access |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/2796527