Collaborative Inference (CI) optimizes the latency and energy consumption of deep learning inference through the inter-operation of edge and cloud devices. Albeit beneficial for other tasks, CI has never been applied to the sequence-to-sequence mapping problem at the heart of Neural Machine Translation (NMT). In this work, we address the specific issues of collaborative NMT, such as estimating the latency required to generate the (unknown) output sequence, and show how existing CI methods can be adapted to these applications. Our experiments show that CI can reduce the latency of NMT by up to 44% compared to a non-collaborative approach.

C-NMT: A Collaborative Inference Framework for Neural Machine Translation / Chen, Y.; Chiaro, R.; Macii, E.; Poncino, M.; Jahier Pagliari, D.. - ELETTRONICO. - (2022), pp. 1512-1516. (Intervento presentato al convegno 2022 IEEE International Symposium on Circuits and Systems, ISCAS 2022 tenutosi a Austin (USA) nel 27 May 2022 - 01 June 2022) [10.1109/ISCAS48785.2022.9937603].

C-NMT: A Collaborative Inference Framework for Neural Machine Translation

Chen Y.;Chiaro R.;MacIi E.;Poncino M.;Jahier Pagliari D.
2022

Abstract

Collaborative Inference (CI) optimizes the latency and energy consumption of deep learning inference through the inter-operation of edge and cloud devices. Albeit beneficial for other tasks, CI has never been applied to the sequence-to-sequence mapping problem at the heart of Neural Machine Translation (NMT). In this work, we address the specific issues of collaborative NMT, such as estimating the latency required to generate the (unknown) output sequence, and show how existing CI methods can be adapted to these applications. Our experiments show that CI can reduce the latency of NMT by up to 44% compared to a non-collaborative approach.
2022
978-1-6654-8485-5
File in questo prodotto:
File Dimensione Formato  
CRIME_NMT_New.pdf

accesso aperto

Tipologia: 2. Post-print / Author's Accepted Manuscript
Licenza: PUBBLICO - Tutti i diritti riservati
Dimensione 1.54 MB
Formato Adobe PDF
1.54 MB Adobe PDF Visualizza/Apri
Chen_et_al_2022_C-NMT.pdf

non disponibili

Tipologia: 2a Post-print versione editoriale / Version of Record
Licenza: Non Pubblico - Accesso privato/ristretto
Dimensione 611.59 kB
Formato Adobe PDF
611.59 kB Adobe PDF   Visualizza/Apri   Richiedi una copia
Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11583/2974501