Collaborative Inference (CI) optimizes the latency and energy consumption of deep learning inference through the inter-operation of edge and cloud devices. Albeit beneficial for other tasks, CI has never been applied to the sequence-to-sequence mapping problem at the heart of Neural Machine Translation (NMT). In this work, we address the specific issues of collaborative NMT, such as estimating the latency required to generate the (unknown) output sequence, and show how existing CI methods can be adapted to these applications. Our experiments show that CI can reduce the latency of NMT by up to 44% compared to a non-collaborative approach.
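The abstract names the core difficulty: the output length is unknown before translation, so latency must be estimated before choosing where to run inference. The sketch below is a hypothetical illustration of that decision, not the C-NMT method itself. The linear output-length guess, the per-token cost model, and every name in it (`Device`, `estimate_output_len`, `choose_device`) are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Device:
    """Hypothetical latency profile of one inference device."""
    name: str
    encode_ms_per_token: float  # assumed encoder cost per input token
    decode_ms_per_token: float  # assumed decoder cost per generated token


def estimate_output_len(input_len: int, ratio: float = 1.1, offset: float = 1.0) -> float:
    """Guess the (unknown) output length from the input length.

    The linear form and its coefficients are illustrative assumptions.
    """
    return ratio * input_len + offset


def estimate_latency_ms(dev: Device, input_len: int, transfer_ms: float = 0.0) -> float:
    """Estimated latency: network transfer + encoding + token-by-token decoding."""
    out_len = estimate_output_len(input_len)
    return (
        transfer_ms
        + dev.encode_ms_per_token * input_len
        + dev.decode_ms_per_token * out_len
    )


def choose_device(input_len: int, edge: Device, cloud: Device, round_trip_ms: float) -> Device:
    """Pick the device with the lower estimated latency for this input."""
    edge_ms = estimate_latency_ms(edge, input_len)
    cloud_ms = estimate_latency_ms(cloud, input_len, transfer_ms=round_trip_ms)
    return edge if edge_ms <= cloud_ms else cloud
```

Under this toy model, short sentences tend to stay on the edge device because the network round trip dominates, while long sentences favor the faster cloud device.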
C-NMT: A Collaborative Inference Framework for Neural Machine Translation / Chen, Y.; Chiaro, R.; Macii, E.; Poncino, M.; Jahier Pagliari, D. - ELECTRONIC. - (2022), pp. 1512-1516. (Paper presented at the 2022 IEEE International Symposium on Circuits and Systems, ISCAS 2022, held in Austin (USA), 27 May 2022 - 01 June 2022) [10.1109/ISCAS48785.2022.9937603].
C-NMT: A Collaborative Inference Framework for Neural Machine Translation
Chen Y.; Chiaro R.; Macii E.; Poncino M.; Jahier Pagliari D.
2022
File | Access | Type | License | Size | Format
---|---|---|---|---|---
CRIME_NMT_New.pdf | Open access | 2. Post-print / Author's Accepted Manuscript | Public - All rights reserved | 1.54 MB | Adobe PDF
Chen_et_al_2022_C-NMT.pdf | Not available | 2a. Post-print publisher's version / Version of Record | Not public - Private/restricted access | 611.59 kB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/2974501