Evaluating the Reliability of Supervised Compression for Split Computing / Guerrero-Balaguera, Juan-David; Rodriguez Condia, Josie E.; Levorato, Marco; Sonza Reorda, Matteo. - ELECTRONIC. - (2024), pp. 1-6. (Paper presented at the 2024 IEEE 42nd VLSI Test Symposium (VTS), held in Tempe (USA), 22-24 April 2024) [10.1109/vts60656.2024.10538938].
Evaluating the Reliability of Supervised Compression for Split Computing
Guerrero-Balaguera, Juan-David; Rodriguez Condia, Josie E.; Levorato, Marco; Sonza Reorda, Matteo
2024
Abstract
Recent advances in Internet-of-Things (IoT) and 5G infrastructures promote new computational paradigms such as Split Computing (SC) for deploying Deep Neural Networks (DNNs) in mobile applications. In SC, a DNN is partitioned into head and tail sub-models that are executed on the mobile device and on cloud/edge servers, respectively. Modern SC models resort to head compression techniques to balance energy consumption, the amount of transmitted data, and model size while preserving the outstanding accuracy of large state-of-the-art DNNs. These features make SC DNNs suitable for mobile applications, including safety-critical systems (e.g., self-driving vehicles, autonomous robots, and healthcare equipment), where reliability is a paramount factor mandated by strict safety standards. Although many studies address the reliability of DNNs, SC models remain unexplored, especially when hardware faults threaten the operation of the mobile device. In this work, we present for the first time i) an application-level fault injection strategy for modeling hardware faults on mobile GPUs executing SC DNNs and ii) an evaluation of the resilience of the supervised compression methods used by SC systems. The preliminary results gathered on some representative benchmark networks and configurations show the feasibility and effectiveness of the approach. They also demonstrate that aggressive compression strategies lead to high accuracy degradation (≈ 40%), increasing the overall vulnerability of the DNN and the system.

File | Size | Format |
---|---|---|
Evaluating_the_Reliability_of_Supervised_Compression_for_Split_Computing.pdf (restricted access; Type: 2a Post-print, publisher's version / Version of Record; License: Not public - Private/restricted access) | 428.59 kB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/2989150