Evaluating the Reliability of Supervised Compression for Split Computing / Guerrero-Balaguera, Juan-David; Rodriguez Condia, Josie E.; Levorato, Marco; Sonza Reorda, Matteo. - ELECTRONIC. - (2024), pp. 1-6. (Paper presented at the 2024 IEEE 42nd VLSI Test Symposium (VTS), held in Tempe (USA), 22-24 April 2024) [10.1109/vts60656.2024.10538938].

Evaluating the Reliability of Supervised Compression for Split Computing

Guerrero-Balaguera, Juan-David; Rodriguez Condia, Josie E.; Levorato, Marco; Sonza Reorda, Matteo
2024

Abstract

Recent advances in Internet-of-Things (IoT) and 5G infrastructures promote new computational paradigms, such as Split Computing (SC), for deploying Deep Neural Networks (DNNs) in mobile applications. In SC, a DNN is partitioned into head and tail sub-models that are executed on the mobile device and on cloud/edge servers, respectively. Modern SC models resort to head compression techniques to balance energy consumption, transmitted data volume, and model size while preserving the accuracy of large state-of-the-art DNNs. These features make SC DNNs suitable for mobile applications, including safety-critical systems (e.g., self-driving vehicles, autonomous robots, and healthcare equipment), where reliability is a paramount factor mandated by strict safety standards. Although many studies address the reliability of DNNs, the reliability of SC models is still unexplored, especially when hardware faults threaten the operation of the mobile device. In this work, we present for the first time i) an application-level fault injection strategy for modeling hardware faults on mobile GPUs executing SC DNNs and ii) an evaluation of the resilience of the supervised compression methods used by SC systems. Preliminary results gathered on representative benchmark networks and configurations show the feasibility and effectiveness of the approach. They also show that aggressive compression strategies lead to high accuracy degradation (≈ 40%), increasing the overall vulnerability of the DNN and the system.
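
As a reading aid only, the sketch below illustrates the two ideas the abstract combines: a model split into a head (on the device) and a tail (on the server), and a fault injected at the application level into the compressed data passed between them. It is not the authors' method or tooling. The names `head`, `tail`, `split_inference`, and `flip_bit`, the toy arithmetic, and the single-bit-flip fault model are all assumptions chosen for illustration.

```python
import struct


def flip_bit(value: float, bit: int) -> float:
    """Flip one bit of the float32 encoding of value (0 = LSB, 31 = sign)."""
    (word,) = struct.unpack("<I", struct.pack("<f", value))
    (faulty,) = struct.unpack("<f", struct.pack("<I", word ^ (1 << bit)))
    return faulty


def head(x):
    # Hypothetical head sub-model (device side): compresses four inputs
    # into a two-value bottleneck representation.
    return [x[0] + x[1], x[2] + x[3]]


def tail(z):
    # Hypothetical tail sub-model (server side): returns a class index.
    return 0 if z[0] >= z[1] else 1


def split_inference(x, fault=None):
    """Run head then tail; optionally corrupt one bottleneck value.

    fault is an assumed (index, bit) pair naming which bottleneck
    element and which of its 32 bits to flip before the tail runs.
    """
    z = head(x)
    if fault is not None:
        index, bit = fault
        z[index] = flip_bit(z[index], bit)
    return tail(z)


sample = [1.0, 2.0, 0.5, 0.5]
golden = split_inference(sample)                  # fault-free reference
masked = split_inference(sample, fault=(0, 0))    # low mantissa bit
critical = split_inference(sample, fault=(0, 31)) # sign bit
```

In this toy setting, flipping the least significant mantissa bit leaves the prediction unchanged, whereas flipping the sign bit changes it. A resilience evaluation of the kind the abstract describes would compare faulty outputs against a fault-free reference across many injections.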
ISBN: 979-8-3503-6378-4
Files in this item:
Evaluating_the_Reliability_of_Supervised_Compression_for_Split_Computing.pdf

Access: restricted
Type: 2a Post-print, publisher's version / Version of Record
License: Not public - private/restricted access
Size: 428.59 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2989150