GPD: Predictive Control Flow Error Detection Leveraging Data Flow Error Detection Methods / Raghunandana K, K; Yogesh Prasad, K R; Sonza Reorda, M.; Virendra, Singh. - (2025), pp. 1-5. (Paper presented at the 31st IEEE International Symposium on On-Line Testing and Robust System Design, IOLTS 2025, held in Ischia (ITA), 07-09 July 2025) [10.1109/iolts65288.2025.11116971].
GPD: Predictive Control Flow Error Detection Leveraging Data Flow Error Detection Methods
M. Sonza Reorda;
2025
Abstract
Soft errors in General Purpose Graphics Processing Units (GPGPUs) result in data or control flow errors. Error detection and correction methods for data flow errors and for control flow errors are orthogonal, and each incurs its own area, power, and performance overheads. This paper proposes a low-overhead predictive control flow error detection method called GPGPU Predictive Detector (GPD), which leverages data flow error detection and correction methods to detect control flow errors. GPD is non-intrusive to application software and transparent to users. GPD builds on two earlier data flow error detection and correction methods, DDSR and TREFU. In GPD, the combined DDSR and TREFU architecture protects all non-control-flow instructions. A control flow error is detected by calculating in advance the address of the instruction that succeeds a control flow instruction and comparing it with the address that is actually accessed. The effectiveness of GPD is demonstrated on a set of ISPASS-2009 and RODINIA benchmarks. Relative to a non-fault-tolerant GPGPU architecture, GPD has a performance overhead of 5% and average and peak power overheads of 4% and 3%, respectively. We prove by induction that GPD provides fault coverage against GPGPU control flow and data flow errors.

| File | Type | License | Size | Format |
|---|---|---|---|---|
| GPD_Predictive_Control_Flow_Error_Detection_Leveraging_Data_Flow_Error_Detection_Methods.pdf (restricted access) | 2a Post-print, publisher's version / Version of Record | Not public: private/restricted access | 506.65 kB | Adobe PDF |
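
The detection principle stated in the abstract (compute the successor address of a control flow instruction in advance, then compare it with the address actually accessed) can be illustrated with a minimal software sketch. This is not the GPD hardware design: the `Branch` record, the fixed 8-byte instruction size, and the helpers `predict_next_pc`, `check_control_flow`, and `flip_bit` are all hypothetical names introduced here for illustration. The sketch also assumes that the branch outcome and target are themselves trustworthy, which is outside what this toy model checks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Branch:
    """Hypothetical conditional branch used only for this illustration."""
    pc: int        # address of the branch instruction itself
    target: int    # address executed next if the branch is taken
    size: int = 8  # assumed instruction size in bytes (illustrative)


def predict_next_pc(branch: Branch, taken: bool) -> int:
    """Compute, ahead of the fetch, the address that should follow the branch."""
    return branch.target if taken else branch.pc + branch.size


def check_control_flow(branch: Branch, taken: bool, actual_next_pc: int) -> bool:
    """Return True when the accessed address differs from the predicted one."""
    return predict_next_pc(branch, taken) != actual_next_pc


def flip_bit(value: int, bit: int) -> int:
    """Model a soft error as a single bit flip in an address."""
    return value ^ (1 << bit)


if __name__ == "__main__":
    br = Branch(pc=0x100, target=0x180)
    # Fault-free execution: the accessed address matches the prediction.
    print(check_control_flow(br, taken=True, actual_next_pc=0x180))
    # Injected fault: one flipped bit in the next address is flagged.
    print(check_control_flow(br, taken=True, actual_next_pc=flip_bit(0x180, 4)))
```

In this toy model, any single bit flip in the next address is flagged, because the corrupted value can no longer equal the predicted one. Errors in the inputs to the prediction itself are not covered by the sketch.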
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/3004367
