General Purpose Graphics Processing Units (GPGPUs) have been extensively used in the last decade as accelerators in highly demanding applications, such as multimedia processing and high-performance computing. Nowadays, these devices are becoming popular even in safety-critical applications, such as autonomous and semi-autonomous vehicles. However, these devices can suffer from transient faults, such as those produced by radiation. Among the resulting effects, Single Event Upsets (SEUs), which are the focus of this paper, can cause application misbehavior that may lead to catastrophic consequences. In this work, we first describe how we extended the capabilities of an open-source VHDL GPGPU model (FlexGrip) and developed a new version, named FlexGripPlus, to study and analyze the effects of SEUs in a GPGPU in much greater detail. We also performed extensive fault injection campaigns using FlexGripPlus, which allowed us to identify the most critical effects within the GPGPU architecture. Finally, we focused on the scheduler controller, since it is a module specific to the GPGPU architecture, and showed that its sensitivity to SEUs varies with the affected location. Moreover, we present the results of additional analyses that vary the number of parallel execution units in the system, demonstrating a correlation between the number of execution units in a GPGPU and system reliability.
FlexGripPlus: An improved GPGPU model to support reliability analysis / Rodriguez Condia, Josie E.; Du, Boyang; Sonza Reorda, Matteo; Sterpone, Luca. - In: MICROELECTRONICS RELIABILITY. - ISSN 0026-2714. - 109:(2020), pp. 1-14. [10.1016/j.microrel.2020.113660]
FlexGripPlus: An improved GPGPU model to support reliability analysis
Rodriguez Condia, Josie E.;Du, Boyang;Sonza Reorda, Matteo;Sterpone, Luca
2020
File | Size | Format |
---|---|---|
journal-version-V20.pdf (open access). Description: Pre-print version of manuscript (without format of the Journal). Type: 1. Preprint / submitted version [pre-review]. License: Public - All rights reserved | 1.47 MB | Adobe PDF |
1-s2.0-S0026271419307978-main.pdf (open access). Type: 2a. Post-print, publisher's version / Version of Record. License: Creative Commons | 1.91 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/2820716