Extreme-scale computer systems take advantage of large arrays of general-purpose multicore processors coupled with specialized manycore accelerators. To support complex applications and correctly feed such processing elements, increasingly large memory cores are integrated at different levels of the hierarchy. However, the adoption of increasingly aggressive manufacturing processes makes the memory subsystem particularly sensitive to faults. Error correcting codes (ECCs) allow the memory to recover from faults at run-time without interfering with application execution. However, because performance is lost every time an error must be corrected, persistent faults require a more radical repair approach in which faulty cells are physically replaced by spare ones. Memory redundancy analysis (MRA) algorithms drive the allocation of these spare resources. Many one-dimensional and two-dimensional MRAs have been proposed, but tools for evaluating their repair capability are still not well established. This paper presents SIERRA, a simulation environment for precisely evaluating the repair efficiency of an MRA under different fault signatures and faulty memory configurations. The simulation engine estimates MRA quality by analyzing the behavior of the MRA on several faulty memory configurations, taking into account parameters such as the area of the memory blocks and the defect density. The quality of an MRA is evaluated in terms of its repair capability, the power consumed by its execution, and its area overhead. Because simulation data are stored in a database, the tool can speed up the simulation process by distributing it among several nodes. All these features make SIERRA essential in supporting the design of next-generation high-performance computers.
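As a rough illustration of the kind of evaluation described above, the following minimal sketch estimates the repair rate of a toy one-dimensional MRA that can only replace whole rows with spare rows. It is not SIERRA's implementation or interface: the function names, the Poisson fault model (mean fault count equal to area times defect density), and the uniform placement of faults are all assumptions made for this example.

```python
import math
import random


def sample_fault_count(mean, rng):
    """Draw a Poisson-distributed fault count (Knuth's method, fine for small means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def row_spare_repairable(faults, spare_rows):
    """Toy 1D MRA (assumed): repairable if the distinct faulty rows fit in the spare rows."""
    faulty_rows = {row for row, _col in faults}
    return len(faulty_rows) <= spare_rows


def estimate_repair_rate(rows, cols, area_mm2, defects_per_mm2, spare_rows,
                         trials=10_000, seed=0):
    """Monte Carlo estimate of the fraction of faulty memories that can be repaired."""
    rng = random.Random(seed)
    mean_faults = area_mm2 * defects_per_mm2  # assumed fault model
    faulty = repaired = 0
    for _ in range(trials):
        n = sample_fault_count(mean_faults, rng)
        if n == 0:
            continue  # fault-free memories need no repair and are not counted
        # Place each fault uniformly at random in the cell array (assumption).
        faults = [(rng.randrange(rows), rng.randrange(cols)) for _ in range(n)]
        faulty += 1
        repaired += row_spare_repairable(faults, spare_rows)
    return repaired / faulty if faulty else 1.0


if __name__ == "__main__":
    for spares in (0, 1, 2, 4):
        rate = estimate_repair_rate(rows=64, cols=64, area_mm2=1.0,
                                    defects_per_mm2=2.0, spare_rows=spares)
        print(f"spare rows = {spares}: estimated repair rate = {rate:.3f}")
```

Because every run with the same seed sees the same sampled fault maps, sweeping `spare_rows` compares redundancy budgets on identical faulty memory configurations. A fuller evaluation would also account for power consumption and area overhead, which this sketch does not model.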
|Title:||SIERRA—Simulation environment for memory redundancy algorithms|
|Publication date:||2016|
|Digital Object Identifier (DOI):||10.1016/j.simpat.2016.08.008|
|Appears in types:||1.1 Journal article|