Anarado, I; Anam, MA; Verdicchio, F; Andreopoulos, I; (2016) Mitigating Silent Data Corruptions In Integer Matrix Products: Toward Reliable Multimedia Computing On Unreliable Hardware. IEEE Transactions on Circuits and Systems for Video Technology, PP (99). 10.1109/TCSVT.2016.2589622. (In press).
Text: Final_version_TCSVT_corrected_for_rps.pdf - Accepted Version (880kB)
Abstract
The generic matrix multiply (GEMM) routine comprises the compute- and memory-intensive part of many information retrieval, machine learning and object recognition systems that process integer inputs. Therefore, it is of paramount importance to ensure that integer GEMM computations remain robust to silent data corruptions (SDCs), which stem from accidental voltage or frequency overscaling, or other hardware non-idealities. In this paper, we introduce a new method for SDC mitigation based on the concept of numerical packing. The key difference between our approach and all existing methods is the production of redundant results within the numerical representation of the outputs, rather than as a separate set of checksums. Importantly, unlike well-known algorithm-based fault tolerance (ABFT) approaches for GEMM, the proposed approach can reliably detect the locations of the vast majority of all possible SDCs in the results of GEMM computations. An experimental investigation of voltage-scaled integer GEMM computations for visual descriptor matching within state-of-the-art image and video retrieval algorithms running on an Intel i7-4578U 3GHz processor shows that SDC mitigation based on numerical packing leads to comparable or lower execution and energy-consumption overhead in comparison to all other alternatives.
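The idea of carrying redundancy inside the numerical representation of each output, rather than in separate checksums, can be illustrated with a deliberately simplified sketch. It is not the scheme from the paper: it only duplicates each entry of one input matrix into the upper bits of a wider word, so that every output word holds two copies of the same dot product, and a mismatch between the copies flags the location of a corrupted entry. The bit width `K`, the function names, and the `fault` parameter used to simulate a bit flip are all assumptions made for illustration, and the inputs are assumed to be non-negative integers whose dot products stay below `2**K`.

```python
K = 24                # bits reserved per copy (assumption: every dot product < 2**K)
MASK = (1 << K) - 1   # selects the low copy of a packed word


def pack(a):
    """Place two copies of a non-negative integer in one wider word."""
    return a | (a << K)


def packed_matmul(A, B, fault=None):
    """Multiply non-negative integer matrices A (m x n) and B (n x p).

    Entries of A are packed, so each output word carries the dot product twice.
    `fault` is a hypothetical (i, j, xor_mask) triple that simulates a silent
    bit flip in output word (i, j).
    """
    m, n, p = len(A), len(B), len(B[0])
    C = [[0] * p for _ in range(m)]
    for i in range(m):
        for j in range(p):
            acc = 0
            for k in range(n):
                acc += pack(A[i][k]) * B[k][j]
            C[i][j] = acc
    if fault is not None:
        i, j, xor_mask = fault
        C[i][j] ^= xor_mask
    return C


def unpack_and_check(C):
    """Return the unpacked results and the (i, j) locations whose copies disagree."""
    result, suspects = [], []
    for i, row in enumerate(C):
        out = []
        for j, word in enumerate(row):
            low, high = word & MASK, word >> K
            if low != high:
                suspects.append((i, j))
            out.append(low)
        result.append(out)
    return result, suspects
```

Calling `unpack_and_check(packed_matmul(A, B))` on fault-free inputs returns the ordinary product and an empty list of suspects, while passing `fault=(0, 1, 1 << 3)` reports location `(0, 1)`. A corruption that alters both copies identically would escape this naive check, and the duplication halves the usable dynamic range of each word, which is one reason a practical scheme needs a more careful packing design.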
Type: | Article |
---|---|
Title: | Mitigating Silent Data Corruptions In Integer Matrix Products: Toward Reliable Multimedia Computing On Unreliable Hardware |
Open access status: | An open access version is available from UCL Discovery |
DOI: | 10.1109/TCSVT.2016.2589622 |
Publisher version: | http://dx.doi.org/10.1109/TCSVT.2016.2589622 |
Language: | English |
Additional information: | Copyright © 2016 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending a request to pubs-permissions@ieee.org. |
Keywords: | Integer matrix multiplication, dependable systems, fault tolerance, soft errors, voltage scaling |
UCL classification: | UCL; UCL > Provost and Vice Provost Offices > UCL BEAMS; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Electronic and Electrical Eng |
URI: | https://discovery.ucl.ac.uk/id/eprint/1505951 |



