UCL Discovery

Mitigating Silent Data Corruptions In Integer Matrix Products: Toward Reliable Multimedia Computing On Unreliable Hardware

Anarado, I; Anam, MA; Verdicchio, F; Andreopoulos, I; (2016) Mitigating Silent Data Corruptions In Integer Matrix Products: Toward Reliable Multimedia Computing On Unreliable Hardware. IEEE Transactions on Circuits and Systems for Video Technology, PP (99) 10.1109/TCSVT.2016.2589622. (In press). Green open access

Final_version_TCSVT_corrected_for_rps.pdf - Accepted Version

Abstract

The generic matrix multiply (GEMM) routine comprises the compute- and memory-intensive part of many information retrieval, machine learning and object recognition systems that process integer inputs. Therefore, it is of paramount importance to ensure that integer GEMM computations remain robust to silent data corruptions (SDCs), which stem from accidental voltage or frequency overscaling, or other hardware non-idealities. In this paper, we introduce a new method for SDC mitigation based on the concept of numerical packing. The key difference between our approach and all existing methods is the production of redundant results within the numerical representation of the outputs, rather than as a separate set of checksums. Importantly, unlike well-known algorithm-based fault tolerance (ABFT) approaches for GEMM, the proposed approach can reliably detect the locations of the vast majority of all possible SDCs in the results of GEMM computations. An experimental investigation of voltage-scaled integer GEMM computations for visual descriptor matching within state-of-the-art image and video retrieval algorithms running on an Intel i7-4578U 3GHz processor shows that SDC mitigation based on numerical packing leads to comparable or lower execution and energy-consumption overhead in comparison to all other alternatives.
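
The abstract does not spell out the packing scheme, so the following is only a toy sketch of the general idea of carrying redundant results inside the numerical representation of each output, not the method of the paper. It assumes non-negative integer inputs and a hypothetical `shift` chosen so that every true output fits in `shift` bits. The names `matmul`, `pack` and `unpack_and_check` are illustrative.

```python
def matmul(a, b):
    """Plain integer matrix product on lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def pack(a, shift):
    """Store two copies of each entry in disjoint bit fields of one integer.

    Assumption: entries are non-negative and every output of the product
    is smaller than 2**shift, so the two fields never overlap.
    """
    return [[x + (x << shift) for x in row] for row in a]


def unpack_and_check(c_packed, shift):
    """Split each packed output into its two copies and flag mismatches."""
    mask = (1 << shift) - 1
    result, suspect = [], []
    for i, row in enumerate(c_packed):
        out = []
        for j, value in enumerate(row):
            low, high = value & mask, value >> shift
            if low != high:
                suspect.append((i, j))  # location of a suspected corruption
            out.append(low)
        result.append(out)
    return result, suspect


SHIFT = 16  # hypothetical field width, ample for the small example below
A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]

# One product on packed inputs yields both copies of every output.
c_packed = matmul(pack(A, SHIFT), B)
clean_result, clean_suspect = unpack_and_check(c_packed, SHIFT)

# Simulate a silent data corruption: flip one bit in a single output word.
c_faulty = [row[:] for row in c_packed]
c_faulty[1][0] ^= 1 << 3
_, faulty_suspect = unpack_and_check(c_faulty, SHIFT)
```

In this toy setting a bit flip inside one output word makes its two fields disagree, which points at the row and column of the affected entry without a separate checksum row or column. A flip that alters both fields identically would go unnoticed, which is one reason this sketch should not be read as a substitute for the scheme evaluated in the paper.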

Type: Article
Title: Mitigating Silent Data Corruptions In Integer Matrix Products: Toward Reliable Multimedia Computing On Unreliable Hardware
Open access status: An open access version is available from UCL Discovery
DOI: 10.1109/TCSVT.2016.2589622
Publisher version: http://dx.doi.org/10.1109/TCSVT.2016.2589622
Language: English
Additional information: Copyright © 2016 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending a request to pubs-permissions@ieee.org.
Keywords: Integer matrix multiplication, dependable systems, fault tolerance, soft errors, voltage scaling
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Electronic and Electrical Eng
URI: https://discovery.ucl.ac.uk/id/eprint/1505951