eprintid: 1452450
rev_number: 45
eprint_status: archive
userid: 608
dir: disk0/01/45/24/50
datestamp: 2015-03-03 10:57:14
lastmod: 2021-12-16 23:46:33
status_changed: 2015-03-03 11:32:47
type: article
metadata_visibility: show
item_issues_count: 0
creators_name: Anam, MA
creators_name: Whatmough, PN
creators_name: Andreopoulos, Y
title: Precision-energy-throughput scaling of generic matrix multiplication and convolution kernels via linear projections
ispublished: pub
divisions: UCL
divisions: B04
divisions: C05
divisions: F46
keywords: Convolution (CONV), Embedded systems, Energy and throughput scaling, Generic matrix multiplication (GEMM), Multimedia recognition and matching
note: © 2014 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
abstract: Generic matrix multiplication (GEMM) and convolution (CONV)/cross-correlation kernels often constitute the bulk of the compute- and memory-intensive processing within image/audio recognition and matching systems. We propose a novel method to scale the energy and processing throughput of GEMM and CONV kernels for such error-tolerant multimedia applications by adjusting the precision of computation. Our technique employs linear projections to the input matrix or signal data during the top-level GEMM and CONV blocking and reordering. The GEMM and CONV kernel processing then uses the projected inputs and the results are accumulated to form the final outputs. Throughput and energy scaling takes place by changing the number of projections computed by each kernel, which in turn produces approximate results, i.e., changes the precision of the performed computation. Results derived from a voltage- and frequency-scaled ARM Cortex A15 processor running face recognition and music-matching algorithms demonstrate that the proposed approach allows for a 280%–440% increase of processing throughput and a 75%–80% decrease of energy consumption against the optimized GEMM and CONV kernels without any impact on the obtained recognition or matching accuracy. Even higher gains can be obtained, if one is willing to tolerate some reduction in the accuracy of the recognition and matching applications.
date: 2014-11
official_url: http://dx.doi.org/10.1109/TCSVT.2014.2321071
vfaculties: VENG
oa_status: green
full_text_type: other
primo: open
primo_central: open_green
verified: verified_manual
elements_source: crossref
elements_id: 987051
doi: 10.1109/TCSVT.2014.2321071
lyricists_name: Anam, Mohammad
lyricists_name: Andreopoulos, Ioannis
lyricists_id: MAANA40
lyricists_id: IANDR50
full_text_status: public
publication: IEEE Transactions on Circuits and Systems for Video Technology
volume: 24
number: 11
pagerange: 1860-1873
issn: 1051-8215
citation: Anam, MA; Whatmough, PN; Andreopoulos, Y; (2014) Precision-energy-throughput scaling of generic matrix multiplication and convolution kernels via linear projections. IEEE Transactions on Circuits and Systems for Video Technology, 24 (11) 1860-1873. 10.1109/TCSVT.2014.2321071 <https://doi.org/10.1109/TCSVT.2014.2321071>. Green open access
document_url: https://discovery.ucl.ac.uk/id/eprint/1452450/3/TCSVT7786.pdf