UCL Discovery
UCL home » Library Services » Electronic resources » UCL Discovery

SALSA: Self-Adjusting Lean Streaming Analytics

Basat, RB; Einziger, G; Mitzenmacher, M; Vargaftik, S; (2021) SALSA: Self-Adjusting Lean Streaming Analytics. In: Proceedings of the 37th IEEE International Conference on Data Engineering (ICDE 2021). (pp. pp. 864-875). IEEE: Chania, Greece. Green open access

[thumbnail of SALSA_UCL.pdf]
Preview
Text
SALSA_UCL.pdf - Accepted Version

Download (1MB) | Preview

Abstract

Counters are the fundamental building block of many data sketching schemes, which hash items to a small number of counters and account for collisions to provide good approximations for frequencies and other measures. Most existing methods rely on fixed-size counters, which may be wasteful in terms of space, as counters must be large enough to eliminate any risk of overflow. Instead, some solutions use small, fixed-size counters that may overflow into secondary structures.This paper takes a different approach. We propose a simple and general method called SALSA for dynamic re-sizing of counters, and show its effectiveness. SALSA starts with small counters, and overflowing counters simply merge with their neighbors. SALSA can thereby allow more counters for a given space, expanding them as necessary to represent large numbers. Our evaluation demonstrates that, at the cost of a small overhead for its merging logic, SALSA significantly improves the accuracy of popular schemes (such as Count-Min Sketch and Count Sketch) over a variety of tasks. Our code is released as open source.

Type: Proceedings paper
Title: SALSA: Self-Adjusting Lean Streaming Analytics
Event: 37th IEEE International Conference on Data Engineering (ICDE 2021)
ISBN-13: 978-1-7281-9184-3
Open access status: An open access version is available from UCL Discovery
DOI: 10.1109/ICDE51399.2021.00080
Publisher version: https://doi.org/10.1109/ICDE51399.2021.00080
Language: English
Additional information: This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
Keywords: Measurement errors, Heuristic algorithms, Conferences, Merging, Layout, Data engineering, Encoding
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery.ucl.ac.uk/id/eprint/10125499
Downloads since deposit
Loading...
90Downloads
Download activity - last month
Loading...
Download activity - last 12 months
Loading...
Downloads by country - last 12 months
1.China
8
2.United Kingdom
4
3.United States
3
4.Germany
3
5.Japan
2
6.Netherlands
2
7.India
2
8.Malaysia
1
9.Canada
1
10.Russian Federation
1

Archive Staff Only

View Item View Item