UCL Discovery

Investigating Labelless Drift Adaptation for Malware Detection

Kan, Z; Pendlebury, F; Pierazzi, F; Cavallaro, L; (2021) Investigating Labelless Drift Adaptation for Malware Detection. In: AISec '21: Proceedings of the 14th ACM Workshop on Artificial Intelligence and Security. (pp. 123-134). ACM. Green open access.

Cavallaro_Investigating Labelless Drift Adaptation for Malware Detection.pdf - Accepted Version

Abstract

The evolution of malware has long plagued machine learning-based detection systems, as malware authors develop innovative strategies to evade detection and chase profits. This induces concept drift as the test distribution diverges from the training distribution, causing performance decay that requires constant monitoring and adaptation. In this work, we analyze the adaptation strategy used by DroidEvolver, a state-of-the-art learning system that self-updates using pseudo-labels to avoid the high overhead associated with obtaining new ground truth. After removing sources of experimental bias present in the original evaluation, we identify a number of flaws in the generation and integration of these pseudo-labels, leading to a rapid onset of performance degradation as the model poisons itself. We propose DroidEvolver++, a more robust variant of DroidEvolver, to address these issues and highlight the role of pseudo-labels in addressing concept drift. We test the tolerance of the adaptation strategy to different degrees of pseudo-label noise and propose the adoption of methods to ensure that only high-quality pseudo-labels are used for updates. Ultimately, we conclude that pseudo-labeling remains a promising solution to limitations on labeling capacity, but great care must be taken when designing update mechanisms to avoid negative feedback loops and self-poisoning, which have catastrophic effects on performance.
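
The idea of updating only on high-quality pseudo-labels can be illustrated with a minimal sketch. This is not the DroidEvolver or DroidEvolver++ implementation: the logistic model, the function names (`predict_proba`, `select_pseudo_labels`, `sgd_step`, `self_update`), the learning rate and the 0.9 confidence threshold are all assumptions chosen for illustration.

```python
import math

# Illustrative only: a tiny logistic model that self-updates on its own
# predictions, but only when those predictions are confident enough.

def predict_proba(weights, features):
    """Probability of the positive (malware) class under a logistic model."""
    score = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-score))

def select_pseudo_labels(weights, batch, threshold=0.9):
    """Keep only samples whose predicted class has confidence >= threshold."""
    selected = []
    for features in batch:
        p = predict_proba(weights, features)
        confidence = max(p, 1.0 - p)
        if confidence >= threshold:
            selected.append((features, 1 if p >= 0.5 else 0))
    return selected

def sgd_step(weights, features, label, lr=0.1):
    """One logistic-regression gradient step; returns the new weights."""
    error = predict_proba(weights, features) - label
    return [w - lr * error * x for w, x in zip(weights, features)]

def self_update(weights, batch, threshold=0.9, lr=0.1):
    """Update on confident pseudo-labels only; returns (weights, n_used)."""
    selected = select_pseudo_labels(weights, batch, threshold)
    for features, pseudo_label in selected:
        weights = sgd_step(weights, features, pseudo_label, lr)
    return weights, len(selected)

# Hypothetical data: two confidently classified samples and one ambiguous one.
weights = [2.0, -2.0]
batch = [[3.0, 0.0], [0.0, 3.0], [0.1, 0.0]]
new_weights, used = self_update(weights, batch)
```

In this toy run the ambiguous third sample is skipped, so it cannot push the model in the wrong direction. A confidence filter of this kind reduces, but does not eliminate, the risk of self-poisoning, since a drifting model can be confidently wrong.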

Type: Proceedings paper
Title: Investigating Labelless Drift Adaptation for Malware Detection
Event: ACM Workshop on Artificial Intelligence and Security (AISec '21)
ISBN-13: 9781450386579
Open access status: An open access version is available from UCL Discovery
DOI: 10.1145/3474369.3486873
Publisher version: https://doi.org/10.1145/3474369.3486873
Language: English
Additional information: This version is the author accepted manuscript. For information on re-use, please refer to the publisher's terms and conditions.
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery.ucl.ac.uk/id/eprint/10139216