ML-Based Behavioral Malware Detection Is Far From a Solved Problem

Kaya, Yigitcan; Chen, Yizheng; Botacin, Marcus; Saha, Shoumik; Pierazzi, Fabio; Cavallaro, Lorenzo; Wagner, David; (2025) ML-Based Behavioral Malware Detection Is Far From a Solved Problem. In: Proceedings of the 3rd IEEE Conference on Secure and Trustworthy Machine Learning. IEEE: Copenhagen, Denmark. (In press). Green open access

Abstract

Malware detection is a ubiquitous application of Machine Learning (ML) in security. In behavioral malware analysis, the detector relies on features extracted from program execution traces. The research literature has focused on detectors trained with features collected from sandbox environments and evaluated on samples also analyzed in a sandbox. However, in deployment, a malware detector at endpoint hosts often must rely on traces captured from those hosts, not from a sandbox. Thus, there is a gap between the literature and real-world needs. We present the first measurement study of the performance of ML-based malware detectors at real-world endpoints. Leveraging a dataset of sandbox traces and a dataset of in-the-wild program traces, we evaluate two scenarios: (i) an endpoint detector trained on sandbox traces (convenient and easy to train), and (ii) an endpoint detector trained on endpoint traces (more challenging to train, since we need to collect telemetry data). We discover a wide gap between performance as measured using prior evaluation methods in the literature (over 90%) and expected performance in endpoint detection, which ranges from about 20% (scenario (i)) to about 50% (scenario (ii)). We characterize the ML challenges that arise in this domain and contribute to this gap, including label noise, distribution shift, and spurious features. Moreover, we show several techniques that achieve 5–30% relative performance improvements over the baselines. Our evidence suggests that applying detectors trained on sandbox data to endpoint detection is challenging. The most promising direction is training detectors directly on endpoint data, which marks a departure from current practice. To promote progress, we will help researchers perform realistic detector evaluations against our real-world dataset.

Type:	Proceedings paper
Title:	ML-Based Behavioral Malware Detection Is Far From a Solved Problem
Event:	IEEE Conference on Secure and Trustworthy Machine Learning (SaTML)
Dates:	9 Apr 2025 - 11 Apr 2025
Open access status:	An open access version is available from UCL Discovery
Publisher version:	https://satml.org/
Language:	English
Additional information:	This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
UCL classification:	UCL; UCL > Provost and Vice Provost Offices > UCL BEAMS; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI:	https://discovery.ucl.ac.uk/id/eprint/10206107