eprintid: 10133171
rev_number: 15
eprint_status: archive
userid: 608
dir: disk0/10/13/31/71
datestamp: 2022-08-15 15:11:40
lastmod: 2022-08-15 15:11:40
status_changed: 2022-08-15 15:11:40
type: working_paper
metadata_visibility: show
creators_name: Labaca-Castro, R
creators_name: Muñoz-González, L
creators_name: Pendlebury, F
creators_name: Dreo Rodosek, G
creators_name: Pierazzi, F
creators_name: Cavallaro, L
title: Realizable Universal Adversarial Perturbations for Malware
ispublished: pub
divisions: UCL
divisions: A01
divisions: B04
divisions: C05
divisions: F48
note: This work is licensed under an Attribution 4.0 International License (CC BY 4.0).
abstract: Machine learning classifiers are vulnerable to adversarial examples -- input-specific perturbations that manipulate models' output. Universal Adversarial Perturbations (UAPs), which identify noisy patterns that generalize across the input space, allow the attacker to greatly scale up the generation of such examples. Although UAPs have been explored in application domains beyond computer vision, little is known about their properties and implications in the specific context of realizable attacks, such as malware, where attackers must satisfy challenging problem-space constraints.
In this paper we explore the challenges and strengths of UAPs in the context of malware classification. We generate sequences of problem-space transformations that induce UAPs in the corresponding feature-space embedding and evaluate their effectiveness across different malware domains. Additionally, we propose adversarial training-based mitigations using knowledge derived from the problem-space transformations, and compare against alternative feature-space defenses.
Our experiments limit the effectiveness of a white-box Android evasion attack to ~20% at the cost of ~3% TPR at 1% FPR. We additionally show how our method can be adapted to more restrictive domains such as Windows malware.
We observe that while adversarial training in the feature space must deal with large and often unconstrained regions, UAPs in the problem space identify specific vulnerabilities that allow us to harden a classifier more effectively, shifting the challenges and associated cost of identifying new universal adversarial transformations back to the attacker.
date: 2021-02-12
date_type: published
publisher: ArXiv
official_url: https://doi.org/10.48550/arXiv.2102.06747
oa_status: green
full_text_type: pub
language: eng
primo: open
primo_central: open_green
verified: verified_manual
elements_id: 1882848
confidential: false
lyricists_name: Cavallaro, Lorenzo
lyricists_id: LCAVA89
actors_name: Cavallaro, Lorenzo
actors_id: LCAVA89
actors_role: owner
full_text_status: public
pages: 19
citation: Labaca-Castro, R; Muñoz-González, L; Pendlebury, F; Dreo Rodosek, G; Pierazzi, F; Cavallaro, L; (2021) Realizable Universal Adversarial Perturbations for Malware. ArXiv. Green open access.
document_url: https://discovery.ucl.ac.uk/id/eprint/10133171/7/Cavallaro_2102.06747.pdf