Identifying Human Strategies for Generating Word-Level Adversarial Examples

Mozes, M; Kleinberg, B; Griffin, LD (2022) Identifying Human Strategies for Generating Word-Level Adversarial Examples. In: Findings of the Association for Computational Linguistics: EMNLP 2022. (pp. 6147-6155). ACL Anthology: Abu Dhabi, United Arab Emirates. Green open access

Text
2022.findings-emnlp.454.pdf - Published Version

Abstract

Adversarial examples in NLP are receiving increasing research attention. One line of investigation is the generation of word-level adversarial examples against fine-tuned Transformer models that preserve naturalness and grammaticality. Previous work found that human- and machine-generated adversarial examples are comparable in their naturalness and grammatical correctness. Most notably, humans were able to generate adversarial examples much more effortlessly than automated attacks. In this paper, we provide a detailed analysis of exactly how humans create these adversarial examples. By exploring the behavioural patterns of human workers during the generation process, we identify statistically significant tendencies based on which words humans prefer to select for adversarial replacement (e.g., word frequencies, word saliencies, sentiment) as well as where and when words are replaced in an input sequence. With our findings, we seek to inspire efforts that harness human strategies for more robust NLP models.

Type:	Proceedings paper
Title:	Identifying Human Strategies for Generating Word-Level Adversarial Examples
Event:	Findings of the Association for Computational Linguistics: EMNLP 2022
Open access status:	An open access version is available from UCL Discovery
Publisher version:	https://aclanthology.org/2022.findings-emnlp.454/#
Language:	English
Additional information:	© 2022 Association for Computational Linguistics. Licensed under a Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
UCL classification:	UCL; UCL > Provost and Vice Provost Offices > UCL BEAMS; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI:	https://discovery.ucl.ac.uk/id/eprint/10167455
