Confounds and overestimations in fake review detection: Experimentally controlling for product-ownership and data-origin



Soldner, Felix; Kleinberg, Bennett; Johnson, Shane D (2022). Confounds and overestimations in fake review detection: Experimentally controlling for product-ownership and data-origin. PLoS One, 17(12), Article e0277869. 10.1371/journal.pone.0277869. Green open access

PDF:	journal.pone.0277869.pdf - Published Version (910kB)

Abstract

The popularity of online shopping is steadily increasing. At the same time, fake product reviews are published widely and have the potential to affect consumer purchasing behavior. In response, previous work has developed automated methods utilizing natural language processing approaches to detect fake product reviews. However, studies vary considerably in how well they succeed in detecting deceptive reviews, and the reasons for such differences are unclear. A contributing factor may be the multitude of strategies used to collect data, introducing potential confounds which affect detection performance. Two possible confounds are data-origin (i.e., the dataset is composed of more than one source) and product ownership (i.e., reviews written by individuals who own or do not own the reviewed product). In the present study, we investigate the effect of both confounds for fake review detection. Using an experimental design, we manipulate data-origin, product ownership, review polarity, and veracity. Supervised learning analysis suggests that review veracity (60.26-69.87%) is somewhat detectable but reviews additionally confounded with product-ownership (66.19-74.17%), or with data-origin (84.44-86.94%) are easier to classify. Review veracity is most easily classified if confounded with product-ownership and data-origin combined (87.78-88.12%). These findings are moderated by review polarity. Overall, our findings suggest that detection accuracy may have been overestimated in previous studies, provide possible explanations as to why, and indicate how future studies might be designed to provide less biased estimates of detection accuracy.

Type:	Article
Title:	Confounds and overestimations in fake review detection: Experimentally controlling for product-ownership and data-origin
Location:	United States
Open access status:	An open access version is available from UCL Discovery
DOI:	10.1371/journal.pone.0277869
Publisher version:	https://doi.org/10.1371/journal.pone.0277869
Language:	English
Additional information:	© 2022 Soldner et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Keywords:	Humans, Ownership, Health Status, Research Design
UCL classification:	UCL; UCL > Provost and Vice Provost Offices > UCL BEAMS; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Security and Crime Science
URI:	https://discovery.ucl.ac.uk/id/eprint/10161823
