Soldner, Felix David (2023). Combating online consumer fraud and counterfeits: A data science perspective. Doctoral thesis (Ph.D), UCL (University College London).
Abstract
Online fraud is a growing problem that impacts many individuals, resulting in billions of dollars’ worth of damages. Although various online fraud types exist, they are all easily scalable through online (shopping) platforms, thus reaching many individuals with relatively little effort. The sheer number of fraud cases authorities face is impossible to resolve using traditional investigatory practices, which often require intensive manual work. Data science offers some solutions for the problems presented by online fraud through automation and making manual labour more efficient. This thesis explores various ways in which data science can help combat online consumer fraud and counterfeits. Chapters 3 and 4 examine the challenges of automated approaches to combating online fraud. Chapter 3 examines the feasibility of creating supervised machine learning training data by asking experts to annotate product listings based on suspiciousness. Annotators show low agreement on what constitutes a suspicious listing, revealing the importance of precise definitions of labels, clear instructions, and thorough records of the annotators’ decision-making processes during labelling. In Chapter 4, the impact that confounds in training data have on prediction performance (e.g., detecting fake reviews) is evaluated by examining the design choices used to create datasets. The results show that by mixing experimentally created and found (e.g., collected online) data, prediction performance can be artificially boosted, leading to incorrect conclusions about the predictive features. Chapters 5 and 6 examine the promise of automated methods for combating online fraud by examining data from anonymity networks and cryptomarkets, highly anonymized sections of the internet, often used to trade illicit goods. Counterfeits are openly offered on anonymity networks, and information about them could help us to improve our understanding of the counterfeit economy.
We analyse a large-scale dataset (2014-2015) to determine the prevalence, types, origins, and (sales) values of counterfeits across multiple markets. Comparing the estimations to other measures by authorities (e.g., border seizures), we find similarities in the types and origins of counterfeited products, and that the number of types varies across measures. Finally, we utilise information about counterfeits on anonymity networks to search for the same products on the surface web by matching and ranking them based on image and text similarities. We examine highly similar matches and find that the number of identical products across platforms, such as shoes, smartphones, and watches, warrants further investigation into whether they are counterfeits. The thesis closes with a discussion of the results before reviewing the limitations and possible future avenues for research on addressing online fraud. The availability of high-quality data, including ground truth data, is a recurring issue in fraud research, which could be addressed through better data documentation practices and increased data sharing. Future studies should aim to increase the temporal coverage of anonymity network data to allow for a better examination of trends in the counterfeit economy. Practical implications of utilising data science approaches are discussed, highlighting the importance of conveying the limitations and implications of applying data science methods to practitioners.
Type: | Thesis (Doctoral) |
---|---|
Qualification: | Ph.D |
Title: | Combating online consumer fraud and counterfeits: A data science perspective |
Open access status: | An open access version is available from UCL Discovery |
Language: | English |
Additional information: | Copyright © The Author 2023. Original content in this thesis is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) Licence (https://creativecommons.org/licenses/by-nc/4.0/). Any third-party copyright material present remains the property of its respective owner(s) and is licensed under its existing terms. Access may initially be restricted at the author’s request. |
UCL classification: | UCL; UCL > Provost and Vice Provost Offices > UCL BEAMS; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Security and Crime Science |
URI: | https://discovery.ucl.ac.uk/id/eprint/10182179 |