UCL Discovery

Machine Learning and Alternative Data Analytics for Fashion Finance

Bainiaksinaite, Julija (2020) Machine Learning and Alternative Data Analytics for Fashion Finance. Doctoral thesis (Ph.D), UCL (University College London). Green open access.

machine_learning_and_alternative_data_analytics_for_fashion_finance_julija_bainiaksinaite_ucl.pdf - Published version


Abstract

This dissertation investigates the application of Machine Learning, Natural Language Processing and computational finance to a novel area, Fashion Finance, and specifically to identifying investment opportunities within the Apparel industry using influential alternative data sources such as Instagram. Fashion investment is challenging because the industry is ephemeral and because many investors lack an understanding of how to analyse trend-driven consumer brands. Unstructured online data (e-commerce stores, social media, online blogs, news, etc.) introduces new opportunities for extracting investment signals. We focus on how trading signals can be generated from Instagram data and from events reported in news articles. Part of this research was done in collaboration with Arabesque Asset Management. Farfetch, the online luxury retailer, and Living Bridge Private Equity provided industry advice.

Research Datasets

The datasets used for this research were collected from various sources and include the following types of data:

- Financial data: daily stock prices of 50 U.S. and European Apparel and Footwear equities, daily U.S. Retail Trade and U.S. Consumer Non-Durables sector indices, and Form 10-K reports.
- Instagram data: daily Instagram profile follower counts for 11 fashion companies.
- News data: 0.5 million news articles that mention the selected 50 equities.

Research Experiments

The thesis consists of the following studies:

1. Relationship between Instagram Popularity and Stock Prices. This study investigates the link between changes in a company's popularity on Instagram (daily follower counts) and movements in its stock price and revenue. We use cross-correlation analysis to determine whether signals derived from the follower data could help to infer a company's future financial performance. Two hypothetical trading strategies are designed to test whether changes in a company's Instagram popularity could improve returns. The hypotheses are tested with the Wilcoxon signed-rank test.

2. Dynamic Density-based News Clustering. The aim of this study is twofold: 1) to analyse the characteristics of relevant news event articles and how they differ from noisy or irrelevant news; 2) using these insights, to design an unsupervised framework that clusters news articles and identifies event clusters without predefined parameters or expert knowledge. The framework incorporates the density-based clustering algorithm DBSCAN, with the clustering parameters selected dynamically using a Gaussian Mixture Model and by maximizing the inter-cluster Information Entropy.

3. ALGA: Automatic Logic Gate Annotator for Event Detection. We design a news classification model for detecting fashion events that are likely to impact a company's stock price. The articles are represented by the following text embeddings: TF-IDF, Doc2Vec and BERT (Transformer Neural Network). The study consists of two parts: 1) we design ALGA, a domain-specific automatic news labelling framework that incorporates topic extraction (Latent Dirichlet Allocation) and clustering (DBSCAN) algorithms, in addition to other filters, to annotate the dataset; 2) using the labelled dataset, we train a Logistic Regression classifier to identify financially relevant news. The model shows state-of-the-art results in the domain-specific financial event detection problem.

Contribution to Science

This research presents the following contributions to science:

- An original application of Machine Learning and Natural Language Processing to the analysis of alternative data on ephemeral fashion assets.
- New metrics to measure and track a fashion brand's popularity for investment decision making.
- A dynamic news event clustering framework that finds event clusters of various sizes in news articles without predefined parameters.
- The original Automatic Logic Gate Annotator (ALGA) framework for automatic labelling of news articles for the financial event detection task.
- An Apparel and Footwear news event classifier, trained on the datasets generated by the ALGA framework, that shows state-of-the-art performance in a domain-specific financial event detection task.
- The Fashion Finance Dictionary, which contains 320 phrases related to various financially relevant events in the Apparel and Footwear industry.
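
As an illustration of the cross-correlation analysis mentioned in the first study, the following is a minimal sketch and not the thesis's implementation. It assumes a lagged Pearson correlation between daily follower growth and later stock returns. The function names (`pearson`, `lagged_cross_correlation`) are hypothetical, and the two series are synthetic toy data in which returns are built to echo follower growth two days later.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def lagged_cross_correlation(leader, follower, max_lag):
    """Correlate leader[t] with follower[t + lag] for lag = 0..max_lag."""
    result = {}
    for lag in range(max_lag + 1):
        x = leader[: len(leader) - lag]  # drop the last `lag` observations
        y = follower[lag:]               # drop the first `lag` observations
        result[lag] = pearson(x, y)
    return result


# Synthetic daily follower growth (assumption: illustrative values only).
follower_growth = [0.2, -0.1, 0.4, 0.0, -0.3, 0.5, 0.1, -0.2, 0.3, -0.4]
# Toy returns constructed to echo follower growth two days later.
stock_returns = [0.0, 0.0] + [0.5 * g for g in follower_growth[:-2]]

ccf = lagged_cross_correlation(follower_growth, stock_returns, max_lag=4)
# Lag (in days) at which follower growth is most strongly related to returns.
best_lag = max(ccf, key=lambda lag: abs(ccf[lag]))
```

On real data, a strong correlation at a positive lag would only suggest that follower changes precede price moves; the abstract indicates that such signals are then evaluated with trading strategies and a Wilcoxon signed-rank test.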

Type: Thesis (Doctoral)
Qualification: Ph.D
Title: Machine Learning and Alternative Data Analytics for Fashion Finance
Event: UCL (University College London)
Open access status: An open access version is available from UCL Discovery
Language: English
Additional information: Copyright © The Author 2020. Original content in this thesis is licensed under the terms of the Creative Commons Attribution 4.0 International (CC BY 4.0) Licence (https://creativecommons.org/licenses/by/4.0/). Any third-party copyright material present remains the property of its respective owner(s) and is licensed under its existing terms. Access may initially be restricted at the author’s request.
UCL classification: UCL
UCL > Provost and Vice Provost Offices
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery.ucl.ac.uk/id/eprint/10093622
