eprintid: 10192656
rev_number: 14
eprint_status: archive
userid: 699
dir: disk0/10/19/26/56
datestamp: 2024-10-04 11:29:11
lastmod: 2024-12-01 07:10:46
status_changed: 2024-10-04 11:29:11
type: thesis
metadata_visibility: show
sword_depositor: 699
creators_name: Wang, Xiaoke
title: Research on Positive-Unlabeled Learning
ispublished: unpub
divisions: UCL
divisions: B04
divisions: C06
divisions: F61
note: Copyright © The Author 2024. Original content in this thesis is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) Licence (https://creativecommons.org/licenses/by-nc/4.0/).
abstract: Positive-unlabeled (PU) learning handles classification tasks on data containing only labeled-positive instances and unlabeled instances. PU learning has been applied in many fields of observational studies. Support vector machine (SVM)-based PU learning is one of the main branches of PU learning and offers a range of advantages, e.g., training efficiency and generalisation ability. Moreover, SVM-based PU classifiers can generate non-linear decision boundaries by employing the kernel trick to capture complex relationships among features, and have been shown to achieve robust performance. This study focuses on SVM-based PU classifiers and makes three contributions. Firstly, we proposed the global and local PU classifier with asymmetric loss (GLPUAL), with the kernel trick applied, for satisfactory classification on trifurcated PU datasets, where the positive set consists of two subsets distributed on both sides of the negative set. Secondly, to address the unsatisfactory interpretability and performance of GLPUAL on PU datasets containing irrelevant features, we introduced L1-norm regularisation to the objective function of GLPUAL to construct a sparse classifier that removes irrelevant features. The proposed classifier is termed elastic GLPUAL (E-GLPUAL). A kernel-free technique was then introduced to E-GLPUAL to generate non-linear decision boundaries; the resulting classifier is termed elastic kernel-free GLPUAL (EKF-GLPUAL). Thirdly, we proposed class-prior-based GLPUAL (CPB-GLPUAL) by introducing a technique of unbiased PU learning to GLPUAL for better performance when the class prior is known. In addition, we explored the conditions under which CPB-GLPUAL is universally consistent, i.e., its 0-1 classification risk converges to the Bayes risk.
date: 2024-05-28
date_type: published
oa_status: green
full_text_type: other
thesis_class: doctoral_embargoed
thesis_award: Ph.D
language: eng
primo: open
primo_central: open_green
verified: verified_manual
elements_id: 2277863
lyricists_name: Wang, Xiaoke
lyricists_id: XWANI93
actors_name: Wang, Xiaoke
actors_id: XWANI93
actors_role: owner
full_text_status: public
pages: 134
institution: UCL (University College London)
department: Statistical Science
thesis_type: Doctoral
citation: Wang, Xiaoke; (2024) Research on Positive-Unlabeled Learning. Doctoral thesis (Ph.D), UCL (University College London). Green open access
document_url: https://discovery.ucl.ac.uk/id/eprint/10192656/2/Thesis_Xiaoke%20Wang_18029107.pdf