eprintid: 10192656
rev_number: 14
eprint_status: archive
userid: 699
dir: disk0/10/19/26/56
datestamp: 2024-10-04 11:29:11
lastmod: 2024-12-01 07:10:46
status_changed: 2024-10-04 11:29:11
type: thesis
metadata_visibility: show
sword_depositor: 699
creators_name: Wang, Xiaoke
title: Research on Positive-Unlabeled Learning
ispublished: unpub
divisions: UCL
divisions: B04
divisions: C06
divisions: F61
note: Copyright © The Author 2024. Original content in this thesis is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) Licence (https://creativecommons.org/licenses/by-nc/4.0/).
abstract: Positive-unlabeled (PU) learning handles classification tasks on data containing only labeled-positive instances and unlabeled instances. PU learning has been applied in many fields of observational study. Support vector machine (SVM)-based PU learning is one of the main branches of PU learning and offers a range of advantages, e.g., training efficiency and generalisation ability. Moreover, SVM-based PU classifiers can generate non-linear decision boundaries by employing the kernel trick to capture complex relationships among features, and have been shown to achieve robust performance. This study focuses on SVM-based PU classifiers and makes three contributions. Firstly, we proposed the global and local PU classifier with asymmetric loss (GLPUAL), with the kernel trick applied, for satisfactory classification on trifurcated PU datasets, where the positive set consists of two subsets distributed on either side of the negative set. Secondly, to address the unsatisfactory interpretability and performance of GLPUAL on PU datasets containing irrelevant features, we introduced L1-norm regularisation to the objective function of GLPUAL to construct a sparse classifier that removes irrelevant features. The proposed classifier is termed elastic GLPUAL (E-GLPUAL). A kernel-free technique was then introduced to E-GLPUAL to generate non-linear decision boundaries. The resulting classifier is termed elastic kernel-free GLPUAL (EKF-GLPUAL). Thirdly, we proposed class-prior-based GLPUAL (CPB-GLPUAL) by incorporating an unbiased PU learning technique into GLPUAL for better performance when the class prior is known. In addition, we explored the conditions under which CPB-GLPUAL is universally consistent, i.e., its 0-1 classification risk converges to the Bayes risk.
date: 2024-05-28
date_type: published
oa_status: green
full_text_type: other
thesis_class: doctoral_embargoed
thesis_award: Ph.D
language: eng
primo: open
primo_central: open_green
verified: verified_manual
elements_id: 2277863
lyricists_name: Wang, Xiaoke
lyricists_id: XWANI93
actors_name: Wang, Xiaoke
actors_id: XWANI93
actors_role: owner
full_text_status: public
pages: 134
institution: UCL (University College London)
department: Statistical Science
thesis_type: Doctoral
citation: Wang, Xiaoke; (2024) Research on Positive-Unlabeled Learning. Doctoral thesis (Ph.D), UCL (University College London). Green open access
document_url: https://discovery.ucl.ac.uk/id/eprint/10192656/2/Thesis_Xiaoke%20Wang_18029107.pdf