eprintid: 10194620
rev_number: 6
eprint_status: archive
userid: 699
dir: disk0/10/19/46/20
datestamp: 2024-07-16 14:58:54
lastmod: 2024-07-16 14:58:54
status_changed: 2024-07-16 14:58:54
type: article
metadata_visibility: show
sword_depositor: 699
creators_name: Feng, Chen
creators_name: Tzimiropoulos, Georgios
creators_name: Patras, Ioannis
title: NoiseBox: Towards More Efficient and Effective Learning with Noisy Labels
ispublished: inpress
divisions: UCL
divisions: B04
divisions: F46
keywords: Noise, Noise measurement, Training, Computational modeling, Predictive models, Supervised learning, Entropy
note: This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
abstract: Despite great progress in supervised learning with neural networks, obtaining high-quality, large-scale and accurately labelled datasets remains a significant challenge. In such contexts, learning in the presence of noisy labels has received increasing attention. Addressing this relatively intricate problem to attain competitive results predominantly involves designing mechanisms that select samples expected to have reliable annotations. However, these methods typically combine multiple off-the-shelf techniques, resulting in intricate structures. Furthermore, they frequently make implicit or explicit assumptions about the noise modes/ratios within the dataset. Such assumptions can compromise model robustness and limit performance under varying noise conditions. Unlike these methods, in this work we propose an efficient and effective framework with minimal hyperparameters that achieves SOTA results on various benchmarks. Specifically, we design an efficient and concise training framework, called NoiseBox, consisting of a subset expansion module responsible for exploring non-selected samples and a model training module that further reduces the impact of noise. Moreover, diverging from common sample selection methods based on the "small loss" mechanism, we introduce a novel sample selection method based on neighbouring relationships and label consistency in the feature space. Without bells and whistles, such as model co-training, self-supervised pre-training and semi-supervised learning, and with robustness to the settings of its few hyperparameters, our method significantly surpasses previous methods on both CIFAR10/CIFAR100 with synthetic noise and real-world noisy datasets such as Red Mini-ImageNet, WebVision, Clothing1M and ANIMAL-10N.
date: 2024-07-11
date_type: published
publisher: Institute of Electrical and Electronics Engineers (IEEE)
official_url: http://dx.doi.org/10.1109/tcsvt.2024.3426994
oa_status: green
full_text_type: other
language: eng
primo: open
primo_central: open_green
verified: verified_manual
elements_id: 2296874
doi: 10.1109/tcsvt.2024.3426994
lyricists_name: Feng, Chen
lyricists_id: CFENA90
actors_name: Bracey, Alan
actors_id: ABBRA90
actors_role: owner
full_text_status: public
publication: IEEE Transactions on Circuits and Systems for Video Technology
issn: 1051-8215
citation: Feng, Chen; Tzimiropoulos, Georgios; Patras, Ioannis; (2024) NoiseBox: Towards More Efficient and Effective Learning with Noisy Labels. IEEE Transactions on Circuits and Systems for Video Technology 10.1109/tcsvt.2024.3426994 <https://doi.org/10.1109/tcsvt.2024.3426994>. (In press). Green open access
document_url: https://discovery.ucl.ac.uk/id/eprint/10194620/1/NoiseBox_Towards_More_Efficient_and_Effective_Learning_with_Noisy_Labels.pdf