UCL Discovery

Beta-CoRM: A Bayesian Approach for n-gram Profiles Analysis

Perusquia, José A; Griffin, Jim E; Villa, Cristiano (2025) Beta-CoRM: A Bayesian Approach for n-gram Profiles Analysis. Computational Statistics and Data Analysis, 202, Article 108056. 10.1016/j.csda.2024.108056. Green open access

Abstract

n-gram profiles have been widely and successfully used to analyse long sequences of potentially differing lengths for clustering or classification. Machine learning algorithms have mainly been used for this purpose but, despite their predictive performance, these methods cannot discover hidden structures or provide a full probabilistic representation of the data. To address this, a novel class of Bayesian generative models is proposed for n-gram profiles used as binary attributes. The flexibility of the proposed models allows a straightforward approach to feature selection within the generative model. Furthermore, a slice sampling algorithm is derived for fast inference; it is applied to synthetic and real data scenarios, and the results show that feature selection can improve classification accuracy.

Type: Article
Title: Beta-CoRM: A Bayesian Approach for n-gram Profiles Analysis
Open access status: An open access version is available from UCL Discovery
DOI: 10.1016/j.csda.2024.108056
Publisher version: https://doi.org/10.1016/j.csda.2024.108056
Language: English
Additional information: © 2024 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
Keywords: Bayesian statistics, feature selection, labeled data, n-grams, supervised learning
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Maths and Physical Sciences
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Maths and Physical Sciences > Dept of Statistical Science
URI: https://discovery.ucl.ac.uk/id/eprint/10196456