UCL Discovery

Bayesian clustering of multiple zero-inflated outcomes

Franzolini, B; Cremaschi, A; Van Den Boom, W; De Iorio, M; (2023) Bayesian clustering of multiple zero-inflated outcomes. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 381 (2247), Article 20220145. 10.1098/rsta.2022.0145. Green open access

Bayesian clustering of multiple zero-inflated outcomes.pdf - Published Version


Abstract

Several applications involving counts present a large proportion of zeros (excess-of-zeros data). A popular model for such data is the hurdle model, which explicitly models the probability of a zero count, while assuming a sampling distribution on the positive integers. We consider data from multiple count processes. In this context, it is of interest to study the patterns of counts and cluster the subjects accordingly. We introduce a novel Bayesian approach to cluster multiple, possibly related, zero-inflated processes. We propose a joint model for zero-inflated counts, specifying a hurdle model for each process with a shifted Negative Binomial sampling distribution. Conditionally on the model parameters, the different processes are assumed independent, leading to a substantial reduction in the number of parameters as compared with traditional multivariate approaches. The subject-specific probabilities of zero-inflation and the parameters of the sampling distribution are flexibly modelled via an enriched finite mixture with random number of components. This induces a two-level clustering of the subjects based on the zero/non-zero patterns (outer clustering) and on the sampling distribution (inner clustering). Posterior inference is performed through tailored Markov chain Monte Carlo schemes. We demonstrate the proposed approach on an application involving the use of the messaging service WhatsApp. This article is part of the theme issue 'Bayesian inference: challenges, perspectives, and prospects'.
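
As a rough illustration of the hurdle construction described in the abstract, the following minimal Python sketch evaluates the probability mass function of a single process and the joint log-likelihood of several processes under conditional independence. It is not the authors' implementation. The function names, the parameter names (`p_zero`, `r`, `q`), and the choice of Negative Binomial parameterization (number of failures before `r` successes, success probability `q`, shifted up by one) are assumptions made for illustration; the abstract does not specify them. The clustering prior and the MCMC scheme are not covered here.

```python
import math


def shifted_nb_pmf(y, r, q):
    """P(Y = y) where Y = K + 1 and K ~ NegBin(r, q), so Y lives on 1, 2, ...

    Assumed parameterization: K counts failures before the r-th success,
    with success probability q, where r > 0 and 0 < q < 1.
    """
    if y < 1:
        return 0.0
    k = y - 1  # undo the shift
    log_pmf = (
        math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
        + r * math.log(q) + k * math.log1p(-q)
    )
    return math.exp(log_pmf)


def hurdle_pmf(y, p_zero, r, q):
    """Hurdle model: a point mass at zero plus a distribution on positive integers."""
    if y == 0:
        return p_zero
    return (1.0 - p_zero) * shifted_nb_pmf(y, r, q)


def joint_log_lik(counts, p_zeros, rs, qs):
    """Log-likelihood of one subject's counts across several processes.

    Given the parameters, the processes are treated as independent,
    so the joint log-likelihood is a sum of per-process terms.
    Assumes 0 < p_zero < 1 for every process.
    """
    return sum(
        math.log(hurdle_pmf(y, p, r, q))
        for y, p, r, q in zip(counts, p_zeros, rs, qs)
    )


if __name__ == "__main__":
    # Hypothetical subject observed on three processes.
    counts = [0, 3, 1]
    print(joint_log_lik(counts, [0.6, 0.2, 0.5], [2.0, 1.5, 3.0], [0.3, 0.4, 0.5]))
```

Because the processes are independent given the parameters, each additional process contributes only its own zero probability and sampling parameters, which is the reduction in parameter count relative to a full multivariate model that the abstract refers to.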

Type: Article
Title: Bayesian clustering of multiple zero-inflated outcomes
Location: England
Open access status: An open access version is available from UCL Discovery
DOI: 10.1098/rsta.2022.0145
Publisher version: https://doi.org/10.1098/rsta.2022.0145
Language: English
Additional information: © 2023 The Authors. Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.
Keywords: conditional algorithm, enriched priors, excess-of-zeros data, finite mixtures, hurdle model, nested clustering
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Maths and Physical Sciences
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Maths and Physical Sciences > Dept of Statistical Science
URI: https://discovery.ucl.ac.uk/id/eprint/10168385