UCL Discovery

Latent variable models for the topographic organisation of discrete and strictly positive data

Girolami, M. (2002) Latent variable models for the topographic organisation of discrete and strictly positive data. In: Neurocomputing (pp. 185-198). Elsevier Science BV.

Full text not available from this repository.

Abstract

This paper is concerned with learning dense low-dimensional representations of high-dimensional positive data. The positive data may be continuous, binary or count-based. In addition to the low-dimensional data model, a topographic ordering of the representation is desired. The primary motivation for this work is the need for a low-dimensional interpretation of sparse vector space models of text documents, which may take the form of binary, count-based or real multivariate data. The generative topographic mapping (GTM) was introduced as a principled alternative to the self-organising map, principally for visualising high-dimensional continuous data. The GTM is one method by which a topographically organised low-dimensional data representation may be realised. There are many cases where the observed data are discrete and the application of methods developed specifically for continuous data is inappropriate. Based on the continuous GTM data model, a non-linear latent variable model for high-dimensional binary data is presented. A non-negative factorisation of a positive matrix that ensures a topographic ordering of the constituent factors is also presented as a principled yet non-probabilistic alternative to the GTM model. Both methods are demonstrated experimentally on the representation of binary-coded handwritten digits and on the topographic organisation and visualisation of a collection of text-based documents. © 2002 Elsevier Science B.V. All rights reserved.

Type: Proceedings paper
Title: Latent variable models for the topographic organisation of discrete and strictly positive data
Event: 8th European Symposium on Artificial Neural Networks (ESANN)
Location: Brugge, Belgium
Dates: 2001-04-26 - 2001-04-28
Keywords: generative models, topographic mappings, non-negative matrix factorisation, latent semantic analysis, matrix factorization
UCL classification: UCL > School of BEAMS > Faculty of Maths and Physical Sciences > Statistical Science
