UCL Discovery

Unsupervised Few-Bits Semantic Hashing with Implicit Topics Modeling

Ye, F; Manotumruksa, J; Yilmaz, E; (2020) Unsupervised Few-Bits Semantic Hashing with Implicit Topics Modeling. In: Cohn, T and He, Y and Liu, Y, (eds.) Findings of the Association for Computational Linguistics: EMNLP 2020. (pp. 2566-2575). Association for Computational Linguistics (ACL): Online conference.
Green open access

2020.findings-emnlp.233.pdf - Published Version


Abstract

Semantic hashing is a powerful paradigm for representing texts as compact binary hash codes. The explosion of short text data has spurred the demand for few-bits hashing. However, the performance of existing semantic hashing methods cannot be guaranteed when applied to few-bits hashing because of severe information loss. In this paper, we present a simple but effective unsupervised neural generative semantic hashing method with a focus on few-bits hashing. Our model is built upon a variational autoencoder and represents each hash bit as a Bernoulli variable, which allows the model to be end-to-end trainable. To address the issue of information loss, we introduce a set of auxiliary implicit topic vectors. With the aid of these topic vectors, the generated hash codes are not only low-dimensional representations of the original texts but also capture their implicit topics. We conduct comprehensive experiments on four datasets. The results demonstrate that our approach achieves significant improvements over state-of-the-art semantic hashing methods in few-bits hashing.
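
Illustrative sketch (not the authors' implementation): the abstract's idea of treating each hash bit as a Bernoulli variable can be pictured as an encoder that outputs one probability per bit, from which a binary code is either sampled or thresholded, with codes then compared by Hamming distance. Everything below is an assumption made for illustration: the function names, the single linear layer, the random untrained weights, and the random stand-in document features. Training, the decoder, and the implicit topic vectors are omitted.

```python
import numpy as np

def bit_probabilities(x, W, b):
    """Per-bit Bernoulli parameters p = sigmoid(xW + b) (hypothetical encoder)."""
    return 1.0 / (1.0 + np.exp(-(x @ W + b)))

def sample_bits(p, rng):
    """Stochastic hash bits: each bit is 1 with probability p."""
    return (rng.random(p.shape) < p).astype(np.uint8)

def binarize(p, threshold=0.5):
    """Deterministic hash code: threshold each bit probability."""
    return (p > threshold).astype(np.uint8)

def hamming_distance(a, b):
    """Number of bit positions in which two codes differ."""
    return int(np.count_nonzero(a != b))

rng = np.random.default_rng(0)
vocab_size, n_bits = 1000, 8                           # a "few-bits" setting
W = rng.normal(scale=0.1, size=(vocab_size, n_bits))   # untrained, illustrative
b = np.zeros(n_bits)
docs = rng.random((3, vocab_size))                     # stand-in for bag-of-words features

probs = bit_probabilities(docs, W, b)
codes = binarize(probs)                                # one 8-bit code per document
```

Similar documents would be retrieved by ranking stored codes by hamming_distance to the query's code; with untrained weights, as here, the codes carry no semantic meaning.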

Type: Proceedings paper
Title: Unsupervised Few-Bits Semantic Hashing with Implicit Topics Modeling
Event: 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Open access status: An open access version is available from UCL Discovery
DOI: 10.18653/v1/2020.findings-emnlp.233
Publisher version: http://dx.doi.org/10.18653/v1/2020.findings-emnlp....
Language: English
Additional information: This is an Open Access paper published under a Creative Commons Attribution 4.0 International (CC BY 4.0) licence (https://creativecommons.org/licenses/by/4.0/).
UCL classification: UCL
UCL > Provost and Vice Provost Offices
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery.ucl.ac.uk/id/eprint/10117563