UCL Discovery

Inferring Dynamic User Interests in Streams of Short Texts for User Clustering

Liang, S; Ren, Z; Zhao, Y; Ma, J; Yilmaz, E; De Rijke, M; (2017) Inferring Dynamic User Interests in Streams of Short Texts for User Clustering. ACM Transactions on Information Systems, 36 (1), Article 10. 10.1145/3072606. Green open access

Yilmaz_a10-liang _accepted manuscript.pdf - Accepted Version

Abstract

User clustering has been studied from different angles. In order to identify shared interests, behavior-based methods consider similar browsing or search patterns of users, whereas content-based methods use information from the contents of the documents visited by the users. So far, content-based user clustering has mostly focused on static sets of relatively long documents. Given the dynamic nature of social media, there is a need to dynamically cluster users in the context of streams of short texts. User clustering in this setting is more challenging than in the case of long documents, as it is difficult to capture the users’ dynamic topic distributions in sparse data settings. To address this problem, we propose a dynamic user clustering topic model (UCT). UCT adaptively tracks changes of each user’s time-varying topic distributions based both on the short texts the user posts during a given time period and on previously estimated distributions. To infer changes, we propose a Gibbs sampling algorithm where a set of word pairs from each user is constructed for sampling. UCT can be used in two ways: (1) as a short-term dependency model that infers a user’s current topic distribution based on the user’s topic distributions during the previous time period only, and (2) as a long-term dependency model that infers a user’s current topic distributions based on the user’s topic distributions during multiple time periods in the past. The clustering results are explainable and human-understandable, in contrast to many other clustering algorithms. For evaluation purposes, we work with a dataset consisting of users and tweets from each user. Experimental results demonstrate the effectiveness of our proposed short-term and long-term dependency user clustering models compared to state-of-the-art baselines.
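The abstract names two ingredients without giving formulas: word pairs built from each user's short texts, and a current topic distribution informed by one or several past periods. The following is a minimal sketch of those two ideas only, not the paper's UCT model or its Gibbs sampler. The function names, the additive Dirichlet-style prior, the `weights` parameter, and `base_prior` are all assumptions introduced for illustration.

```python
from itertools import combinations


def word_pairs(tokens):
    """Return all unordered pairs of distinct words from one short text.

    Assumption: pairs are formed within a single text, as in biterm-style
    models; the abstract does not specify how UCT builds its pairs.
    """
    unique = sorted(set(tokens))
    return list(combinations(unique, 2))


def smoothed_topic_distribution(counts, previous, weights, base_prior=0.1):
    """Estimate a user's current topic distribution from topic counts.

    counts:     number of word pairs assigned to each topic in the
                current period (length K).
    previous:   past topic distributions, most recent first.
    weights:    one non-negative weight per past distribution
                (hypothetical; controls how strongly the past matters).
    base_prior: small symmetric pseudo-count (hypothetical).

    One past distribution gives a short-term dependency; several give a
    long-term dependency.
    """
    k = len(counts)
    prior = [base_prior] * k
    for dist, weight in zip(previous, weights):
        for z in range(k):
            prior[z] += weight * dist[z]
    total = sum(counts) + sum(prior)
    return [(counts[z] + prior[z]) / total for z in range(k)]


if __name__ == "__main__":
    print(word_pairs(["clustering", "users", "tweets", "users"]))
    # Short-term: only the previous period informs the estimate.
    print(smoothed_topic_distribution([3, 1], [[0.5, 0.5]], [2.0]))
    # Long-term: two past periods, the older one weighted less.
    print(smoothed_topic_distribution([3, 1], [[0.5, 0.5], [0.1, 0.9]], [2.0, 1.0]))
```

In this sketch the past distributions act as pseudo-counts, so a user with few posts in the current period stays close to earlier estimates, which is one simple way to cope with sparse short texts.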

Type: Article
Title: Inferring Dynamic User Interests in Streams of Short Texts for User Clustering
Open access status: An open access version is available from UCL Discovery
DOI: 10.1145/3072606
Publisher version: http://dx.doi.org/10.1145/3072606
Language: English
Additional information: This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
Keywords: Science & Technology, Technology, Computer Science, Information Systems, Computer Science, Diversity, ad hoc retrieval, data streams
UCL classification: UCL
UCL > Provost and Vice Provost Offices
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery.ucl.ac.uk/id/eprint/10045774
Downloads since deposit: 327