Liang, S; Yilmaz, E; Kanoulas, E; (2019) Collaboratively Tracking Interests for User Clustering in Streams of Short Texts. IEEE Transactions on Knowledge and Data Engineering, 31 (2) pp. 257-272. 10.1109/TKDE.2018.2832211.
Abstract
In this paper, we tackle the problem of clustering users based on the streams of short texts they publish. Clustering users by short text streams is more challenging than clustering them by long documents, because it is difficult to track users’ dynamic interests in sparse streaming data. To obtain better user clustering performance, we propose two collaborative user interest tracking models that track changes in each user’s dynamic topic distribution in collaboration with their followees’ dynamic topic distributions, based on both the content of current short texts and the previously estimated distributions. Our models can be either short-term or long-term dependency topic models. The short-term dependency model collaboratively tracks users’ interests based on users’ topic distributions at the previous time period only, whereas the long-term dependency model collaboratively tracks users’ interests based on users’ topic distributions at multiple past time periods. We also propose two collapsed Gibbs sampling algorithms, one for each of the short-term and long-term dependency topic models, that collaboratively infer users’ dynamic interests for clustering. We evaluate our proposed models on a benchmark dataset consisting of Twitter users and their tweets. Experimental results validate the effectiveness of our proposed models, which integrate both users’ own interests and their collaborative interests for user clustering by short text streams.
Type: | Article |
---|---|
Title: | Collaboratively Tracking Interests for User Clustering in Streams of Short Texts |
Open access status: | An open access version is available from UCL Discovery |
DOI: | 10.1109/TKDE.2018.2832211 |
Publisher version: | http://doi.org/10.1109/TKDE.2018.2832211 |
Language: | English |
Additional information: | This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions. |
Keywords: | Clustering, topic models, streaming text, Twitter |
UCL classification: | UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science |
URI: | https://discovery.ucl.ac.uk/id/eprint/10071612 |