Addressing the cold start problem in tag-based recommender systems.
Doctoral thesis, UCL (University College London).
Folksonomies have become a powerful tool to describe, discover, search, and navigate online resources (e.g., pictures, videos, blogs) on the Social Web. Unlike taxonomies and ontologies, which impose a hierarchical categorisation on content, folksonomies directly allow end users to freely create and choose the categories (in this case, tags) that best describe a piece of information. However, the freedom aafforded to users comes at a cost: as tags are defined informally, the retrieval of information becomes more challenging. Different solutions have been proposed to help users discover content in this highly dynamic setting. However, they have proved to be effective only for users who have already heavily used the system (active users) and who are interested in popular items (i.e., items tagged by many other users). In this thesis we explore principles to help both active users and more importantly new or inactive users (cold starters) to find content they are interested in even when this content falls into the long tail of medium-to-low popularity items (cold start items). We investigate the tagging behaviour of users on content and show how the similarities between users and tags can be used to produce better recommendations. We then analyse how users create new content on social tagging websites and show how preferences of only a small portion of active users (leaders), responsible for the vast majority of the tagged content, can be used to improve the recommender system's scalability. We also investigate the growth of the number of users, items and tags in the system over time. We then show how this information can be used to decide whether the benefits of an update of the data structures modelling the system outweigh the corresponding cost. In this work we formalize the ideas introduced above and we describe their implementation. To demonstrate the improvements of our proposal in recommendation efficacy and efficiency, we report the results of an extensive evaluation conducted on three different social tagging websites: CiteULike, Bibsonomy and MovieLens. Our results demonstrate that our approach achieves higher accuracy than state-of-the-art systems for cold start users and for users searching for cold start items. Moreover, while accuracy of our technique is comparable to other techniques for active users, the computational cost that it requires is much smaller. In other words our approach is more scalable and thus more suitable for large and quickly growing settings.
|Title:||Addressing the cold start problem in tag-based recommender systems|
|Open access status:||An open access version is available from UCL Discovery|
|UCL classification:||UCL > School of BEAMS > Faculty of Engineering Science > Computer Science|
Archive Staff Only