Gorla, JY;
(2016)
A bi-directional unified Model for information retrieval.
Doctoral thesis, UCL (University College London).
Abstract
Relevance matching between two information objects such as a document and query or a user and product (e.g. movie) is an important problem in information retrieval systems. The most common and most successful way to approach this problem is by probabilistically modelling the relevance between information objects, and computing their relevance matching as the probability of relevance. The objective of a probabilistic relevance retrieval model is to compute the probability of relevance between a given information object pair using all the available information about the individual objects (e.g., document and query), the existing relevance information on both objects and all the information available on other information objects (other documents, queries in the collection and the relevance information on them). The probabilistic retrieval models developed to date are not capable of utilising all available information due to the lack of a unified theory for relevance modelling. More than three decades ago, the notion of simultaneously utilising the relevance information about individual user needs and individual documents to come to a retrieval decision was formalised as the problem of a unified relevance model for Information Retrieval (IR). Since the inception of the unified model, a number of unsuccessful attempts have been made to develop a formal probabilistic relevance model to solve the problem. This thesis provides a new theory and a probabilistic relevance framework that not only solves the problem of the original unified relevance model but also provides the capability to utilise any available information about the information objects in computing the probability of relevance. In this thesis, we consider information matching between two objects (e.g. documents and queries) to be bi-directional preference matching and the relevance between them is thus established and estimated on top of the bi-directional relationship. 
A key benefit of this bi-directional approach is that the resulting probabilistic bi-directional unified model not only solves the original problem of a unified model in information retrieval but also has the ability to incorporate all of the available information on the information objects (documents and queries) into a single model while computing the probability of relevance. Theoretically, we demonstrate the effectiveness of applying our single framework by deriving relevance ranking functions for popular retrieval scenarios such as collaborative filtering (recommendation), group recommendation and ad-hoc retrieval. In the past, relevance matching in each of these retrieval scenarios was approached with a different solution/framework, partly due to the kind of information available to the retrieval system for computing the probability of relevance. However, the underlying problem of information matching is the same in all scenarios, and a solution to the problem of a unified model should be applicable to all scenarios. One of the interesting aspects of our new theory and model, when applied to a collaborative filtering scenario, is that it computes the probability of relevance between a given user and a given item without applying any dimensionality reduction technique or computing the explicit similarity between users/items, in contrast to state-of-the-art collaborative filtering/recommender models (e.g. Matrix Factorisation methods, neighbourhood-based methods). This property allows the retrieval model to model users and items independently with their own features, rather than forcing it to use a common feature space (e.g., common hidden factor-features between a user-item pair of objects or a common vocabulary space between a document-query pair of objects).
The effectiveness of this theoretical framework is demonstrated in various real-world applications by experimenting on datasets in collaborative filtering, group recommendation and ad-hoc retrieval tasks. For collaborative filtering and group recommendation, the model convincingly outperforms various state-of-the-art recommender models (or frameworks). For ad-hoc retrieval, the model also outperforms the state-of-the-art information retrieval models when it is restricted to using the same information used by the other models. The bi-directional unified model allows the building of both search and personalisation/recommender (or collaborative filtering) systems from a single model, which has not been possible before with the existing probabilistic relevance models. Finally, our theory and its framework have been adopted by some large companies in gaming, venture-capital matching, retail and media, and deployed on their web systems to match their customers, often in the tens of millions, with relevant content.
Type: | Thesis (Doctoral) |
---|---|
Title: | A bi-directional unified Model for information retrieval |
Event: | UCL (University College London) |
Open access status: | An open access version is available from UCL Discovery |
Language: | English |
UCL classification: | UCL > Provost and Vice Provost Offices; UCL > Provost and Vice Provost Offices > UCL BEAMS; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science |
URI: | https://discovery.ucl.ac.uk/id/eprint/1474236 |