Re-ranking Documents Based on Query-Independent Document Specificity

Zheng, L; Cox, IJ; (2009) Re-ranking Documents Based on Query-Independent Document Specificity. In: Andreasen, T and Yager, RR and Bulskov, H and Christiansen, H and Larsen, HL, (eds.) FLEXIBLE QUERY ANSWERING SYSTEMS: 8TH INTERNATIONAL CONFERENCE, FQAS 2009. (pp. 201 - 214). SPRINGER-VERLAG BERLIN

Full text not available from this repository.

Abstract

The use of query-independent knowledge to improve the ranking of documents in information retrieval has proven very effective in the context of web search. This query-independent knowledge is derived from an analysis of the graph structure of hypertext links between documents. However, there are many cases where explicit hypertext links are absent or sparse, e.g. corporate intranets. Previous work has sought to induce a graph link structure based on various measures of similarity between documents. After inducing these links, standard link analysis algorithms, e.g. PageRank, can then be applied. In this paper, we propose and examine an alternative approach to deriving query-independent knowledge, one that is not based on link analysis. Instead, we analyze each document independently and calculate a "specificity" score, based on (i) normalized inverse document frequency, and (ii) term entropies. Two re-ranking strategies, i.e. hard cutoff and soft cutoff, are then discussed as ways to use our query-independent "specificity" scores. Experiments on standard TREC test sets show that our re-ranking algorithms produce gains in mean reciprocal rank of about 4%, and 4% to 6% gains in precision at 5 and 10, respectively, when using the collection of TREC disk 4 and queries from TREC 8 ad hoc topics. Empirical tests demonstrate that the entropy-based algorithm produces stable results across (i) retrieval models, (ii) query sets, and (iii) collections.
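The abstract does not give the scoring formulas, so the following is only a rough sketch of the general idea, not the authors' method. It assumes a document's specificity is the mean of its terms' inverse document frequencies, each divided by the largest possible IDF in the collection. It also assumes that a "hard cutoff" demotes documents below a specificity threshold and that a "soft cutoff" blends specificity into the retrieval score. All function names, the `threshold` and `weight` parameters, and the toy collection are hypothetical. The entropy-based variant is not sketched.

```python
import math
from collections import Counter


def normalized_idf(docs):
    """Map each term to idf / max_idf so that values fall in [0, 1].

    `docs` maps a document id to its list of tokens. Dividing by log(N),
    the IDF of a term found in exactly one document, is an assumed
    normalization, not one taken from the paper.
    """
    n = len(docs)
    df = Counter(term for tokens in docs.values() for term in set(tokens))
    max_idf = math.log(n) if n > 1 else 0.0
    if max_idf == 0.0:
        return {term: 0.0 for term in df}
    return {term: math.log(n / count) / max_idf for term, count in df.items()}


def specificity_scores(docs):
    """Query-independent score per document: mean normalized IDF (assumed)."""
    nidf = normalized_idf(docs)
    return {
        doc_id: (sum(nidf[t] for t in tokens) / len(tokens)) if tokens else 0.0
        for doc_id, tokens in docs.items()
    }


def rerank_hard(results, spec, threshold):
    """Assumed hard cutoff: documents below `threshold` move to the end.

    `results` is a list of (doc_id, retrieval_score) in ranked order;
    the original order is preserved within each group.
    """
    kept = [r for r in results if spec[r[0]] >= threshold]
    demoted = [r for r in results if spec[r[0]] < threshold]
    return kept + demoted


def rerank_soft(results, spec, weight=0.5):
    """Assumed soft cutoff: linear blend of retrieval score and specificity.

    Retrieval scores are assumed to be scaled to [0, 1] already.
    """
    blended = [
        (doc_id, (1 - weight) * score + weight * spec[doc_id])
        for doc_id, score in results
    ]
    return sorted(blended, key=lambda pair: pair[1], reverse=True)


# Toy collection: "the" occurs everywhere, so it contributes no specificity.
docs = {
    "a": ["the", "cat"],
    "b": ["the", "dog"],
    "c": ["the", "entropy", "retrieval"],
}
spec = specificity_scores(docs)
initial = [("a", 0.9), ("c", 0.8), ("b", 0.7)]
hard = rerank_hard(initial, spec, threshold=0.6)
soft = rerank_soft(initial, spec, weight=0.5)
```

In this toy run, document "c" has the largest share of rare terms, so both strategies move it ahead of "a" even though "a" had the higher retrieval score.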

Type: Proceedings paper
Title: Re-ranking Documents Based on Query-Independent Document Specificity
Event: 8th International Conference on Flexible Query Answering Systems
Location: Roskilde Univ, Dept Commun, Business & Informat Technol, Roskilde, DENMARK
Dates: 2009-10-26 - 2009-10-28
ISBN-13: 978-3-642-04956-9
Keywords: Query-independent knowledge, Specificity, Normalized inverse document frequency, Entropy, Ranking, Information retrieval
UCL classification: UCL > School of BEAMS > Faculty of Engineering Science > Computer Science
