AN UPPERBOUND TO THE PERFORMANCE OF RANKED‐OUTPUT SEARCHING: OPTIMAL WEIGHTING OF QUERY TERMS USING A GENETIC ALGORITHM
Abstract
This paper describes the development of a genetic algorithm (GA) for the assignment of weights to query terms in a ranked‐output document retrieval system. The GA involves a fitness function that is based on full relevance information, and the rankings resulting from the use of these weights are compared with those resulting from the Robertson‐Sparck Jones F4 retrospective relevance weight. Extended experiments with seven document test collections show that the GA can often find weights that are slightly superior to those produced by the deterministic weighting scheme. That said, there are many cases where the two approaches give the same results, and a few cases where the F4 weights are superior to the GA weights. Since the GA has been designed to identify weights yielding the best possible level of retrospective performance, these results indicate that the F4 weights provide an excellent and practicable alternative. Evidence is presented to suggest that negative weights may play an important role in retrospective relevance weighting.
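As an illustration of the deterministic baseline mentioned above, the following is a minimal sketch of the F4 relevance weight in its commonly cited form with the 0.5 correction, together with a simple sum-of-weights ranking. It is not taken from the paper: the function names, argument names, and the ranking helper are assumptions made for this example, and the paper's GA and its fitness function are not reproduced here.

```python
import math


def f4_weight(r, n, R, N):
    """Robertson-Sparck Jones F4 weight with the usual 0.5 correction.

    r: number of relevant documents containing the term
    n: number of documents containing the term
    R: number of relevant documents for the query
    N: number of documents in the collection
    (Argument names are illustrative, not the paper's notation.)
    """
    numerator = (r + 0.5) * (N - n - R + r + 0.5)
    denominator = (R - r + 0.5) * (n - r + 0.5)
    return math.log(numerator / denominator)


def rank_documents(docs, weights):
    """Rank documents by the sum of the weights of the query terms they contain.

    docs: mapping of document id to the set of terms in that document
    weights: mapping of query term to its weight (hypothetical helper)
    """
    scores = {
        doc_id: sum(weights.get(term, 0.0) for term in terms)
        for doc_id, terms in docs.items()
    }
    # Highest score first; ties broken by document id for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
```

Under this formula, a term that occurs mostly in non-relevant documents receives a negative weight, for example `f4_weight(0, 50, 10, 100)`, which relates to the abstract's remark on negative weights.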
Citation
ROBERTSON, A.M. and WILLETT, P. (1996), "AN UPPERBOUND TO THE PERFORMANCE OF RANKED‐OUTPUT SEARCHING: OPTIMAL WEIGHTING OF QUERY TERMS USING A GENETIC ALGORITHM", Journal of Documentation, Vol. 52 No. 4, pp. 405-420. https://doi.org/10.1108/eb026973
Publisher
MCB UP Ltd
Copyright © 1996, MCB UP Limited