Okapi at TREC-7: automatic ad hoc, filtering, VLC and interactive track
- Stephen Robertson,
- S. Walker,
- M.M. Beaulieu
Published by National Institute of Standards and Technology
The article presents text retrieval research results in the areas of automatic ad hoc retrieval, information filtering, VLC (very large collections) and interactive retrieval. For automatic ad hoc retrieval, three runs were submitted: medium (title and description), short (title only), and a run that combined a long run (title, description and narrative) with the medium and short runs. The average precision of the combined run was several percent higher than that of any other submitted run, but another participant recently noticed an impossibly high score for one topic in the short run. This led to the discovery that, owing to a mistake in the indexing procedures, part of the SUBJECT field of the LA Times documents had been indexed. Use of this field was explicitly forbidden in the guidelines for the ad hoc track. The official runs were repeated against a corrected index, and the corrected results are presented; average precisions were reduced by about 2-4%. In the area of adaptive filtering, efforts focused on the twin problems of (a) starting from scratch, with no assumed history of relevance judgments for each topic, and (b) having to define a threshold for retrieval. For VLC, four runs on the full database were submitted, together with one each on the 10% and 1% collections. In the area of interactive retrieval, two pairwise comparisons were made: Okapi with relevance feedback against Okapi without, and Okapi without against ZPrise without.
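The abstract reports its ad hoc results as average precision without defining the measure. As a minimal sketch, assuming the usual non-interpolated definition (precision taken at the rank of each relevant document, divided by the number of relevant documents for the topic, then averaged over topics), it could be computed as below. The function names, the document identifiers and the data layout are hypothetical and are not taken from the paper or from any Okapi code.

```python
def average_precision(ranked_doc_ids, relevant_ids):
    """Non-interpolated average precision for a single topic.

    Assumes ranked_doc_ids holds no duplicate document ids.
    """
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_doc_ids, start=1):
        if doc_id in relevant:
            hits += 1
            # Precision at the rank where this relevant document appears.
            precision_sum += hits / rank
    # Relevant documents that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(run, qrels):
    """Mean of per-topic average precision over all judged topics.

    run:   dict mapping topic id -> ranked list of document ids (hypothetical layout)
    qrels: dict mapping topic id -> collection of relevant document ids
    """
    scores = [average_precision(run.get(topic, []), rel)
              for topic, rel in qrels.items()]
    return sum(scores) / len(scores) if scores else 0.0
```

Under this definition, a single topic in which forbidden field content pushes relevant documents to the top of the ranking raises that topic's score sharply, which is consistent with the anomaly having been noticed on one topic of the short run.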