ABSTRACT
Considerable research effort has been invested in improving the effectiveness of information retrieval systems. Techniques such as relevance feedback, thesaural expansion, and pivoting all provide better quality responses to queries when tested in standard evaluation frameworks. But such enhancements can add to the cost of evaluating queries. In this paper we consider the pragmatic issue of how to improve the cost-effectiveness of searching. We describe a new inverted file structure using quantized weights that provides superior retrieval effectiveness compared to conventional inverted file structures when early termination heuristics are employed. That is, we are able to reach similar effectiveness levels with less computational cost, and so provide a better cost/performance compromise than previous inverted file organisations.
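The general idea of combining quantized weights with early termination can be illustrated with a minimal sketch. It is not the structure described in the paper, whose details the abstract does not give. It assumes uniform quantization of term weights into a small number of integer levels, postings sorted by decreasing quantized weight, and a simple cap on the number of postings processed. The names `quantize`, `build_index`, `search`, `levels`, and `budget` are hypothetical.

```python
from collections import defaultdict


def quantize(weight, max_weight, levels=8):
    """Map a positive weight onto an integer in 1..levels (uniform buckets; an assumption)."""
    bucket = int(weight / max_weight * levels) + 1
    return min(levels, max(1, bucket))


def build_index(doc_term_weights, levels=8):
    """Build {term: [(quantized_weight, doc_id), ...]} from {doc_id: {term: weight}}.

    Each postings list is sorted by decreasing quantized weight so that the
    most influential postings are encountered first.
    """
    max_w = max(w for terms in doc_term_weights.values() for w in terms.values())
    index = defaultdict(list)
    for doc_id, terms in doc_term_weights.items():
        for term, w in terms.items():
            index[term].append((quantize(w, max_w, levels), doc_id))
    for postings in index.values():
        postings.sort(key=lambda p: (-p[0], p[1]))
    return dict(index)


def search(index, query_terms, k=10, budget=None):
    """Accumulate scores from the highest-weight postings first.

    If `budget` is given, only that many postings are processed: a simple
    early termination heuristic that trades exactness for lower cost.
    """
    candidates = []
    for term in query_terms:
        candidates.extend(index.get(term, []))
    candidates.sort(key=lambda p: (-p[0], p[1]))
    if budget is not None:
        candidates = candidates[:budget]
    accumulators = defaultdict(int)
    for q_weight, doc_id in candidates:
        accumulators[doc_id] += q_weight
    ranked = sorted(accumulators.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]


if __name__ == "__main__":
    docs = {1: {"a": 1.0, "b": 0.5}, 2: {"a": 0.25}, 3: {"b": 1.0}}
    idx = build_index(docs, levels=4)
    print(search(idx, ["a", "b"], k=2))            # full evaluation
    print(search(idx, ["a", "b"], k=2, budget=2))  # early termination
```

In this toy collection, processing only the two highest-weight postings returns the same top two documents as full evaluation, although with different scores. That is the kind of cost/effectiveness trade-off the abstract refers to.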
Index Terms
- Vector-space ranking with effective early termination