Vector-space ranking with effective early termination

Authors:
Vo Ngoc Anh, Univ. of Melbourne, Victoria, Australia
Owen de Kretser, Univ. of Melbourne, Victoria, Australia
Alistair Moffat, Univ. of Melbourne, Victoria, Australia

SIGIR '01: Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval, September 2001, Pages 35–42
https://doi.org/10.1145/383952.383957

Published: 01 September 2001

ABSTRACT

Considerable research effort has been invested in improving the effectiveness of information retrieval systems. Techniques such as relevance feedback, thesaural expansion, and pivoting all provide better quality responses to queries when tested in standard evaluation frameworks. But such enhancements can add to the cost of evaluating queries. In this paper we consider the pragmatic issue of how to improve the cost-effectiveness of searching. We describe a new inverted file structure using quantized weights that provides superior retrieval effectiveness compared to conventional inverted file structures when early termination heuristics are employed. That is, we are able to reach similar effectiveness levels with less computational cost, and so provide a better cost/performance compromise than previous inverted file organisations.
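
The abstract does not spell out the index layout, so the following is only a minimal sketch of the general idea it names: store small integer (quantized) weights in the inverted lists, process postings from highest weight to lowest, and stop early once a processing budget is spent. The TF-IDF weighting, the uniform bucketing, and the names `quantize`, `build_index`, `search`, `levels`, and `budget` are all assumptions made for illustration and are not taken from the paper.

```python
import heapq
import math
from collections import defaultdict
from itertools import islice


def quantize(weight, max_weight, levels=8):
    """Map a weight in (0, max_weight] to an integer in 1..levels (uniform buckets)."""
    return max(1, min(levels, math.ceil(levels * weight / max_weight)))


def build_index(docs, levels=8):
    """Build term -> [(impact, doc_id), ...], each list sorted by decreasing impact.

    docs maps doc_id -> list of terms and is assumed to be non-empty.
    The TF-IDF weight used here is an illustrative choice only.
    """
    tf = defaultdict(lambda: defaultdict(int))
    for doc_id, terms in docs.items():
        for term in terms:
            tf[term][doc_id] += 1

    n_docs = len(docs)
    weights = {}
    for term, postings in tf.items():
        idf = math.log(1 + n_docs / len(postings))
        for doc_id, count in postings.items():
            weights[(term, doc_id)] = (1 + math.log(count)) * idf

    max_w = max(weights.values())
    index = defaultdict(list)
    for (term, doc_id), w in weights.items():
        index[term].append((quantize(w, max_w, levels), doc_id))
    for postings in index.values():
        postings.sort(key=lambda p: (-p[0], p[1]))
    return dict(index)


def search(index, query_terms, k=3, budget=None):
    """Accumulate scores in decreasing impact order, stopping after `budget` postings.

    budget=None processes every posting (no early termination).
    """
    lists = [index.get(term, []) for term in sorted(set(query_terms))]
    merged = heapq.merge(*lists, key=lambda p: (-p[0], p[1]))
    scores = defaultdict(int)
    for impact, doc_id in islice(merged, budget):
        scores[doc_id] += impact
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:k]


if __name__ == "__main__":
    docs = {
        1: ["early", "termination", "ranking"],
        2: ["ranking", "ranking", "vector"],
        3: ["vector", "space"],
    }
    index = build_index(docs)
    print(search(index, ["ranking", "vector"]))            # full evaluation
    print(search(index, ["ranking", "vector"], budget=1))  # early termination
```

Because the highest-impact postings are consumed first, a small `budget` still tends to surface the strongest candidates, which is the kind of cost/effectiveness trade-off the abstract refers to. How closely this matches the paper's actual structure cannot be judged from the abstract alone.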

Published in
SIGIR '01: Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
September 2001
454 pages
ISBN: 1581133316
DOI: 10.1145/383952
Chairmen:
Donald H. Kraft, Louisiana State Univ.
W. Bruce Croft, University of Massachusetts (For the Americas)
David J. Harper, The Robert Gordon University (For Europe and Africa)
Justin Zobel, RMIT University (For Asia and Australasia)
Copyright © 2001 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Acceptance Rates
SIGIR '01 paper acceptance rate: 47 of 201 submissions, 23%. Overall acceptance rate: 792 of 3,983 submissions, 20%.