Abstract
This paper compares two Java-based persistent storage mechanisms, a commercially available object-oriented database (OODB) system and the persistent storage tool PJama [1], on their suitability for implementing the index component of an experimental World Wide Web search engine called WWLib-TNG. Persistence is provided to the builder component of the search engine, which constructs a catalogue of Web pages. The searcher component of the engine searches the catalogue. The PJama and OODB implementations of the builder were compared on the time taken to construct a catalogue, efficiency of disk space use and scalability. The implementations of the searcher were compared on response time and scalability. The comparison showed that, for this application, PJama performs better than the OODB: it gave 300% better build performance and was more scalable.
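The builder and searcher roles described above can be illustrated with a minimal Java sketch. It makes several assumptions that are not taken from the paper: the catalogue is modelled as a simple inverted index from terms to page URLs, all class and method names (`CatalogueSketch`, `Catalogue`, `addPage`, `search`, `store`, `load`) are hypothetical, and standard `java.io` serialization is used purely as a stand-in for a persistence layer. It is neither PJama nor the OODB, and it does not reflect how WWLib-TNG actually structures its index.

```java
import java.io.*;
import java.util.*;

// Hypothetical sketch: none of these names come from WWLib-TNG.
class CatalogueSketch {
    // Assumed catalogue layout: term -> URLs of the pages containing it.
    static class Catalogue implements Serializable {
        private static final long serialVersionUID = 1L;
        final Map<String, Set<String>> postings = new HashMap<>();
    }

    // Builder role: add the terms of one page to the catalogue.
    static void addPage(Catalogue c, String url, String text) {
        for (String term : text.toLowerCase().split("\\W+")) {
            if (!term.isEmpty()) {
                c.postings.computeIfAbsent(term, k -> new TreeSet<>()).add(url);
            }
        }
    }

    // Searcher role: return the URLs recorded for a single term.
    static Set<String> search(Catalogue c, String term) {
        return c.postings.getOrDefault(term.toLowerCase(), Collections.emptySet());
    }

    // Stand-in persistence: write the whole catalogue with java.io serialization.
    static byte[] store(Catalogue c) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(c);
        }
        return bytes.toByteArray();
    }

    // Stand-in persistence: read a catalogue back from its serialized form.
    static Catalogue load(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Catalogue) in.readObject();
        }
    }
}
```

In this sketch the builder would call `addPage` once per Web page and then `store` the result, while the searcher would `load` the catalogue and call `search`. Explicit store and load steps are a property of the serialization stand-in only; an orthogonally persistent system such as PJama is designed so that application code need not contain them.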
References
Atkinson, M. P., Daynes, L., Jordan, M. J., Printezis, T. and Spence, S., “An Orthogonally Persistent Java”, ACM SIGMOD Record, 24(4), December 1996.
Burden, J. P. H. and Jackson, M. S., “WWLib-TNG-New directions in Search Engine Technology”, IEE Informatics Colloquium: Lost in the Web-Navigation on the Internet, pp. 10/1–10/8, November 1999.
Wallis, J. and Burden, J.P.H., “Towards a Classification-based Approach to Resource Discovery on the Web”, Proceedings of the 4th International W4G Workshop on Design and Electronic Publishing, Abingdon (near Oxford), England, 20–22 November 1995.
Jenkins, C., Jackson, M., Burden, P. and Wallis, J., “The Wolverhampton Web Library (WWLib) and Automatic Classification”, Proceedings of the First International Workshop on Libraries and WWW, Brisbane, Queensland, Australia, 14th April 1998.
Jenkins, C., Jackson, M., Burden, P. and Wallis, J., “Automatic Classification of Web Resources using Java and Dewey Decimal Classifications”, Proceedings of the Seventh International World Wide Web Conference, Brisbane, Queensland, Australia, 14–18 April 1998.
Mai Chan, L., Comaromi, J. P., Mitchell, J. S. and Satija, M. P., Dewey Decimal Classification: A Practical Guide. Forest Press, ISBN 0-910608-55-5, 1996.
Salton, G. and McGill, M. J., Introduction to Modern Information Retrieval. New York: McGraw Hill, 1983.
Faloutsos, C., “Signature Files”, in Frakes, W. B. and Baeza-Yates, R. (eds.) Information Retrieval Data Structures and Algorithms. New Jersey: Prentice Hall, pp. 44–65, 1992.
Gonnet, G. H., Baeza-Yates, R.A. and Snider, T., “New Indices for Text: PAT trees and PAT arrays”, in Frakes, W. B. and Baeza-Yates, R. (eds.) Information Retrieval Data Structures and Algorithms. New Jersey: Prentice Hall, pp. 66–82, 1992.
Bayer, R. and McCreight, E., “Organisation and Maintenance of Large Ordered Indexes”, Acta Informatica, 1(3), pp. 173–189, 1972.
Zobel, J., Moffat, A. and Ramamohanarao, K., “Inverted Files Versus Signature Files for Text Indexing”, ACM Transactions on Database Systems, 23(4), pp. 453–490, December 1998.
Garratt, A., Jackson, M., Burden, P. and Wallis, J., “Implementing a search engine using an OODB”, To appear in L’objet, 6(3), 2000.
Jansen, M. B. J., Spink, A., Bateman, J. and Saracevic, T., “Real Life Information Retrieval: A Study Of User Queries On The Web”, SIGIR FORUM, 32(1), pp. 5–17, Spring 1998.
© 2001 Springer-Verlag Berlin Heidelberg
Garratt, A., Jackson, M., Burden, P., Wallis, J. (2001). A Comparison of Two Persistent Storage Tools for Implementing a Search Engine. In: Kirby, G.N.C., Dearle, A., Sjøberg, D.I.K. (eds) Persistent Object Systems: Design, Implementation, and Use. POS 2000. Lecture Notes in Computer Science, vol 2135. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45498-5_15
DOI: https://doi.org/10.1007/3-540-45498-5_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42735-3
Online ISBN: 978-3-540-45498-4
eBook Packages: Springer Book Archive