Abstract
Today, users expect a variety of digital libraries to be searchable from a single Web page. The German Vascoda project provides this service for dozens of information sources. Its ultimate goal is to provide search quality close to that of a central database containing the documents from all participating libraries. Currently, however, the Vascoda portal is based on a non-cooperative metasearch approach, in which results from the sources are merged randomly and ranking quality is sub-optimal. In this paper, we describe a Lucene-based plugin that replaces this method with a truly federated search across different search engines, in which the exchange of document statistics improves document ranking. Preliminary evaluation results show ranking results equal to those of a centralized setup.
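To illustrate why exchanging document statistics can make a federated ranking match a centralized one, the following is a minimal sketch under stated assumptions. It is not the plugin described in the paper and does not use Lucene's scoring formula: it assumes a simplified TF-IDF score, whitespace tokenization, and in-memory sources, and all function names (local_stats, global_stats, score, federated_search, centralized_search) are hypothetical. Each source reports its document count and per-term document frequencies, the sums are used as collection-wide statistics, and every document is then scored against those shared statistics so that scores from different sources are comparable.

```python
import math
from collections import Counter


def local_stats(docs):
    """Return (document count, document frequency per term) for one source."""
    df = Counter()
    for text in docs.values():
        # Count each term once per document.
        df.update(set(text.lower().split()))
    return len(docs), df


def global_stats(per_source):
    """Sum per-source statistics into collection-wide statistics."""
    total_docs = sum(n for n, _ in per_source)
    total_df = Counter()
    for _, df in per_source:
        total_df.update(df)  # Counter.update adds the counts
    return total_docs, total_df


def score(text, query, n_docs, df):
    """Simplified TF-IDF score (an assumption, not Lucene's formula)."""
    tf = Counter(text.lower().split())
    total = 0.0
    for term in query.lower().split():
        if tf[term] and df[term]:
            total += tf[term] * math.log(1 + n_docs / df[term])
    return total


def federated_search(sources, query):
    """Score every source's documents with the shared global statistics."""
    n_docs, df = global_stats([local_stats(docs) for docs in sources])
    hits = [(score(text, query, n_docs, df), doc_id)
            for docs in sources
            for doc_id, text in docs.items()]
    # Sort by descending score, breaking ties by document id.
    ranked = sorted(hits, key=lambda h: (-h[0], h[1]))
    return [doc_id for s, doc_id in ranked if s > 0]


def centralized_search(sources, query):
    """Baseline: put all documents into one index and search it."""
    merged = {doc_id: text for docs in sources for doc_id, text in docs.items()}
    return federated_search([merged], query)


if __name__ == "__main__":
    # Two toy sources with made-up documents.
    sources = [
        {"a1": "lucene search engine", "a2": "digital library portal"},
        {"b1": "federated search search", "b2": "library search"},
    ]
    print(federated_search(sources, "search library"))
    print(centralized_search(sources, "search library"))
```

In this sketch the two rankings coincide because the summed statistics are identical to those of the merged collection. If each source instead scored with only its own statistics, the same term could receive a different weight in each source and the merged order could differ from the centralized one.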
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chernov, S., Kohlschütter, C., Nejdl, W. (2006). A Plugin Architecture Enabling Federated Search for Digital Libraries. In: Sugimoto, S., Hunter, J., Rauber, A., Morishima, A. (eds) Digital Libraries: Achievements, Challenges and Opportunities. ICADL 2006. Lecture Notes in Computer Science, vol 4312. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11931584_23
DOI: https://doi.org/10.1007/11931584_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49375-4
Online ISBN: 978-3-540-49377-8
eBook Packages: Computer Science, Computer Science (R0)