Abstract
Today, users expect a variety of digital libraries to be searchable from a single Web page. The German Vascoda project provides this service for dozens of information sources. Its ultimate goal is to provide search quality close to that of a central database containing the documents of all participating libraries. Currently, however, the Vascoda portal is based on a non-cooperative metasearch approach, in which results from the sources are merged randomly, so ranking quality is sub-optimal. In this paper, we describe a Lucene-based plugin that replaces this method with a truly federated search across different search engines, in which the exchange of document statistics improves document ranking. Preliminary evaluation results show rankings equal to those of a centralized setup.
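The idea of exchanging document statistics can be illustrated with a small sketch. It is not the plugin's actual code or Lucene's API: it assumes a plain tf-idf score, and every function name (`source_stats`, `merge_stats`, `federated_search`, `centralized_search`) and every sample document is hypothetical. Each source reports only its document count and per-term document frequencies. Once these are summed, every source can score its documents with the same global weights, so the merged ranking matches that of a single central index.

```python
import math
from collections import Counter


def source_stats(docs):
    """Statistics one source would export: document count and per-term document frequency."""
    df = Counter()
    for doc in docs:
        df.update(set(doc.split()))
    return {"num_docs": len(docs), "df": df}


def merge_stats(stats_list):
    """Sum the exported statistics of all sources into global statistics."""
    df = Counter()
    for stats in stats_list:
        df.update(stats["df"])
    return {"num_docs": sum(s["num_docs"] for s in stats_list), "df": df}


def idf(term, stats):
    # Smoothed inverse document frequency (an assumed formula, chosen for illustration).
    return math.log((1 + stats["num_docs"]) / (1 + stats["df"][term])) + 1.0


def score(query, doc, stats):
    tf = Counter(doc.split())
    return sum(tf[term] * idf(term, stats) for term in query.split())


def rank(query, docs, stats):
    hits = [(score(query, doc, stats), doc) for doc in docs]
    # Sort by descending score; break ties by document text for a stable order.
    return [doc for s, doc in sorted(hits, key=lambda h: (-h[0], h[1])) if s > 0]


def federated_search(query, sources):
    """Score documents of all sources with the merged (global) statistics."""
    global_stats = merge_stats([source_stats(docs) for docs in sources])
    return rank(query, [doc for docs in sources for doc in docs], global_stats)


def centralized_search(query, sources):
    """Reference setup: one index built over the union of all documents."""
    all_docs = [doc for docs in sources for doc in docs]
    return rank(query, all_docs, source_stats(all_docs))


source_a = ["lucene search engine", "federated search lucene", "lucene index"]
source_b = ["digital library portal", "lucene plugin for library search"]
federated = federated_search("lucene library", [source_a, source_b])
central = centralized_search("lucene library", [source_a, source_b])
```

In this toy setup the two rankings coincide because summing per-source document frequencies over disjoint collections gives exactly the frequencies of the combined collection.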
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chernov, S., Kohlschütter, C., Nejdl, W. (2006). A Plugin Architecture Enabling Federated Search for Digital Libraries. In: Sugimoto, S., Hunter, J., Rauber, A., Morishima, A. (eds) Digital Libraries: Achievements, Challenges and Opportunities. ICADL 2006. Lecture Notes in Computer Science, vol 4312. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11931584_23
DOI: https://doi.org/10.1007/11931584_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49375-4
Online ISBN: 978-3-540-49377-8
eBook Packages: Computer Science, Computer Science (R0)