Abstract
A Web Public Access Catalog (WebPAC) is an important feature of modern libraries. In this paper, we propose a meta-search method that gives users simultaneous access to the WebPACs of different libraries. Our method gives a librarian full freedom to select the WebPACs to be incorporated into the service, yet requires no programming effort on the librarian’s part. At the core of our method is a meta-search engine that sends a query to the incorporated WebPACs, receives the results, and post-processes them into a uniform presentation format. To incorporate an existing WebPAC into our system, one needs to analyze the query interaction between the WebPAC and the browser. This can be done by extracting the query parameters from a query and from the query result web pages that follow it. We modeled and abstracted these interactions and defined corresponding XML formats to capture the needed parameters from these web pages. The resulting XML pages are then fed to the search engine, which automatically incorporates the designated WebPAC into its search.
The advantage of our method is that the search engine does not need to be modified when new WebPACs are added. When adding a new WebPAC, the librarian only needs to analyze a few web pages to determine the parameters, and even this step can mostly be automated. To illustrate the effectiveness of our method, we have built a system, called MetaCat, that incorporates the WebPACs of 26 major libraries in Taiwan. MetaCat can be accessed at http://MetaCat.ntu.edu.tw.
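The configuration-driven idea described above can be illustrated with a minimal sketch. It assumes a hypothetical XML descriptor per WebPAC; the element names (webpac, query, param, result, record, field), the example.org URL, and the sample result page are all invented for illustration and are not the XML formats defined in the paper or used by MetaCat. The point is only that the engine code stays unchanged while each new catalog is described by data.

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Hypothetical descriptor for one WebPAC. Element names, the URL, and the
# patterns are illustrative assumptions, not the paper's actual XML format.
DESCRIPTOR = """\
<webpac name="Example Library">
  <query action="http://webpac.example.org/search">
    <param name="searchtype" value="t"/>
    <param name="searcharg" value="{keyword}"/>
  </query>
  <result>
    <record><![CDATA[<tr class="hit">(.*?)</tr>]]></record>
    <field name="title"><![CDATA[<td class="title">(.*?)</td>]]></field>
    <field name="author"><![CDATA[<td class="author">(.*?)</td>]]></field>
  </result>
</webpac>
"""

# Invented result page standing in for what a WebPAC might return.
SAMPLE_HTML = """
<table>
<tr class="hit"><td class="title">Sample Title A</td><td class="author">Author A</td></tr>
<tr class="hit"><td class="title">Sample Title B</td><td class="author">Author B</td></tr>
</table>
"""


def load_descriptor(xml_text):
    """Parse a descriptor into the query parameters and extraction patterns."""
    root = ET.fromstring(xml_text)
    query = root.find("query")
    result = root.find("result")
    return {
        "name": root.get("name"),
        "action": query.get("action"),
        "params": [(p.get("name"), p.get("value")) for p in query.findall("param")],
        "record": re.compile(result.findtext("record"), re.S),
        "fields": {
            f.get("name"): re.compile(f.text, re.S) for f in result.findall("field")
        },
    }


def build_query_url(desc, keyword):
    """Fill the user's keyword into the descriptor's query parameters."""
    params = [(name, value.replace("{keyword}", keyword)) for name, value in desc["params"]]
    return desc["action"] + "?" + urlencode(params)


def extract_records(desc, html):
    """Turn one WebPAC's result page into uniform dictionaries."""
    records = []
    for row in desc["record"].findall(html):
        rec = {"source": desc["name"]}
        for name, pattern in desc["fields"].items():
            match = pattern.search(row)
            rec[name] = match.group(1).strip() if match else None
        records.append(rec)
    return records


desc = load_descriptor(DESCRIPTOR)
print(build_query_url(desc, "digital libraries"))
for rec in extract_records(desc, SAMPLE_HTML):
    print(rec)
```

In this sketch, adding another catalog would mean writing another descriptor rather than touching the three functions. A real deployment would also need to fetch pages over HTTP, handle POST forms, character encodings, and result pagination, none of which is shown here.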
This research is supported in part by the National Science Council of the Republic of China under grant numbers NSC-94-2422-H-002-008 and NSC-93-2213-E-002-039.
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ho, H.I., Hsiang, J. (2005). Configurable Meta-search for Integrating Web Public Access Catalogs. In: Fox, E.A., Neuhold, E.J., Premsmit, P., Wuwongse, V. (eds) Digital Libraries: Implementing Strategies and Sharing Experiences. ICADL 2005. Lecture Notes in Computer Science, vol 3815. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11599517_36
DOI: https://doi.org/10.1007/11599517_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-30850-8
Online ISBN: 978-3-540-32291-7
eBook Packages: Computer Science, Computer Science (R0)