Abstract
The recent explosive growth of the Internet has made digital libraries popular, and the user-friendly interface of Web browsers gives users much easier access to them. To retrieve relevant documents, however, the user is typically given a search interface consisting of one input field and one push button. Most users type in a single keyword, click the button, and hope for the best. The result of such a query can be a large unordered set of documents or a list of documents ranked by keyword frequency. Either result can contain articles unrelated to the user's inquiry unless a sophisticated search was performed and the user knows exactly what to look for. More sophisticated algorithms for ranking the relevance of search results may help, but what is badly needed are software tools that can analyze the search results and manipulate large hierarchies of data graphically. In this paper, we present a language-independent document classification system for the Florida Center for Library Automation that helps users analyze the results of search queries. The system is easily accessible through the Web and provides a graphical user interface for displaying the classification results.
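The limitation of frequency-based ranking described in the abstract can be illustrated with a minimal sketch. This is not the system presented in the paper; the function name `rank_by_keyword_frequency`, the tokenization, and the toy documents are all assumptions made purely for illustration.

```python
import re
from collections import Counter


def rank_by_keyword_frequency(documents, keyword):
    """Rank documents by how often a single keyword occurs in each.

    `documents` maps a hypothetical document id to its text.
    Documents that never mention the keyword are left out.
    """
    keyword = keyword.lower()
    scores = []
    for doc_id, text in documents.items():
        # Naive tokenization: lowercase runs of word characters.
        tokens = re.findall(r"\w+", text.lower())
        count = Counter(tokens)[keyword]
        if count > 0:
            scores.append((doc_id, count))
    # Highest count first; ties broken by document id for a stable order.
    return sorted(scores, key=lambda pair: (-pair[1], pair[0]))


# Toy collection (invented for this example).
documents = {
    "a": "Java programming in Java for Java beginners",
    "b": "Coffee from the island of Java",
    "c": "Digital library search interfaces",
}
ranking = rank_by_keyword_frequency(documents, "java")
print(ranking)
```

In this toy collection a single-keyword query for "java" returns both the programming document and the unrelated coffee document. Counting keyword occurrences alone cannot separate the two topics, which is the kind of situation where grouping the results by topic would help the user.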
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Liu, YH. et al. (1998). Visualizing Document Classification: A Search Aid for the Digital Library. In: Nikolaou, C., Stephanidis, C. (eds) Research and Advanced Technology for Digital Libraries. ECDL 1998. Lecture Notes in Computer Science, vol 1513. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-49653-X_33
DOI: https://doi.org/10.1007/3-540-49653-X_33
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65101-7
Online ISBN: 978-3-540-49653-3
eBook Packages: Springer Book Archive