A Domain Based Approach to Information Retrieval in Digital Libraries

Rotella, Fulvio; Ferilli, Stefano; Leuzzi, Fabio

doi:10.1007/978-3-642-35834-0_14

Part of the book series: Communications in Computer and Information Science (CCIS, volume 354)

Included in the following conference series:

Italian Research Conference on Digital Libraries

Abstract

The current abundance of electronic documents calls for automatic techniques that help users understand their content and extract useful information. To this end, improving retrieval performance requires going beyond a purely lexical interpretation of user queries and understanding their semantic content and aims. Any digital library would benefit greatly from effective Information Retrieval techniques to offer its users. This paper proposes an approach to Information Retrieval based on matching the domain of discourse of the query with that of the documents in the repository. The match relies on standard general-purpose linguistic resources (WordNet and WordNet Domains) and on a novel similarity assessment technique. Although the work is at a preliminary stage, the initial results are interesting and encourage further extension and improvement of the approach.
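
As a rough illustration of what matching on the domain of discourse can mean, the sketch below ranks documents by the similarity of their domain profiles to that of the query. It is not the authors' technique: the abstract does not describe their similarity measure, so plain cosine similarity is used as a stand-in, and the word-to-domain table `TOY_DOMAINS` is a hand-made, hypothetical substitute for a real WordNet Domains lookup. All function names are likewise invented for this example.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy lexicon: word -> domain labels. A real system would obtain
# these from WordNet Domains; the entries here are invented for illustration.
TOY_DOMAINS = {
    "bank": ["economy", "geography"],
    "loan": ["economy"],
    "interest": ["economy"],
    "river": ["geography"],
    "shore": ["geography"],
    "water": ["geography"],
}

def domain_profile(text):
    """Count how often each domain is evoked by the words of a text."""
    profile = Counter()
    for word in text.lower().split():
        for domain in TOY_DOMAINS.get(word, []):
            profile[domain] += 1
    return profile

def cosine(p, q):
    """Cosine similarity between two sparse count vectors (stand-in measure)."""
    dot = sum(p[d] * q[d] for d in p.keys() & q.keys())
    norm = sqrt(sum(v * v for v in p.values())) * sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0

def rank(query, documents):
    """Return (score, document) pairs sorted by decreasing domain similarity."""
    q = domain_profile(query)
    scored = [(cosine(q, domain_profile(doc)), doc) for doc in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)

docs = [
    "the river shore was covered with water",
    "the bank raised its interest rates",
]
ranking = rank("loan", docs)
```

In this toy setting the query "loan" shares no word with either document, yet the second document ranks first because both evoke the "economy" domain. A purely lexical match would score both documents at zero.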

Author information

Authors and Affiliations

Dipartimento di Informatica, Università di Bari, Italy
Fulvio Rotella, Stefano Ferilli & Fabio Leuzzi
Centro Interdipartimentale per la Logica e sue Applicazioni, Università di Bari, Italy
Stefano Ferilli

Editor information

Editors and Affiliations

Department of Information Engineering, University of Padua, Via Gradenigo, 6/a, 35131, Padua, Italy
Maristella Agosti & Nicola Ferro
Department of Computer Science, University of Bari, Via E. Orabona, 4, 70126, Bari, Italy
Floriana Esposito & Stefano Ferilli

About this paper

Cite this paper

Rotella, F., Ferilli, S., Leuzzi, F. (2013). A Domain Based Approach to Information Retrieval in Digital Libraries. In: Agosti, M., Esposito, F., Ferilli, S., Ferro, N. (eds) Digital Libraries and Archives. IRCDL 2012. Communications in Computer and Information Science, vol 354. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35834-0_14

DOI: https://doi.org/10.1007/978-3-642-35834-0_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35833-3
Online ISBN: 978-3-642-35834-0
eBook Packages: Computer Science, Computer Science (R0)
