A Domain Based Approach to Information Retrieval in Digital Libraries

  • Conference paper
Digital Libraries and Archives (IRCDL 2012)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 354)

Abstract

The current abundance of electronic documents calls for automatic techniques that help users understand their content and extract useful information. To this end, improving retrieval performance requires going beyond a purely lexical interpretation of user queries and understanding their semantic content and aims. Any digital library would benefit greatly from effective Information Retrieval techniques to offer its users. This paper proposes an approach to Information Retrieval based on matching the domain of discourse of the query against that of the documents in the repository. This association relies on standard general-purpose linguistic resources (WordNet and WordNet Domains) and on a novel similarity assessment technique. Although the work is at a preliminary stage, the initial results are interesting and encourage further extension and improvement of the approach.
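As a rough illustration of domain-based matching, the sketch below builds a domain profile for a query and for each document, then ranks the documents by how closely their profiles match the query's. It is not the paper's method: the abstract does not describe the novel similarity technique, so plain cosine similarity is swapped in, and the `TERM_DOMAINS` table is a hypothetical toy stand-in for WordNet Domains labels. The function names `domain_profile`, `cosine`, and `rank` are likewise assumptions made for this example.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy table standing in for WordNet Domains labels.
# A real system would look these up from the linguistic resource.
TERM_DOMAINS = {
    "bank": ["economy", "geography"],  # ambiguous term: two domains
    "loan": ["economy"],
    "interest": ["economy"],
    "river": ["geography"],
    "shore": ["geography"],
    "water": ["geography"],
}


def domain_profile(text):
    """Count how often each domain is evoked by the known terms in text."""
    profile = Counter()
    for token in text.lower().split():
        for domain in TERM_DOMAINS.get(token, []):
            profile[domain] += 1
    return profile


def cosine(a, b):
    """Cosine similarity between two domain profiles (0.0 if either is empty)."""
    dot = sum(a[d] * b[d] for d in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, documents):
    """Return (score, document) pairs sorted by decreasing domain similarity."""
    query_profile = domain_profile(query)
    scored = [(cosine(query_profile, domain_profile(doc)), doc) for doc in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    docs = ["river shore water", "loan interest"]
    for score, doc in rank("bank loan", docs):
        print(f"{score:.3f}  {doc}")
```

In this toy setup the query "bank loan" leans toward the economy domain, so the document "loan interest" ranks above "river shore water" even though neither document contains the word "bank".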

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Rotella, F., Ferilli, S., Leuzzi, F. (2013). A Domain Based Approach to Information Retrieval in Digital Libraries. In: Agosti, M., Esposito, F., Ferilli, S., Ferro, N. (eds) Digital Libraries and Archives. IRCDL 2012. Communications in Computer and Information Science, vol 354. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35834-0_14

  • DOI: https://doi.org/10.1007/978-3-642-35834-0_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-35833-3

  • Online ISBN: 978-3-642-35834-0

  • eBook Packages: Computer Science, Computer Science (R0)
