A Hybrid Approach for Indexing and Retrieval of Archaeological Textual Information

  • Conference paper
Knowledge-Based and Intelligent Information and Engineering Systems (KES 2010)

Abstract

This paper focuses on the problem of archaeological textual information retrieval, covering various field-related topics, and investigating different issues related to special characteristics of Arabic.

The suggested hybrid retrieval approach employs various clustering and classification methods that enhance both retrieval and presentation and infer further information from the results returned by a primary retrieval engine, which in turn uses Latent Semantic Analysis (LSA) as its retrieval method. In addition, a stemmer for Arabic words was designed and implemented to facilitate the indexing process and to enhance the quality of retrieval.
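The abstract does not describe how the Arabic stemmer works, so the following is only a minimal sketch of one common technique, light stemming, which strips frequent prefixes and suffixes so that surface variants of a word share one index term. The affix lists, the `min_len` threshold, and the function names `normalize` and `light_stem` are assumptions chosen for illustration; they are not taken from the paper.

```python
import re

# Illustrative affix lists (assumptions, not the paper's stemmer).
# Prefixes are ordered longest first so "وال" is tried before "ال".
PREFIXES = ("وال", "بال", "كال", "فال", "ال", "لل")
SUFFIXES = ("ها", "ان", "ات", "ون", "ين", "ية", "ه", "ة", "ي")

# Short-vowel marks (U+064B..U+0652) and the tatweel elongation mark (U+0640).
_DIACRITICS = re.compile("[\u064B-\u0652\u0640]")
# Alef with madda / hamza above / hamza below.
_ALEF_VARIANTS = re.compile("[\u0622\u0623\u0625]")


def normalize(word: str) -> str:
    """Drop diacritics and map alef variants to a bare alef."""
    word = _DIACRITICS.sub("", word)
    return _ALEF_VARIANTS.sub("\u0627", word)


def light_stem(word: str, min_len: int = 3) -> str:
    """Strip at most one prefix and one suffix, keeping at least min_len letters."""
    word = normalize(word)
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= min_len:
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_len:
            word = word[:-len(suffix)]
            break
    return word
```

Under these assumptions, `light_stem("المكتبات")` and `light_stem("مكتبات")` both return `"مكتب"`, so a query containing either form would match documents indexed under the other. The length guard keeps very short words from being stripped down to a single letter.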

The performance of our module was measured by carrying out experiments using standard datasets, where the system showed promising results with many possibilities for future research and further development.




Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Halabi, A., Islim, AD., Kurdi, MZ. (2010). A Hybrid Approach for Indexing and Retrieval of Archaeological Textual Information. In: Setchi, R., Jordanov, I., Howlett, R.J., Jain, L.C. (eds) Knowledge-Based and Intelligent Information and Engineering Systems. KES 2010. Lecture Notes in Computer Science, vol 6279. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15384-6_56


  • DOI: https://doi.org/10.1007/978-3-642-15384-6_56

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-15383-9

  • Online ISBN: 978-3-642-15384-6

  • eBook Packages: Computer Science, Computer Science (R0)
