A Novel Term-Term Similarity Score Based Information Foraging Assessment

  • Conference paper

Abstract

The dramatic proliferation of information on the web and the tremendous growth in the number of files published and uploaded online each day have led to the appearance of new words on the Internet. Because the meanings of these new terms, which play a central role in retrieving the desired information, are difficult to determine, it becomes necessary to give more importance to the sites and topics where these new words appear, or rather, to give value to the words that frequently occur with them. To this end, in this paper we propose a novel term-term similarity score based on the co-occurrence and closeness of words to improve retrieval performance. We also propose a novel efficiency/effectiveness measure, based on the principle of the optimal information forager, to assess the quality of the obtained results. Our experiments were performed on the OHSUMED test collection and show a significant improvement in effectiveness over the state of the art.
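
As a rough illustration of what a score "based on the co-occurrence and closeness of words" might look like, the following minimal sketch may help. The abstract does not state the formula, so everything here is an assumption rather than the authors' method: the function names (`term_positions`, `min_distance`, `proximity_similarity`), the inverse-distance weighting, and the toy documents are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def term_positions(tokens):
    """Map each term to the list of positions where it occurs."""
    positions = defaultdict(list)
    for index, term in enumerate(tokens):
        positions[term].append(index)
    return positions


def min_distance(positions_a, positions_b):
    """Smallest absolute gap between any occurrence of a and any of b."""
    return min(abs(i - j) for i in positions_a for j in positions_b)


def proximity_similarity(documents):
    """Hypothetical score (an assumption, not the paper's formula).

    For every pair of distinct terms, sum 1 / min_distance over the
    documents in which both occur. Pairs that co-occur in many
    documents, and close together, receive higher scores.
    """
    scores = defaultdict(float)
    for tokens in documents:
        positions = term_positions(tokens)
        # sorted() gives each pair a canonical (a, b) order with a < b.
        for a, b in combinations(sorted(positions), 2):
            scores[(a, b)] += 1.0 / min_distance(positions[a], positions[b])
    return dict(scores)


# Toy documents, invented for illustration only.
docs = [
    "new slang term appears near known word".split(),
    "known word far away from the new slang".split(),
]
scores = proximity_similarity(docs)
```

In this toy example, the pair `("new", "slang")` is adjacent in both documents and therefore scores higher than `("known", "new")`, whose occurrences are several positions apart. A real system would likely normalise by term frequency or document length, which this sketch omits.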

Corresponding author

Correspondence to Ilyes Khennak.

Copyright information

© 2015 Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Khennak, I., Drias, H., Mosteghanemi, H. (2015). A Novel Term-Term Similarity Score Based Information Foraging Assessment. In: Giaffreda, R., et al. Internet of Things. User-Centric IoT. IoT360 2014. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 150. Springer, Cham. https://doi.org/10.1007/978-3-319-19656-5_5

  • DOI: https://doi.org/10.1007/978-3-319-19656-5_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-19655-8

  • Online ISBN: 978-3-319-19656-5

  • eBook Packages: Computer Science, Computer Science (R0)
