A Novel Term-Term Similarity Score Based Information Foraging Assessment

Khennak, Ilyes; Drias, Habiba; Mosteghanemi, Hadia

doi:10.1007/978-3-319-19656-5_5

Conference paper
First Online: 01 January 2015

Part of the book series: Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering (LNICST, volume 150)

Abstract

The dramatic proliferation of information on the web and the tremendous growth in the number of files published and uploaded online each day have led to the appearance of new words on the Internet. Because the meanings of these new terms, which play a central role in retrieving the desired information, are difficult to determine, it becomes necessary to give more importance to the sites and topics where these new words appear or, more precisely, to give value to the words that frequently occur with them. To this end, we propose a novel term-term similarity score based on the co-occurrence and closeness of words to improve retrieval performance. We also propose a novel efficiency/effectiveness measure, based on the principle of the optimal information forager, to assess the quality of the obtained results. Our experiments, performed on the OHSUMED test collection, show a significant improvement in effectiveness over the state of the art.
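
As a rough illustration of the idea, the sketch below shows one way co-occurrence and closeness could be combined into a single term-term score. It is not the authors' formula, which the abstract does not state. The function names `term_similarity` and `related_terms`, the `window` parameter, and the inverse-distance weighting are all assumptions made for this example.

```python
from collections import defaultdict


def term_similarity(docs, window=5):
    """Illustrative term-term score (an assumption, not the paper's formula).

    For every pair of distinct terms that appear within `window` tokens of
    each other in a document, add 1/distance to the pair's score. Frequent
    co-occurrence raises the score, and so does closeness.
    """
    scores = defaultdict(float)
    for doc in docs:
        tokens = doc.lower().split()
        for i, first in enumerate(tokens):
            # Look only at tokens that follow `first` within the window.
            for j in range(i + 1, min(i + window + 1, len(tokens))):
                second = tokens[j]
                if first == second:
                    continue
                pair = tuple(sorted((first, second)))  # unordered pair
                scores[pair] += 1.0 / (j - i)
    return dict(scores)


def related_terms(scores, term, k=3):
    """Return up to k terms most strongly associated with `term`."""
    related = []
    for (a, b), score in scores.items():
        if term == a:
            related.append((b, score))
        elif term == b:
            related.append((a, score))
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(related, key=lambda item: (-item[1], item[0]))[:k]
```

On the two toy documents "aspirin reduces fever" and "aspirin fever", the pair (aspirin, fever) scores 1.5: it gains 1/2 from the first document, where the terms are two positions apart, and 1 from the second, where they are adjacent. A retrieval system could use such scores to find well-known terms that tend to accompany a new or unfamiliar query word.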

Author information

Authors and Affiliations

Laboratory for Research in Artificial Intelligence, USTHB, Algiers, Algeria
Ilyes Khennak, Habiba Drias & Hadia Mosteghanemi

Corresponding author

Correspondence to Ilyes Khennak.

Editor information

Editors and Affiliations

CREATE-NET, Trento, Italy
Raffaele Giaffreda
University of Trento, Trento, Italy
Radu-Laurentiu Vieriu
Management Consultant, Edna Pasher Ph.D & Associates, Tel Aviv, Israel
Edna Pasher
Management Consultant, Edna Pasher Ph.D & Associates, Tel Aviv, Israel
Gabriel Bendersky
University of Applied Sciences, Institute of Information Systems, Delémont, Switzerland
Antonio J. Jara
University of Beira Interior, Covilhã, Portugal
Joel J.P.C. Rodrigues
IBM Research Laboratory, Haifa, Israel
Eliezer Dekel
IBM Research, Haifa, Israel
Benny Mandler

About this paper

Cite this paper

Khennak, I., Drias, H., Mosteghanemi, H. (2015). A Novel Term-Term Similarity Score Based Information Foraging Assessment. In: Giaffreda, R., et al. Internet of Things. User-Centric IoT. IoT360 2014. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 150. Springer, Cham. https://doi.org/10.1007/978-3-319-19656-5_5

DOI: https://doi.org/10.1007/978-3-319-19656-5_5
Published: 26 June 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-19655-8
Online ISBN: 978-3-319-19656-5
eBook Packages: Computer Science, Computer Science (R0)
