Extending Information Retrieval by Adjusting Text Feature Vectors

  • Conference paper
Knowledge Technology (KTW 2011)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 295)

Abstract

Automatic detection of a text's scope has become crucial for information retrieval because of semantic, linguistic, and unexpressive-content problems, which has increased the demand for simple, language-independent, scope-based strategies. In this paper, we extend each document's feature vector with influential words that make the document more expressive: we extract the essential words of related documents and then analyze the network of these words to detect those that share meaningful concepts related specifically to the document at hand. In other words, we analyze each document within only one topic: the topic of that document. We adapted social network analysis measures to take the weights of the document's words into account. The influence of the new words can be applied to the document either by changing the document vector weights or by inserting the words into the document as metadata. As an example, we classified documents and compared the effectiveness of our Intelligent Information Retrieval (IIR) model.
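The pipeline described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the weighting scheme, how the word network is built, or which social network analysis measure is adapted. The sketch assumes relative term frequency as the document weight, co-occurrence within related documents as the network edges, and a degree-style centrality in which each edge is scaled by the weight of the document word at its other end. The names `term_weights`, `extend_vector`, `top_k`, and `boost` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def term_weights(tokens):
    """Relative term frequency (an assumed stand-in for the paper's weighting)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {term: count / total for term, count in counts.items()}


def extend_vector(doc_tokens, related_docs, top_k=2, boost=0.5):
    """Return (extended_vector, new_words) for one document.

    Assumption: the word network links two words whenever they co-occur
    in a related document; the edge weight is the number of such documents.
    """
    doc_vec = term_weights(doc_tokens)

    # Build the co-occurrence network over the related documents.
    edges = Counter()
    for tokens in related_docs:
        for a, b in combinations(sorted(set(tokens)), 2):
            edges[(a, b)] += 1

    # Weighted degree centrality: an edge contributes in proportion to the
    # weight that the word at its other end has in the target document.
    scores = Counter()
    for (a, b), weight in edges.items():
        scores[a] += weight * doc_vec.get(b, 0.0)
        scores[b] += weight * doc_vec.get(a, 0.0)

    # Keep only words that are new to the document and tied to its own words.
    candidates = [(s, t) for t, s in scores.items() if t not in doc_vec and s > 0]
    candidates.sort(key=lambda pair: (-pair[0], pair[1]))
    selected = candidates[:top_k]

    # Option 1 from the abstract: change the document vector weights.
    extended = dict(doc_vec)
    if selected:
        best = selected[0][0]
        ceiling = max(doc_vec.values())
        for score, term in selected:
            extended[term] = boost * (score / best) * ceiling

    # Option 2 from the abstract: the same words can be stored as metadata.
    new_words = [term for _, term in selected]
    return extended, new_words


if __name__ == "__main__":
    doc = ["vector", "retrieval"]
    related = [
        ["vector", "retrieval", "index", "query"],
        ["retrieval", "query", "ranking"],
        ["football", "goal"],
    ]
    vector, metadata = extend_vector(doc, related)
    print(metadata)
    print(vector)
```

Words such as "football" that never co-occur with the document's own terms receive a zero score and are not added, which mirrors the idea of analyzing each document only within its own topic. Scaling new weights by `boost` keeps them below the strongest original term; that choice is an assumption of this sketch, not something stated in the paper.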

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Aghassi, H., Sheykhlar, Z. (2012). Extending Information Retrieval by Adjusting Text Feature Vectors. In: Lukose, D., Ahmad, A.R., Suliman, A. (eds) Knowledge Technology. KTW 2011. Communications in Computer and Information Science, vol 295. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32826-8_14

  • DOI: https://doi.org/10.1007/978-3-642-32826-8_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-32825-1

  • Online ISBN: 978-3-642-32826-8

  • eBook Packages: Computer Science, Computer Science (R0)