The Intention Behind Web Queries

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4209)

Abstract

Identifying a user’s intention or interest from the queries they submit to a search engine can help the engine return more adequate results. In this work we present a framework for identifying user interest automatically, based on the analysis of query logs. The identification is made from two perspectives: the goals of the user, and the categories in which those goals are situated. We first classified the queries manually to obtain a reference point, and then applied supervised and unsupervised learning techniques. The results show that supervised learning is a good option in a considerable number of cases. Through unsupervised learning, however, we found relationships between users and behaviors that are not easy to detect from the query words alone. Unsupervised learning also showed that some of the categories we defined cannot be determined, whereas other classes that we had not considered appear naturally after the clustering process. This allowed us to establish that combining supervised and unsupervised learning is a good alternative for finding users’ goals. With supervised learning we can identify the user’s interest given a set of established goals and categories; with unsupervised learning we can validate the goals and categories used, refine them, and select those most appropriate to the user’s needs.
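The combination described in the abstract can be illustrated with a toy example. The sketch below is not the authors' method: it swaps in a nearest-centroid classifier for the supervised step and a greedy single-pass clustering for the unsupervised step, both over plain bag-of-words vectors. The goal labels ("informational", "transactional"), the sample queries, the similarity threshold, and all function names are assumptions made for illustration only.

```python
import math
from collections import Counter, defaultdict


def vectorize(query):
    """Bag-of-words term counts for one query string."""
    return Counter(query.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train_centroids(labeled):
    """Supervised step: sum the term counts of manually labeled queries per goal."""
    centroids = defaultdict(Counter)
    for query, goal in labeled:
        centroids[goal].update(vectorize(query))
    return dict(centroids)


def predict(centroids, query):
    """Assign the goal whose centroid is most similar (ties: alphabetical order)."""
    vec = vectorize(query)
    return max(sorted(centroids), key=lambda g: cosine(vec, centroids[g]))


def cluster(queries, threshold=0.3):
    """Unsupervised step: greedy single-pass clustering by cosine similarity."""
    clusters = []  # each entry: (running term counts, member queries)
    for q in queries:
        vec = vectorize(q)
        best = max(clusters, key=lambda c: cosine(vec, c[0]), default=None)
        if best is not None and cosine(vec, best[0]) >= threshold:
            best[0].update(vec)
            best[1].append(q)
        else:
            clusters.append((Counter(vec), [q]))
    return [members for _, members in clusters]


def label_mix(centroids, members):
    """Count predicted goals inside one cluster, to check the labels against it."""
    return Counter(predict(centroids, q) for q in members)


# Hypothetical manually labeled queries (the reference point).
labeled = [
    ("how to learn python", "informational"),
    ("history of rome", "informational"),
    ("buy cheap flights", "transactional"),
    ("download free music", "transactional"),
]
centroids = train_centroids(labeled)

# Hypothetical unlabeled queries from a log.
groups = cluster(["cheap flights", "cheap flights madrid", "python tutorial"])
mixes = [label_mix(centroids, g) for g in groups]
```

A cluster whose `label_mix` is dominated by a single goal is consistent with the predefined labels; a cluster with an even mix, or one that matches no centroid well, would be a candidate for refining or adding a category.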

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Baeza-Yates, R., Calderón-Benavides, L., González-Caro, C. (2006). The Intention Behind Web Queries. In: Crestani, F., Ferragina, P., Sanderson, M. (eds) String Processing and Information Retrieval. SPIRE 2006. Lecture Notes in Computer Science, vol 4209. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11880561_9

  • DOI: https://doi.org/10.1007/11880561_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-45774-9

  • Online ISBN: 978-3-540-45775-6

  • eBook Packages: Computer Science (R0)
