DOI: 10.1145/1177080.1177110
Article

Web search clickstreams

Published: 25 October 2006

Abstract

Search engines are a vital part of the Web and thus of the Internet infrastructure. Understanding the behavior of users searching the Web therefore gives insights into trends and enables enhancements of future search capabilities. Possible data sources for studying Web search behavior are server-side logs and client-side logs. Unfortunately, current server-side logs are hard to obtain because search engine operators consider them proprietary. In this paper we therefore present a methodology for extracting client-side logs from the traffic exchanged between a large user group and the Internet. An added benefit of our methodology is that we extract not only the search terms, query sequences, and search results of each individual user but also the full clickstream, i.e., the result pages users view and the hyperlinked pages they subsequently visit. We propose a finite-state Markov model that captures users' Web searching and browsing behavior and allows us to deduce their prevalent search patterns. To our knowledge, this is the first such detailed client-side analysis of clickstreams.
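To make the idea of a finite-state Markov model over clickstreams concrete, the following is a minimal sketch and not the model from the paper. It assumes each user session has already been reduced to a sequence of state labels, and it estimates first-order transition probabilities by counting. The state names (`query`, `result_page`, `click`, `browse`, `end`), the function `estimate_transitions`, and the sample sessions are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def estimate_transitions(sessions):
    """Estimate first-order Markov transition probabilities by counting.

    `sessions` is a list of state-label sequences (hypothetical labels,
    not the states defined in the paper). Returns a nested dict:
    probs[current_state][next_state] = relative frequency.
    """
    counts = defaultdict(Counter)
    for session in sessions:
        # Count every adjacent (current, next) pair in the session.
        for current, nxt in zip(session, session[1:]):
            counts[current][nxt] += 1

    probs = {}
    for state, followers in counts.items():
        total = sum(followers.values())
        probs[state] = {nxt: n / total for nxt, n in followers.items()}
    return probs


# Hypothetical sessions, invented purely for illustration.
sessions = [
    ["query", "result_page", "click", "browse", "end"],
    ["query", "result_page", "query", "result_page", "click", "end"],
    ["query", "result_page", "end"],
]

probs = estimate_transitions(sessions)
for state, followers in probs.items():
    print(state, followers)
```

In this toy data, half of the transitions out of `result_page` lead to `click`; prevalent search patterns of that kind are what a fitted transition matrix is meant to expose. A real analysis would first have to derive the state sequences from HTTP traces, which this sketch does not attempt.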

Published In

IMC '06: Proceedings of the 6th ACM SIGCOMM conference on Internet measurement
October 2006
356 pages
ISBN:1595935614
DOI:10.1145/1177080

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. HTTP traces
  2. clickstream
  3. Markov model
  4. web search

Qualifiers

  • Article

Conference

IMC06
Sponsor:
IMC06: Internet Measurement Conference
October 25 - 27, 2006
Rio de Janeiro, Brazil

Acceptance Rates

Overall Acceptance Rate 277 of 1,083 submissions, 26%

