DOI: 10.1145/2786451.2786468
research-article

What can be Found on the Web and How: A Characterization of Web Browsing Patterns

Published: 28 June 2015

Abstract

In this paper, we propose a novel approach to studying user browsing behavior, i.e., the ways users reach different pages on the Web. Specifically, we classify all user browsing paths leading to web pages into several types, or browsing patterns. To define browsing patterns, we consider several important points of the browsing path: its origin, the last page before the user reaches the domain of the target page, and the referrer of the target page. Each point can be of several types, which leads to 56 possible patterns. The distribution of browsing paths over these patterns forms the navigational profile of a web page.
We conducted a comprehensive large-scale study of the navigational profiles of different web pages. First, we demonstrated that the navigational profile of a web page carries crucial information about the properties of this page (e.g., its popularity and age). Second, we found that the Web consists of several typical non-overlapping clusters formed by pages with similar ranges of incoming traffic. These clusters can be characterized by the functionality of their pages.
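
The navigational profile described in the abstract, i.e., the distribution of browsing paths over patterns, can be illustrated with a minimal sketch. It assumes each observed path has already been reduced to a pattern label. The tuple labels below (such as "search" or "bookmark") are hypothetical placeholders, not the paper's actual taxonomy of point types, and the function name `navigational_profile` is likewise an assumption made for illustration.

```python
from collections import Counter
from typing import Hashable, Iterable

# A pattern is treated as an opaque hashable label. Here it is a tuple of
# (origin type, last-page-before-domain type, referrer type); the type
# names are hypothetical and are NOT taken from the paper.
Pattern = Hashable


def navigational_profile(paths: Iterable[Pattern]) -> dict:
    """Return the share of browsing paths that fall into each pattern."""
    counts = Counter(paths)
    total = sum(counts.values())
    if total == 0:
        return {}  # no observed paths, so the profile is empty
    return {pattern: count / total for pattern, count in counts.items()}


# Four hypothetical paths leading to one target page.
observed = [
    ("search", "search", "search"),
    ("search", "search", "internal"),
    ("search", "search", "search"),
    ("bookmark", "none", "none"),
]
profile = navigational_profile(observed)
```

Under these assumptions the profile is a normalized histogram, so its values sum to 1 and profiles of pages with very different traffic volumes remain comparable.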


Cited By

  • (2022) A world wide view of browsing the world wide web. Proceedings of the 22nd ACM Internet Measurement Conference, pages 317-336. DOI: 10.1145/3517745.3561418. Online publication date: 25-Oct-2022.
  • (2019) Exposing Knowledge: Providing a Real-Time View of the Domain Under Study for Students. Artificial Intelligence XXXVI, pages 122-135. DOI: 10.1007/978-3-030-34885-4_9. Online publication date: 19-Nov-2019.
  • (2018) You, the Web, and Your Device. ACM Transactions on the Web 12:4, pages 1-30. DOI: 10.1145/3231466. Online publication date: 27-Sep-2018.


    Published In

    WebSci '15: Proceedings of the ACM Web Science Conference
    June 2015
    366 pages
    ISBN:9781450336727
    DOI:10.1145/2786451
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. browsing patterns
    2. clustering of web pages
    3. user browsing behavior

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    WebSci '15
    WebSci '15: ACM Web Science Conference
    June 28 - July 1, 2015
    Oxford, United Kingdom

    Acceptance Rates

    Overall Acceptance Rate 245 of 933 submissions, 26%
