skip to main content
10.1145/2009916.2010168acmconferencesArticle/Chapter ViewAbstractPublication PagesirConference Proceedingsconference-collections
tutorial

Enhancing web search by mining search and browse logs

Published:24 July 2011Publication History

ABSTRACT

Huge amounts of search log data have been accumulated in various search engines. Currently, a commercial search engine receives billions of queries and collects tera-bytes of log data on any single day. Other than search log data, browse logs can be collected by client-side browser plug-ins, which record the browse information if users' permissions are granted. Such massive amounts of search/browse log data, on the one hand, provide great opportunities to mine the wisdom of crowds and improve web search results. On the other hand, designing effective and efficient methods to clean, model, and process large scale log data also presents great challenges.

In this tutorial, we will focus on mining search and browse log data for search engines. We will start with an introduction of search and browse log data and an overview of frequently-used data summarization in log mining. We will then elaborate how log mining applications enhance the five major components of a search engine, namely, query understanding, document understanding, query-document matching, user understanding, and monitoring and feedbacks. For each aspect, we will survey the major tasks, fundamental principles, and state-of-the-art methods. Finally, we will discuss the challenges and future trends of log data mining.

The goal of this tutorial is to provide a systematic survey on large-scale search/browse log mining to the IR community. It may help IR researchers to get familiar with the core challenges and promising directions in log mining. At the same time, this tutorial may also serve the developers of web information retrieval systems as a comprehensive and in-depth reference to the advanced log mining techniques.

Index Terms

  1. Enhancing web search by mining search and browse logs

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in
      • Published in

        cover image ACM Conferences
        SIGIR '11: Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
        July 2011
        1374 pages
        ISBN:9781450307574
        DOI:10.1145/2009916

        Copyright © 2011 Authors

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 24 July 2011

        Permissions

        Request permissions about this article.

        Request Permissions

        Check for updates

        Qualifiers

        • tutorial

        Acceptance Rates

        Overall Acceptance Rate792of3,983submissions,20%

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader