DOI: 10.1145/2433396.2433493
research-article

Web usage mining for enhancing search-result delivery and helping users to find interesting web content

Published: 04 February 2013

ABSTRACT

Web usage mining is the application of data mining techniques to the data generated by users' interactions with web servers. This data, stored in server logs, is a valuable source of information that can be exploited to optimize the document-retrieval task or to better understand, and thus satisfy, user needs.

Our research focuses on two important issues: improving search-engine performance through static caching of search results, and helping users to find interesting web pages by recommending news articles and blog posts.

Concerning the static caching of search results, we present the query covering approach. The general idea is to populate the cache with those documents that contribute to the result pages of a large number of queries, as opposed to caching the top documents of the most frequent queries.
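
The idea of favoring documents that appear on many result pages can be illustrated with a small greedy heuristic. This is only a sketch under stated assumptions, not the algorithm from the paper: the function names, the frequency-weighted gain, and the threshold `k` (a query counts as covered once `k` of its result documents are cached) are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def greedy_query_cover(query_results, query_freq, cache_size, k=1):
    """Pick up to cache_size documents by frequency-weighted marginal gain.

    query_results: dict mapping a query to the document ids on its result page
    query_freq:    dict mapping a query to its number of occurrences in the log
    k:             hypothetical threshold; a query counts as covered once
                   k of its result documents are in the cache
    """
    # Invert the mapping: document -> queries whose result page contains it.
    doc_queries = defaultdict(set)
    for query, docs in query_results.items():
        for doc in set(docs):
            doc_queries[doc].add(query)

    hits = defaultdict(int)  # query -> number of its documents already cached
    cache = []
    candidates = set(doc_queries)

    def gain(doc):
        # Total frequency of the still-uncovered queries this document serves.
        return sum(query_freq.get(q, 0) for q in doc_queries[doc] if hits[q] < k)

    while candidates and len(cache) < cache_size:
        # Sorting makes tie-breaking deterministic (smallest id wins).
        best = max(sorted(candidates), key=gain)
        if gain(best) == 0:
            break  # every remaining query is already covered
        cache.append(best)
        candidates.remove(best)
        for q in doc_queries[best]:
            hits[q] += 1
    return cache


def covered_queries(query_results, cache, k=1):
    """Return the queries with at least k result documents in the cache."""
    cached = set(cache)
    return {q for q, docs in query_results.items() if len(cached & set(docs)) >= k}


# Toy data (invented for illustration).
query_results = {
    "jaguar car": ["d2", "d1"],
    "jaguar animal": ["d3", "d1"],
    "jaguar speed": ["d4", "d1"],
    "weather": ["d5"],
}
query_freq = {"jaguar car": 3, "jaguar animal": 2, "jaguar speed": 2, "weather": 5}

cache = greedy_query_cover(query_results, query_freq, cache_size=2)
print(cache)  # ['d1', 'd5']
print(sorted(covered_queries(query_results, cache)))
```

In this toy log, `d1` is selected first because it appears on three result pages with a combined frequency of 7, ahead of `d5`, which serves only the single most frequent query (frequency 5).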

For the recommendation of web pages, we present a graph-based approach that leverages user-browsing logs to identify early adopters. These users discover interesting content before others, and by monitoring their activity we can find web pages to recommend.
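
One way to picture a graph built from browsing logs is sketched below. It is a minimal illustration, not the construction used in the paper: it assumes the log is a list of `(user, page, timestamp)` tuples, adds weight to a directed edge `(u, v)` for each page that `u` first visited strictly before `v`, and ranks users with a hypothetical score (times ahead minus times behind). All function names are invented for this example.

```python
from collections import defaultdict
from itertools import combinations


def first_visits(log):
    """Map each page to {user: timestamp of that user's first visit}."""
    first = defaultdict(dict)
    for user, page, ts in log:
        if user not in first[page] or ts < first[page][user]:
            first[page][user] = ts
    return first


def early_adopter_graph(log):
    """Edge (u, v) counts the pages that u first visited strictly before v."""
    weights = defaultdict(int)
    for visitors in first_visits(log).values():
        ordered = sorted(visitors.items(), key=lambda item: item[1])
        for (u, tu), (v, tv) in combinations(ordered, 2):
            if tu < tv:  # simultaneous first visits add no edge
                weights[(u, v)] += 1
    return weights


def early_adopter_scores(weights):
    """Hypothetical score: times a user was ahead minus times they were behind."""
    scores = defaultdict(int)
    for (u, v), w in weights.items():
        scores[u] += w
        scores[v] -= w
    return dict(scores)


# Toy browsing log (invented for illustration).
log = [
    ("ann", "p1", 1), ("bob", "p1", 5), ("cat", "p1", 9),
    ("ann", "p2", 2), ("cat", "p2", 3), ("bob", "p2", 8),
    ("ann", "p1", 20),  # repeat visit; only the first visit counts
]

weights = early_adopter_graph(log)
scores = early_adopter_scores(weights)
print(scores)  # {'ann': 4, 'bob': -2, 'cat': -2}
```

Here `ann` reaches both pages before the other two users, so she receives the highest score; pages she visits next would be natural recommendation candidates under this simplified scoring.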

Published in

WSDM '13: Proceedings of the sixth ACM international conference on Web search and data mining
February 2013
816 pages
ISBN: 9781450318693
DOI: 10.1145/2433396

Copyright © 2013 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery
New York, NY, United States

Acceptance Rates

Overall Acceptance Rate: 498 of 2,863 submissions, 17%
