DOI: 10.1145/2433396.2433493
research-article

Web usage mining for enhancing search-result delivery and helping users to find interesting web content

Published: 04 February 2013

Abstract

Web usage mining is the application of data mining techniques to the data generated by users' interactions with web servers. This data, stored in server logs, is a valuable source of information that can be exploited to optimize document retrieval, or to better understand, and thus satisfy, user needs.
Our research focuses on two important issues: improving search-engine performance through static caching of search results, and helping users to find interesting web pages by recommending news articles and blog posts.
Concerning the static caching of search results, we present the query covering approach. The general idea is to populate the cache with those documents that contribute to the result pages of a large number of queries, as opposed to caching the top documents of most frequent queries.
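
The contrast between query covering and frequency-based caching can be illustrated with a small greedy heuristic. This is only a sketch under stated assumptions, not the algorithm from the paper: the function names, the threshold `k` (a query counts as covered once `k` of its result documents are cached), and the greedy selection rule are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def greedy_query_cover(results, freq, cache_size, k=1):
    """Pick documents that help cover as much query traffic as possible.

    results: dict mapping query -> list of result document ids
    freq:    dict mapping query -> how often the query was issued
    k:       assumed coverage threshold (hypothetical parameter)
    """
    # How many more cached documents each query still needs to be covered.
    need = {q: min(k, len(docs)) for q, docs in results.items()}

    # Invert the mapping: which queries does each document contribute to?
    doc_queries = defaultdict(set)
    for q, docs in results.items():
        for d in docs:
            doc_queries[d].add(q)

    cache, cached = [], set()
    while len(cache) < cache_size:
        best_doc, best_gain = None, 0
        for d in sorted(doc_queries):  # sorted for deterministic tie-breaking
            if d in cached:
                continue
            # Gain: total frequency of not-yet-covered queries that d serves.
            gain = sum(freq[q] for q in doc_queries[d] if need[q] > 0)
            if gain > best_gain:
                best_doc, best_gain = d, gain
        if best_doc is None:  # no remaining document helps any query
            break
        cache.append(best_doc)
        cached.add(best_doc)
        for q in doc_queries[best_doc]:
            if need[q] > 0:
                need[q] -= 1
    return cache


def covered_queries(results, cache, k=1):
    """Return the queries with at least k (or all) results in the cache."""
    cached = set(cache)
    return {
        q for q, docs in results.items()
        if len(cached & set(docs)) >= min(k, len(docs))
    }
```

With one cache slot, a policy that caches the top document of the most frequent query would store the single result of that query, whereas this sketch prefers a document shared by several less frequent queries whenever their combined frequency is higher.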
For the recommendation of web pages, we present a graph-based approach that leverages user-browsing logs to identify early adopters. These users discover interesting content before others, and by monitoring their activity we can find web pages to recommend.
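
One way to picture the early-adopter idea is sketched below. The abstract does not define the graph, so everything here is an assumption made for illustration: the log format `(user, page, timestamp)`, the edge weight (number of pages one user visited before another), the net-weight ranking score, and all function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def early_adopter_graph(log):
    """Build weighted edges (u, v): number of pages u visited before v.

    log: iterable of (user, page, timestamp) records (assumed format).
    """
    # Earliest visit of each user to each page.
    first_visit = defaultdict(dict)
    for user, page, ts in log:
        prev = first_visit[page].get(user)
        if prev is None or ts < prev:
            first_visit[page][user] = ts

    edges = defaultdict(int)
    for visits in first_visit.values():
        for (u, tu), (v, tv) in combinations(sorted(visits.items()), 2):
            if tu < tv:
                edges[(u, v)] += 1
            elif tv < tu:
                edges[(v, u)] += 1
    return dict(edges)


def rank_early_adopters(edges):
    """Rank users by net weight: times they were first minus times they were later."""
    score = defaultdict(int)
    for (u, v), w in edges.items():
        score[u] += w
        score[v] -= w
    return sorted(score, key=lambda u: (-score[u], u))


def recommend(log, adopters, user):
    """Pages visited by the given adopters that `user` has not seen, newest first."""
    seen = {p for u, p, _ in log if u == user}
    recs = []
    for u, p, _ in sorted(log, key=lambda r: r[2], reverse=True):
        if u in adopters and p not in seen and p not in recs:
            recs.append(p)
    return recs
```

In this sketch, monitoring the top-ranked users amounts to calling `recommend` with the first few entries of `rank_early_adopters`; a real system would likely also weigh how recent and how reliable each adopter's past discoveries were.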

Published In

WSDM '13: Proceedings of the sixth ACM international conference on Web search and data mining
February 2013
816 pages
ISBN: 9781450318693
DOI: 10.1145/2433396
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. caching
  2. recommendation
  3. web usage mining

Qualifiers

  • Research-article

Conference

WSDM 2013

Acceptance Rates

Overall Acceptance Rate 498 of 2,863 submissions, 17%
