Web usage mining for enhancing search-result delivery and helping users to find interesting web content

Author:

Ida Mele

WSDM '13: Proceedings of the sixth ACM international conference on Web search and data mining

Pages 765 - 770

https://doi.org/10.1145/2433396.2433493

Published: 04 February 2013

Abstract

Web usage mining is the application of data mining techniques to the data generated by users' interactions with web servers. This data, stored in server logs, is a valuable source of information that can be exploited to optimize document retrieval or to better understand, and thus satisfy, user needs.

Our research focuses on two important issues: improving search-engine performance through static caching of search results, and helping users to find interesting web pages by recommending news articles and blog posts.

Concerning the static caching of search results, we present the query covering approach. The general idea is to populate the cache with those documents that contribute to the result pages of a large number of queries, as opposed to caching the top documents of the most frequent queries.
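To make the query covering idea concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes, purely for illustration, that a query counts as covered once at least one of its result documents is cached, and it uses a plain greedy weighted-coverage heuristic. The function name `greedy_query_cover`, the `results` and `freq` dictionaries, and the toy data are all hypothetical.

```python
from collections import defaultdict


def greedy_query_cover(results, freq, cache_size):
    """Greedily pick up to cache_size documents.

    results: {query: [doc, ...]} result lists taken from a query log (assumed input).
    freq:    {query: count} how often each query was issued (assumed input).
    A query is treated as covered when at least one of its documents is cached;
    this coverage rule is an illustrative assumption.
    """
    # Invert the result lists: which queries does each document serve?
    doc_to_queries = defaultdict(set)
    for query, docs in results.items():
        for doc in docs:
            doc_to_queries[doc].add(query)

    candidates = sorted(doc_to_queries)  # sorted for deterministic tie-breaking
    uncovered = set(results)
    cache = []

    def gain(doc):
        # Frequency-weighted number of still-uncovered queries this document serves.
        return sum(freq.get(q, 1) for q in doc_to_queries[doc] & uncovered)

    while len(cache) < cache_size and uncovered:
        best = max(candidates, key=gain, default=None)
        if best is None or gain(best) == 0:
            break  # no remaining document covers anything new
        cache.append(best)
        uncovered -= doc_to_queries[best]
    return cache


# Toy data: d1 appears in the results of three moderately frequent queries.
results = {
    "weather rome": ["d5", "d6"],
    "python tutorial": ["d1", "d2"],
    "learn python": ["d1", "d3"],
    "python docs": ["d1", "d4"],
}
freq = {"weather rome": 12, "python tutorial": 4, "learn python": 3, "python docs": 3}

print(greedy_query_cover(results, freq, cache_size=2))
```

On this toy input, a cache of size two filled from the single most frequent query would hold d5 and d6 and serve one query, whereas the greedy sketch selects d5 and d1, which together touch all four queries.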

For the recommendation of web pages, we present a graph-based approach that leverages user-browsing logs to identify early adopters. These users discover interesting content before others, and by monitoring their activity we can find web pages to recommend.
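The early-adopter idea can be illustrated with a small sketch. This is not the construction used in the paper: it assumes a browsing log of `(user, url, timestamp)` triples, adds a weighted edge from user u to user v for every page u visited strictly before v, and scores each user by total outgoing edge weight. The names `early_adopter_scores` and `recommend`, and the toy log, are hypothetical.

```python
from collections import defaultdict


def early_adopter_scores(log):
    """log: list of (user, url, timestamp). Returns {user: score}.

    The score is the number of (page, later visitor) pairs that the user
    preceded, i.e. the total weight of the user's outgoing edges in a
    simple "visited before" graph (an illustrative assumption).
    """
    first_visit = defaultdict(dict)  # url -> {user: earliest timestamp}
    for user, url, ts in log:
        seen = first_visit[url]
        if user not in seen or ts < seen[user]:
            seen[user] = ts

    edges = defaultdict(int)  # (u, v) -> number of pages u visited before v
    for visitors in first_visit.values():
        ordered = sorted(visitors.items(), key=lambda item: item[1])
        for i, (u, tu) in enumerate(ordered):
            for v, tv in ordered[i + 1:]:
                if tu < tv:  # ignore simultaneous first visits
                    edges[(u, v)] += 1

    scores = defaultdict(int)
    for (u, _v), weight in edges.items():
        scores[u] += weight
    return dict(scores)


def recommend(log, scores, user, top_k=1):
    """Pages visited by the top_k highest-scoring users that `user` has not seen."""
    leaders = sorted(scores, key=scores.get, reverse=True)[:top_k]
    visited = {url for u, url, _ in log if u == user}
    picks = []
    for u, url, _ in log:
        if u in leaders and url not in visited and url not in picks:
            picks.append(url)
    return picks


log = [
    ("ann", "a", 1), ("bob", "a", 2), ("cat", "a", 3),
    ("ann", "b", 1), ("cat", "b", 4),
    ("bob", "c", 5), ("cat", "c", 6),
    ("ann", "d", 7),
]
scores = early_adopter_scores(log)
print(scores)
print(recommend(log, scores, user="cat"))
```

In the toy log, ann reaches pages a and b before the other users and therefore receives the highest score, so the page only she has visited so far (d) is what gets suggested to cat.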

Index Terms

Web usage mining for enhancing search-result delivery and helping users to find interesting web content
1. Information systems
  1. Information retrieval
    1. Information retrieval query processing
    2. Retrieval tasks and goals
      1. Document filtering
      2. Information extraction
  2. Information systems applications
    1. Data mining

Published In

WSDM '13: Proceedings of the sixth ACM international conference on Web search and data mining

February 2013

816 pages

ISBN:9781450318693

DOI:10.1145/2433396

General Chairs:
Stefano Leonardi (Sapienza University of Rome, Italy), Alessandro Panconesi (Sapienza University of Rome, Italy)

Program Chairs:
Paolo Ferragina (University of Pisa, Italy), Aristides Gionis (Yahoo! Research, Barcelona, Spain)

Copyright © 2013 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

WSDM 2013

WSDM 2013: Sixth ACM International Conference on Web Search and Data Mining

February 4 - 8, 2013

Rome, Italy

Acceptance Rates

Overall Acceptance Rate 498 of 2,863 submissions, 17%

Cited By

Hemmatian F, Sohrabi M (2022). A survey on classification techniques for opinion mining and sentiment analysis. Artificial Intelligence Review, 52(3):1495-1545. Online publication date: 10-Mar-2022.
https://dl.acm.org/doi/10.1007/s10462-017-9599-6
Srinivas K, Rajeshwar J (2020). Identifying User’s Interest in Using E-Payment Systems. Innovations in Computer Science and Engineering, pages 353-361. Online publication date: 4-Mar-2020.
https://doi.org/10.1007/978-981-15-2043-3_40
Kaushik N, Bhatia M (2020). Information Retrieval from Search Engine Using Particle Swarm Optimization. Advances in Computing and Intelligent Systems, pages 127-140. Online publication date: 3-Jan-2020.
https://doi.org/10.1007/978-981-15-0222-4_11
Vassio L, Drago I, Mellia M, Houidi Z, Lamali M (2018). You, the Web, and Your Device. ACM Transactions on the Web, 12(4):1-30. Online publication date: 27-Sep-2018.
https://dl.acm.org/doi/10.1145/3231466
Xin S, Yang H, Xian W, Ester M, Bu J, Wang Z, Wang C, Guo Y, Farooq F (2018). Mobile Access Record Resolution on Large-Scale Identifier-Linkage Graphs. Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 886-894. Online publication date: 19-Jul-2018.
https://dl.acm.org/doi/10.1145/3219819.3219916
Vassio L, Mellia M, Figueiredo F, Couto da Silva A, Almeida J (2017). Mining and modeling web trajectories from passive traces. 2017 IEEE International Conference on Big Data (Big Data), pages 4016-4021. Online publication date: Dec-2017.
https://doi.org/10.1109/BigData.2017.8258416
Sathya Bama S, Irfan Ahmed M, Saravanan A (2017). Relevance Re-ranking Through Proximity Based Term Frequency Model. ICT Innovations 2016, pages 219-229. Online publication date: 12-Oct-2017.
https://doi.org/10.1007/978-3-319-68855-8_22
Sunena Kaur K (2016). Web usage mining-current trends and future challenges. 2016 International Conference on Electrical, Electronics, and Optimization Techniques (ICEEOT), pages 1409-1414. Online publication date: Mar-2016.
https://doi.org/10.1109/ICEEOT.2016.7754915
Sun H, Sun J, Chen H (2016). Mining Frequent Attack Sequence in Web Logs. Green, Pervasive, and Cloud Computing, pages 243-260. Online publication date: 3-May-2016.
https://doi.org/10.1007/978-3-319-39077-2_16
Goel N, Gupta S, Jha C (2015). ANALYZING WEB LOGS OF AN ASTROLOGICAL WEBSITE USING KEY INFLUENCERS. IARS International Research Journal, 5(1). Online publication date: 8-Feb-2015.
https://doi.org/10.51611/iars.irj.v5i1.2015.45
