Distributed Web Log Mining Using Maximal Large Itemsets

Sayal, Mehmet; Scheuermann, Peter

doi:10.1007/PL00011675

Regular Paper
Published: November 2001

Volume 3, pages 389–404, (2001)

Knowledge and Information Systems

Mehmet Sayal¹ &
Peter Scheuermann²

Abstract.

We introduce a partitioning-based distributed document-clustering algorithm that uses user access patterns from multi-server web sites. The algorithm makes it possible to exploit adaptive document replication and persistent connections simultaneously, two techniques that are most effective in decreasing the response time observed by web users. The algorithm first distributes the user access data evenly among the servers by using a hash function. Each server then generates a local clustering on its fair share of the user session records by employing a traditional single-machine document-clustering algorithm. Finally, the local clustering results are combined by a novel procedure that generates maximal large itemsets of web documents. We present preliminary experimental results and discuss alternative approaches to be pursued in the future.
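The three phases named in the abstract (hash-based distribution, a local step on each server, and a merge that yields maximal large itemsets) can be illustrated with a minimal Python sketch. This is not the authors' procedure, which the abstract does not spell out. It assumes that a brute-force frequent-itemset count stands in for the local clustering step and that the merge simply re-counts pooled local candidates over all shares. All function names, the `min_support` and `max_size` parameters, the use of CRC32 as the hash function, and the sample sessions are hypothetical.

```python
from itertools import combinations
from zlib import crc32


def partition_sessions(sessions, num_servers):
    """Assign each session to a server by hashing its (hypothetical) session id."""
    shares = [[] for _ in range(num_servers)]
    for session_id, documents in sessions.items():
        # CRC32 is an assumed, deterministic stand-in for the paper's hash function.
        server = crc32(session_id.encode("utf-8")) % num_servers
        shares[server].append(frozenset(documents))
    return shares


def large_itemsets(transactions, min_support, max_size=3):
    """Brute-force stand-in for the local step: itemsets of up to max_size
    documents that occur in at least a min_support fraction of the sessions."""
    if not transactions:
        return set()
    threshold = min_support * len(transactions)
    counts = {}
    for docs in transactions:
        for size in range(1, min(max_size, len(docs)) + 1):
            for combo in combinations(sorted(docs), size):
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1
    return {itemset for itemset, n in counts.items() if n >= threshold}


def maximal_itemsets(itemsets):
    """Keep only itemsets that are not a proper subset of another itemset."""
    return {a for a in itemsets if not any(a < b for b in itemsets)}


def combine(shares, min_support, max_size=3):
    """Assumed merge step: pool the local candidates, re-count them over all
    shares, keep the globally large ones, then reduce to maximal itemsets."""
    candidates = set()
    for share in shares:
        candidates |= large_itemsets(share, min_support, max_size)
    total = sum(len(share) for share in shares)
    globally_large = {
        c for c in candidates
        if sum(1 for share in shares for docs in share if c <= docs)
        >= min_support * total
    }
    return maximal_itemsets(globally_large)


# Hypothetical session data: session id -> documents requested in that session.
sessions = {
    "s1": ["a.html", "b.html", "c.html"],
    "s2": ["a.html", "b.html", "c.html"],
    "s3": ["a.html", "b.html"],
    "s4": ["d.html"],
}
shares = partition_sessions(sessions, num_servers=2)
clusters = combine(shares, min_support=0.5)
print(clusters)
```

With these four sessions and a support threshold of one half, the only maximal large itemset is the group a.html, b.html, c.html, whichever way the hash splits the sessions. The sketch relies on the fact that an itemset meeting the support fraction over all sessions must meet it in at least one share, so pooling local candidates loses no globally large itemset of size up to `max_size`.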

Author information

Authors and Affiliations

Hewlett-Packard Labs, Palo Alto, California, USA
Mehmet Sayal
Department of Electrical and Computer Engineering, Northwestern University, Evanston, Illinois, USA
Peter Scheuermann

Additional information

Received 30 August 2000 / Revised 30 January 2001 / Accepted in revised form 9 May 2001

About this article

Cite this article

Sayal, M., Scheuermann, P. Distributed Web Log Mining Using Maximal Large Itemsets. Knowledge and Information Systems 3, 389–404 (2001). https://doi.org/10.1007/PL00011675

Issue Date: November 2001
DOI: https://doi.org/10.1007/PL00011675

Keywords: Maximal large itemsets; User access patterns; Web document clustering
