Abstract
Clustering techniques have been applied to categorize documents on the Web and to extract knowledge from it. In this paper, we apply a novel clustering method, an extension of affinity propagation (AP), to Web page clustering. The method, called partition adaptive affinity propagation (PAAP), automatically reruns the AP procedure to yield optimal clustering results and eliminates oscillations in the number of clusters if they occur. Experiments compare PAAP with K-means and AP on ten different Web page data sets. The results verify that PAAP finds better clusters than similar methods, and they demonstrate that PAAP is robust and effective when clustering Web pages.
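For orientation, the baseline AP procedure that PAAP builds on can be sketched in a few lines. The sketch below is plain affinity propagation with the standard damped responsibility and availability updates, not the PAAP method itself; the rerun and oscillation-handling logic of PAAP is not reproduced here. The function names `build_similarity` and `affinity_propagation`, the one-dimensional toy points, the negative squared distance similarity, and the median preference are all assumptions chosen for illustration.

```python
from statistics import median


def build_similarity(points):
    """Similarity = negative squared distance between 1-D points (an assumption).

    The diagonal holds the preference, set here to the median off-diagonal
    similarity, a common default for AP.
    """
    n = len(points)
    S = [[-(points[i] - points[k]) ** 2 for k in range(n)] for i in range(n)]
    pref = median(S[i][k] for i in range(n) for k in range(n) if i != k)
    for i in range(n):
        S[i][i] = pref
    return S


def affinity_propagation(S, damping=0.5, max_iter=200):
    """Plain AP for n >= 2 points; returns (exemplar indices, label per point)."""
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)
    for _ in range(max_iter):
        # r(i,k) = s(i,k) - max_{k' != k} [a(i,k') + s(i,k')], damped
        for i in range(n):
            vals = [A[i][k] + S[i][k] for k in range(n)]
            best = max(range(n), key=vals.__getitem__)
            first = vals[best]
            second = max(v for k, v in enumerate(vals) if k != best)
            for k in range(n):
                new = S[i][k] - (second if k == best else first)
                R[i][k] = damping * R[i][k] + (1 - damping) * new
        # a(i,k) = min(0, r(k,k) + sum_{i' not in {i,k}} max(0, r(i',k)))
        # a(k,k) = sum_{i' != k} max(0, r(i',k)), both damped
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]
            for i in range(n):
                if i == k:
                    new = total
                else:
                    new = min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    # A point is an exemplar when its self-responsibility plus
    # self-availability is positive.
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    labels = []
    for i in range(n):
        if i in exemplars or not exemplars:
            labels.append(i)  # exemplars (or all points, if none found) label themselves
        else:
            labels.append(max(exemplars, key=lambda k: S[i][k]))
    return exemplars, labels


if __name__ == "__main__":
    points = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]  # two well-separated toy groups
    exemplars, labels = affinity_propagation(build_similarity(points))
    print(exemplars, labels)
```

The number of clusters is not given as an input; it emerges from the preference values on the diagonal of the similarity matrix. For Web pages, the similarity would be computed between document vectors rather than between scalars.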
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sun, C., Wang, Y., Zhao, H. (2009). Web Page Clustering via Partition Adaptive Affinity Propagation. In: Yu, W., He, H., Zhang, N. (eds) Advances in Neural Networks – ISNN 2009. ISNN 2009. Lecture Notes in Computer Science, vol 5552. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01510-6_82
DOI: https://doi.org/10.1007/978-3-642-01510-6_82
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01509-0
Online ISBN: 978-3-642-01510-6
eBook Packages: Computer Science, Computer Science (R0)