Abstract
With the development of Web 2.0, folksonomy has become a hot topic in data mining, information retrieval, and social networks. Tag semantics are the key to a deep understanding of the correlations among objects in a folksonomy. This paper proposes two methods that cluster tags around a core-tag by fusing multiple similarity measurements. The contributions of this paper include: (1) Proposing the concept of core-tag and the model of core-tag clusters. (2) Designing a core-tag clustering algorithm, CETClustering, based on the clustering ensemble method. (3) Designing a second core-tag clustering algorithm, SkyTagClustering, based on the skyline operator. (4) Comparing the two algorithms with a modified K-means. Experiments show that the two algorithms outperform the modified K-means by 20-30% in efficiency and score 20% higher in quality.
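The abstract does not say how SkyTagClustering applies the skyline operator, so the following is only a minimal sketch of the general operator (Pareto dominance) over multi-similarity vectors, not the authors' algorithm. The tag names, the two similarity measures, and the helper names `dominates` and `skyline` are all hypothetical, chosen for illustration.

```python
from typing import Dict, List, Tuple


def dominates(a: Tuple[float, ...], b: Tuple[float, ...]) -> bool:
    """Return True if a is at least as high as b on every measure
    and strictly higher on at least one (Pareto dominance)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def skyline(candidates: Dict[str, Tuple[float, ...]]) -> List[str]:
    """Return, in sorted order, the tags whose similarity vectors
    are not dominated by any other candidate's vector."""
    return sorted(
        tag
        for tag, vec in candidates.items()
        if not any(
            dominates(other_vec, vec)
            for other_tag, other_vec in candidates.items()
            if other_tag != tag
        )
    )


# Hypothetical similarities of candidate tags to an assumed core-tag,
# under two made-up measures (for example co-occurrence and cosine).
scores = {
    "programming": (0.9, 0.6),
    "scripting": (0.7, 0.8),
    "snake": (0.2, 0.1),
    "code": (0.6, 0.5),
}

result = skyline(scores)
print(result)
```

In this toy input, "programming" and "scripting" each win on one measure, so neither dominates the other and both remain in the skyline, while "code" and "snake" are dominated and drop out. This is the sense in which a skyline can combine several similarity measurements without collapsing them into one weighted score.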
Supported by the 11th Five-Year Key Programs for Sci. & Tech. Development of China under grant No. 2006BAI05A01, the National Science Foundation under grant No. 60773169, and the Software Innovation Project of Sichuan Youth under grant No. 2007AA0155.
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jiang, Y. et al. (2009). Core-Tag Clustering for Web 2.0 Based on Multi-similarity Measurements. In: Chen, L., et al. Advances in Web and Network Technologies, and Information Management. APWeb WAIM 2009. Lecture Notes in Computer Science, vol 5731. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03996-6_21
DOI: https://doi.org/10.1007/978-3-642-03996-6_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03995-9
Online ISBN: 978-3-642-03996-6
eBook Packages: Computer Science, Computer Science (R0)