Core-Tag Clustering for Web 2.0 Based on Multi-similarity Measurements

  • Conference paper
Advances in Web and Network Technologies, and Information Management (APWeb 2009, WAIM 2009)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5731)

Abstract

With the development of Web 2.0, folksonomy has become a hot topic in data mining, information retrieval, and social networks. Tag semantics are the key to a deep understanding of the correlations among objects in a folksonomy. This paper proposes two methods that cluster tags around core-tags by fusing multiple similarity measurements. The contributions of this paper are: (1) proposing the concept of the core-tag and a model of core-tag clusters; (2) designing a core-tag clustering algorithm, CETClustering, based on the clustering ensemble method; (3) designing a second core-tag clustering algorithm, SkyTagClustering, based on the skyline operator; and (4) comparing the two algorithms with modified K-means. Experiments show that the two algorithms outperform modified K-means by 20-30% in efficiency and score 20% higher on quality.
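The abstract does not give the details of SkyTagClustering, so the following is only a minimal sketch of the general skyline operator applied to tags scored under several similarity measures. It is not the authors' algorithm. The function names, the candidate tags, and the two similarity measures are all hypothetical and chosen for illustration. A tag is kept when no other tag is at least as similar to the core-tag on every measure and strictly more similar on at least one.

```python
from typing import Dict, List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if vector a dominates vector b.

    Higher similarity is treated as better: a must be at least as large
    as b on every measure and strictly larger on at least one.
    """
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def skyline(scores: Dict[str, Sequence[float]]) -> List[str]:
    """Return the tags that are not dominated by any other tag, sorted by name."""
    return sorted(
        tag
        for tag, vec in scores.items()
        if not any(
            dominates(other_vec, vec)
            for other_tag, other_vec in scores.items()
            if other_tag != tag
        )
    )


# Hypothetical similarities of candidate tags to a core-tag "python"
# under two made-up measures (say, co-occurrence and cosine similarity).
candidates = {
    "programming": (0.9, 0.6),
    "code": (0.7, 0.8),
    "snake": (0.2, 0.1),
    "scripting": (0.6, 0.5),
}

result = skyline(candidates)
print(result)  # ['code', 'programming']
```

In this toy data, "scripting" and "snake" are both dominated by "programming", while "code" and "programming" each win on one measure, so neither dominates the other. This pairwise scan is quadratic in the number of tags; it is meant to show the dominance test, not an efficient skyline computation.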

Supported by the 11th Five-Year Key Programs for Sci. & Tech. Development of China under grant No. 2006BAI05A01, the National Science Foundation under grant No. 60773169, and the Software Innovation Project of Sichuan Youth under grant No. 2007AA0155.

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Jiang, Y. et al. (2009). Core-Tag Clustering for Web 2.0 Based on Multi-similarity Measurements. In: Chen, L., et al. Advances in Web and Network Technologies, and Information Management. APWeb 2009, WAIM 2009. Lecture Notes in Computer Science, vol 5731. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03996-6_21

  • DOI: https://doi.org/10.1007/978-3-642-03996-6_21

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-03995-9

  • Online ISBN: 978-3-642-03996-6

  • eBook Packages: Computer Science, Computer Science (R0)
