KIPTC: a kernel information propagation tag clustering algorithm

Journal of Intelligent Information Systems

Abstract

In social annotation systems, users annotate digital data sources with tags, which are freely chosen textual descriptions. Tags serve as additional metadata for indexing, annotating and retrieving resources. Poor retrieval performance remains a major challenge for most social annotation systems; it results from the ambiguity, redundancy and weak semantics of tags. Clustering is a useful tool for handling these problems. In this paper, we propose a novel tag clustering algorithm based on kernel information propagation. The approach first applies kernel density estimation to the kNN neighborhood directed graph to reveal the prestige rank of tags in the tagging data. The random walk with restart algorithm is then employed to determine the center points of tag clusters. The main strength of the proposed approach is that it partitions tags from the perspective of tag prestige rank rather than by intuitive similarity calculation alone. Experimental studies on six real-world data sets demonstrate the effectiveness and superiority of the proposed method over other state-of-the-art clustering approaches in terms of various evaluation metrics.
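The two ingredients named in the abstract, kernel density estimation over a directed kNN graph and a random walk with restart, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy 2-D tag vectors, the Euclidean distance, the Gaussian kernel with a hand-picked bandwidth, the restart probability, and the choice of the densest tag as the walk's seed are all assumptions made for illustration, and every function name is hypothetical.

```python
import math

def knn_graph(points, k):
    """Directed kNN graph: node i points to its k nearest neighbours."""
    graph = {}
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        graph[i] = dists[:k]  # list of (distance, neighbour index)
    return graph

def kernel_density(graph, bandwidth):
    """Gaussian kernel density of each node over its kNN neighbourhood."""
    return {
        i: sum(math.exp(-((d / bandwidth) ** 2) / 2) for d, _ in nbrs) / len(nbrs)
        for i, nbrs in graph.items()
    }

def random_walk_with_restart(graph, seed, restart=0.15, iters=100):
    """Power iteration: follow an out-edge with prob. 1 - restart, else jump to seed."""
    n = len(graph)
    scores = [1.0 if i == seed else 0.0 for i in range(n)]
    for _ in range(iters):
        nxt = [0.0] * n
        for i, nbrs in graph.items():
            share = scores[i] / len(nbrs)  # uniform split over out-edges
            for _, j in nbrs:
                nxt[j] += (1 - restart) * share
        nxt[seed] += restart
        scores = nxt
    return scores

# Hypothetical tag vectors: indices 0-3 form one tight group, 4-6 a looser one.
points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (0.3, 0.2),
          (5.0, 5.0), (5.5, 5.2), (5.1, 5.6)]

graph = knn_graph(points, k=2)
density = kernel_density(graph, bandwidth=1.0)

# Assumption: treat the densest tag as a candidate cluster center (seed).
seed = max(density, key=density.get)
scores = random_walk_with_restart(graph, seed)

print("seed tag:", seed)
print("RWR scores:", [round(s, 3) for s in scores])
```

In this toy setting the walk never leaves the seed's group, so tags with a non-zero score are natural members of the seed's cluster. How KIPTC itself turns density ranks and walk scores into cluster centers is described in the full article, not in this sketch.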

Notes

  1. www.delicious.com

  2. www.citeulike.org

  3. http://www.medworm.com/

  4. http://www.movielens.org/

  5. http://www.michael-noll.com/dmoz100k06/

  6. http://www.tagora-project.eu/data/

  7. http://www.grouplens.org/

Acknowledgements

The work described in this paper was supported by grants from the Natural Science Foundation of China (Grant Nos. 60775037 and 61202171), the Key Program of the National Natural Science Foundation of China (Grant No. 60933013), the Nature Science Research of Anhui (Grant No. 1208085MF95), the Nature Science Foundation of Anhui Education Department (Grant Nos. KJ2012A273 and KJ2012A274), the China Postdoctoral Science Foundation funded project (No. 2012M521251), and the EU FP7 ICT project M-Eco: Medical Ecosystem Personalized Event-based Surveillance (No. 247829). The authors would like to thank the reviewers for their valuable comments.

Author information

Corresponding author

Correspondence to Ping Jin.

About this article

Cite this article

Xu, G., Zong, Y., Jin, P. et al. KIPTC: a kernel information propagation tag clustering algorithm. J Intell Inf Syst 45, 95–112 (2015). https://doi.org/10.1007/s10844-013-0262-7

  • DOI: https://doi.org/10.1007/s10844-013-0262-7
