A Precise and Robust Clustering Approach Using Homophilic Degrees of Graph Kernel

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2016)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9652)

Abstract

To address the “data noise sensitivity” and “cluster center variance” problems of mainstream clustering algorithms, we propose a novel robust approach that identifies cluster centers unambiguously in noise-contaminated data by combining the strengths of homophilic degrees and the graph kernel. Because in-degrees can produce a homophilic distribution when ordered by their associated sorted out-degrees, clusters are easy to separate from noise. We then apply the diffusion kernel to the graph formed by the clusters to obtain a graph kernel matrix, which serves as a measure of global similarity. Using both local data densities and global similarities, the proposed approach identifies cluster centers precisely. Experiments on various synthetic and real-world datasets verify the superiority of our algorithm over state-of-the-art algorithms.
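
As a rough illustration of the diffusion-kernel step only, and not the authors' implementation, the sketch below computes the standard graph diffusion kernel exp(-beta * L) for a small undirected graph, where L = D - A is the combinatorial graph Laplacian. The function name `diffusion_kernel`, the parameter `beta`, and the toy path graph are assumptions introduced for this example; the abstract does not say how the graph is built or how the diffusion parameter is chosen.

```python
import numpy as np


def diffusion_kernel(adjacency, beta=0.5):
    """Return the diffusion kernel exp(-beta * L), with L = D - A.

    `adjacency` must be a square symmetric matrix (undirected graph).
    `beta` is a hypothetical diffusion parameter chosen for illustration.
    """
    A = np.asarray(adjacency, dtype=float)
    if A.ndim != 2 or A.shape[0] != A.shape[1] or not np.allclose(A, A.T):
        raise ValueError("adjacency must be a square symmetric matrix")

    # Combinatorial graph Laplacian: degree matrix minus adjacency matrix.
    L = np.diag(A.sum(axis=1)) - A

    # L is symmetric, so exp(-beta * L) = V diag(exp(-beta * w)) V^T,
    # where w and V are the eigenvalues and eigenvectors of L.
    w, V = np.linalg.eigh(L)
    return (V * np.exp(-beta * w)) @ V.T


# Toy example (assumed data): a path graph 0 - 1 - 2.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])
K = diffusion_kernel(A, beta=0.5)
```

In this toy graph, `K[0, 1]` is larger than `K[0, 2]`: the kernel assigns higher similarity to nodes joined by shorter paths, which is the sense in which a kernel matrix of this kind can act as a measure of global similarity.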

Acknowledgments

This work was supported by grants from the China National Natural Science Foundation under Grant Nos. 613278050 and 61210013.

Author information

Corresponding author

Correspondence to Haolin Yang.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Yang, H., Zhao, D., Cao, L., Sun, F. (2016). A Precise and Robust Clustering Approach Using Homophilic Degrees of Graph Kernel. In: Bailey, J., Khan, L., Washio, T., Dobbie, G., Huang, J., Wang, R. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2016. Lecture Notes in Computer Science, vol 9652. Springer, Cham. https://doi.org/10.1007/978-3-319-31750-2_21

  • DOI: https://doi.org/10.1007/978-3-319-31750-2_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-31749-6

  • Online ISBN: 978-3-319-31750-2

  • eBook Packages: Computer Science, Computer Science (R0)
