A New Algorithm for Community Identification in Linked Data

  • Conference paper
Knowledge-Based Intelligent Information and Engineering Systems (KES 2008)

Abstract

In this paper, we propose four specifications that can be used to evaluate community identification algorithms. We also present VHITS, a novel algorithm that meets all four specifications. VHITS follows a two-step approach. In the first step, Nonnegative Matrix Factorization is used to estimate community memberships. In the second step, a voting scheme is employed to identify the hubs and authorities of each community. VHITS is then compared with the HITS and PHITS algorithms. Experimental results show that VHITS is better suited than HITS and PHITS to the task of community identification in citation networks.
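
The two-step approach can be illustrated with a minimal sketch. The abstract does not specify the details of either step, so everything below is an assumption made for illustration only: the link matrix is symmetrized before factorization, the factorization uses Lee and Seung style multiplicative updates, each node is assigned to the community with its largest factor loading, and the voting rule simply counts intra-community links (votes received for authorities, votes cast for hubs). The function names and the toy citation graph are hypothetical and are not taken from the paper.

```python
import random

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    """Multiply two dense matrices given as lists of rows."""
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def symmetrize(A):
    """Assumption: treat links as undirected when estimating memberships."""
    n = len(A)
    return [[A[i][j] + A[j][i] for j in range(n)] for i in range(n)]

def nmf(S, k, iters=200, seed=0, eps=1e-9):
    """Approximate nonnegative S as W @ H with multiplicative updates."""
    rng = random.Random(seed)
    n, m = len(S), len(S[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        Wt = transpose(W)
        num, den = matmul(Wt, S), matmul(matmul(Wt, W), H)
        H = [[H[c][j] * num[c][j] / (den[c][j] + eps) for j in range(m)]
             for c in range(k)]
        Ht = transpose(H)
        num, den = matmul(S, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][c] * num[i][c] / (den[i][c] + eps) for c in range(k)]
             for i in range(n)]
    return W, H

def memberships(W):
    """Step 1 (assumed rule): assign each node to its largest loading."""
    return [max(range(len(row)), key=row.__getitem__) for row in W]

def vote(A, labels, k):
    """Step 2 (assumed rule): members vote through intra-community links."""
    result = []
    for c in range(k):
        members = [i for i in range(len(A)) if labels[i] == c]
        # Authority score: votes received from fellow members (in-links).
        auth = {j: sum(A[i][j] for i in members) for j in members}
        # Hub score: votes cast for fellow members (out-links).
        hub = {i: sum(A[i][j] for j in members) for i in members}
        result.append({
            "members": members,
            "authority": max(auth, key=auth.get) if members else None,
            "hub": max(hub, key=hub.get) if members else None,
        })
    return result

# Toy citation graph: EXAMPLE_LINKS[i][j] == 1 means paper i cites paper j.
# Papers 0-2 and papers 3-5 form two groups with no links between them.
EXAMPLE_LINKS = [
    [0, 1, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0],
]

if __name__ == "__main__":
    W, _ = nmf(symmetrize(EXAMPLE_LINKS), k=2)
    labels = memberships(W)
    for c, info in enumerate(vote(EXAMPLE_LINKS, labels, k=2)):
        print(c, info)
```

Separating the two steps keeps the voting stage independent of how memberships were estimated, so `vote` can also be called with hand-assigned labels to inspect a known grouping.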

Editor information

Ignac Lovrek, Robert J. Howlett, Lakhmi C. Jain

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chikhi, N.F., Rothenburger, B., Aussenac-Gilles, N. (2008). A New Algorithm for Community Identification in Linked Data. In: Lovrek, I., Howlett, R.J., Jain, L.C. (eds) Knowledge-Based Intelligent Information and Engineering Systems. KES 2008. Lecture Notes in Computer Science, vol 5177. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85563-7_81

  • DOI: https://doi.org/10.1007/978-3-540-85563-7_81

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-85562-0

  • Online ISBN: 978-3-540-85563-7

  • eBook Packages: Computer Science, Computer Science (R0)
