Hierarchical Topic-Based Communities Construction for Authors in a Literature Database

  • Conference paper
Trends in Applied Intelligent Systems (IEA/AIE 2010)

Abstract

Given a set of research papers for which only title and author information is available, this paper proposes a mining strategy that discovers and organizes communities of authors according to both their co-author relationships and the research topics of their published papers. The method applies the CONGA algorithm to the network constructed from the co-author relationships in order to discover collaborative communities. To further group these communities by research interest, CiteSeerX is used as an external source to discover the hidden hierarchical relationships among the topics covered by the papers. The evaluation has two parts. The first checks whether each constructed topic-based collaborative community is semantically meaningful by measuring the consistency between the terms appearing in the community's published papers and the terms in documents on the same topic retrieved from another external source; 81.61% of the topic-based collaborative communities satisfy the consistency requirement. The second verifies the accuracy of the discovered sub-concept relationships against the Wikipedia categories; 75.96% of the sub-concept terms are properly assigned in the concept hierarchy.
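
As a rough illustration of the first step only, the sketch below builds a weighted co-author network from (title, authors) records. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name build_coauthor_network, the record layout, and the choice of edge weight (number of jointly written papers) are all hypothetical, and the CONGA community detection itself is not shown.

```python
from collections import defaultdict
from itertools import combinations


def build_coauthor_network(papers):
    """Build an undirected, weighted co-author graph.

    papers: iterable of (title, list_of_author_names) pairs (assumed layout).
    Returns {author: {coauthor: number_of_joint_papers}}.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for _title, authors in papers:
        unique = sorted(set(authors))  # ignore duplicate names on one paper
        for author in unique:
            graph[author]  # touch the key so single-author papers add a node
        for a, b in combinations(unique, 2):
            graph[a][b] += 1
            graph[b][a] += 1
    # Convert to plain dicts so later lookups cannot create nodes by accident.
    return {author: dict(neighbors) for author, neighbors in graph.items()}


# Made-up records, for illustration only.
papers = [
    ("Paper one", ["Author A", "Author B"]),
    ("Paper two", ["Author A", "Author B", "Author C"]),
    ("Paper three", ["Author D"]),
]
network = build_coauthor_network(papers)
```

A graph in this form could then be handed to an overlapping community detection routine such as CONGA; the edge weights are kept here only because they are cheap to compute, and whether the original method uses them is not stated in the abstract.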

This work was partially supported by the R.O.C. N.S.C. under Contract No. 98-2221-E-003-017 and NSC 98-2631-S-003-002.

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wu, CL., Koh, JL. (2010). Hierarchical Topic-Based Communities Construction for Authors in a Literature Database. In: García-Pedrajas, N., Herrera, F., Fyfe, C., Benítez, J.M., Ali, M. (eds) Trends in Applied Intelligent Systems. IEA/AIE 2010. Lecture Notes in Computer Science, vol 6097. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13025-0_53

  • DOI: https://doi.org/10.1007/978-3-642-13025-0_53

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-13024-3

  • Online ISBN: 978-3-642-13025-0

  • eBook Packages: Computer Science (R0)