Exploiting Explicit Semantics-Based Grouping for Author Interest Finding

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6612)

Abstract

This paper investigates the problem of finding author interest in a co-author network through topic modeling, and provides several performance evaluation measures. Intuitively, two types of explicit grouping exist in research papers: (1) authors who have co-authored with author A in one document (subgroup) and (2) authors who have co-authored with author A across all documents (group). Traditional methods use the graph-link structure with keyword-based matching and ignore semantics-based information, while topic modeling considers semantics-based information but does not exploit both types of explicit grouping; for example, the state-of-the-art Author-Topic model uses only one kind of explicit grouping, the single document (subgroup), for finding author interest. In this paper, we introduce Group-Author-Topic (GAT) modeling, which exploits both types of grouping simultaneously. We compare four different topic modeling methods on the same task using a large DBLP dataset. We provide three performance measures from different domains for method evaluation: perplexity, entropy, and prediction ranking accuracy. We show the trade-off between these performance evaluation measures. Experimental results demonstrate that our proposed method significantly outperforms the baselines in finding author interest. The trade-off between the evaluation measures shows that they are equally useful for evaluating topic modeling methods.
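
The two kinds of explicit grouping described in the abstract can be illustrated with a minimal Python sketch. It is not the GAT model itself and is not taken from the paper. The toy corpus and the function names `subgroups` and `group` are hypothetical. The sketch also assumes one reading of the abstract, namely that the "group" of author A is the union of A's per-document subgroups.

```python
# Hypothetical toy corpus (not from the paper): document id -> list of authors.
documents = {
    "d1": ["A", "B", "C"],
    "d2": ["A", "D"],
    "d3": ["B", "E"],
}


def subgroups(author, docs):
    """Return one co-author set per document that `author` appears in."""
    return {
        doc_id: set(authors) - {author}
        for doc_id, authors in docs.items()
        if author in authors
    }


def group(author, docs):
    """Return everyone who co-authored with `author` in any document.

    Assumption: the group is taken to be the union of all subgroups.
    """
    members = set()
    for coauthors in subgroups(author, docs).values():
        members |= coauthors
    return members


if __name__ == "__main__":
    print(subgroups("A", documents))
    print(sorted(group("A", documents)))
```

Under this reading, a model that conditions only on a single document sees one subgroup at a time, whereas the group view pools co-authorship evidence for the same author across documents.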

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Daud, A. (2011). Exploiting Explicit Semantics-Based Grouping for Author Interest Finding. In: Du, X., Fan, W., Wang, J., Peng, Z., Sharaf, M.A. (eds) Web Technologies and Applications. APWeb 2011. Lecture Notes in Computer Science, vol 6612. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20291-9_7

  • DOI: https://doi.org/10.1007/978-3-642-20291-9_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-20290-2

  • Online ISBN: 978-3-642-20291-9

  • eBook Packages: Computer Science, Computer Science (R0)
