Group topic modeling for academic knowledge discovery

Daud, Ali; Muhammad, Faqir

doi:10.1007/s10489-011-0302-3

Published: 11 June 2011

Applied Intelligence, Volume 36, pages 870–886 (2012)

Ali Daud¹ & Faqir Muhammad¹,²

Abstract

Conference mining and expert finding are useful academic knowledge discovery problems from an academic recommendation point of view. Group level (GL) topic modeling captures richer text semantics and relationships, which results in denser topics. Denser topics are more useful for academic knowledge discovery than the sparser topics produced by Element level (EL) or Document level (DL) topic modeling. Previous methods performed academic knowledge discovery by using network connectivity (links only, not the text of documents), keyword-based matching (no semantics), or the semantics-based intrinsic structure of the words shared between documents (semantics at DL), while ignoring the semantics-based intrinsic structure of the words and relationships between conferences (semantics at GL). In this paper, we consider the semantics-based intrinsic structure of the words and relationships present in conferences (richer text semantics and relationships) by modeling at GL. We propose group topic modeling methods based on Latent Dirichlet Allocation (LDA). Detailed empirical evaluation shows that the proposed GL methods significantly outperform DL methods on the conference mining and expert finding problems.
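
The contrast between DL and GL modeling can be illustrated with a minimal sketch. It is not the authors' method: it assumes that a group is simply the pooled text of all papers published at one conference, and it only compares how densely the resulting word-count vectors cover the vocabulary before any topic model is fitted. The toy corpus and the function names (document_level_units, group_level_units, mean_density) are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus: (conference, tokens of one paper).
# These are illustrative values, not data from the article.
papers = [
    ("KDD", ["mining", "graph", "cluster"]),
    ("KDD", ["mining", "topic", "model"]),
    ("SIGIR", ["retrieval", "query", "ranking"]),
    ("SIGIR", ["retrieval", "topic", "language"]),
]


def document_level_units(papers):
    """DL view: one bag of words per paper."""
    return [Counter(tokens) for _, tokens in papers]


def group_level_units(papers):
    """GL view (assumed here): pool all papers of a conference into one bag."""
    groups = defaultdict(Counter)
    for conference, tokens in papers:
        groups[conference].update(tokens)
    return dict(groups)


def mean_density(units, vocab_size):
    """Average fraction of the vocabulary with a non-zero count per unit."""
    return sum(len(unit) for unit in units) / (len(units) * vocab_size)


vocab = {word for _, tokens in papers for word in tokens}
dl = document_level_units(papers)
gl = group_level_units(papers)

dl_density = mean_density(dl, len(vocab))
gl_density = mean_density(list(gl.values()), len(vocab))

print(f"DL units: {len(dl)}, mean density: {dl_density:.3f}")
print(f"GL units: {len(gl)}, mean density: {gl_density:.3f}")
```

On this toy corpus the pooled conference units cover more of the vocabulary than the individual papers do. Either set of count vectors could then be passed to an LDA implementation; how the proposed GL methods actually define and model groups is described in the full article, not in this sketch.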

Author information

Authors and Affiliations

Department of Computer Science, Sector H-10, International Islamic University, Islamabad, 44000, Pakistan
Ali Daud & Faqir Muhammad
Department of Business Administration, Sector E-9, Air University, Islamabad, 44000, Pakistan
Faqir Muhammad

Corresponding author

Correspondence to Ali Daud.

About this article

Cite this article

Daud, A., Muhammad, F. Group topic modeling for academic knowledge discovery. Appl Intell 36, 870–886 (2012). https://doi.org/10.1007/s10489-011-0302-3

Issue Date: June 2012