Uncovering research trends and topics of communities in machine learning

Multimedia Tools and Applications

Abstract

This paper aims to uncover the research topics of machine learning research communities in a scientific collaboration network (SCN), with a view to improving intelligence-based systems such as retrieval and recommendation. Existing research focuses mainly on community evolution and on measuring typical features of the network. How to identify the research interests of the communities, along with those of the authors in each community, remains unexplored. A dataset is prepared consisting of 21,906 scientific articles published from 1988 to 2017 in six top journals in the field of machine learning. An integrated approach combining the author-topic (AT) model with the Communities through Directed Affiliations (CoDA) method is explored to identify the research interests of the communities in the scientific collaboration network. The top-ranked communities are identified using the CRank network community prioritization method. Finally, the similarity and dissimilarity of research interests between communities across decades are uncovered using cosine similarity. The experimental results demonstrate the effectiveness of the proposed technique. This study may help new researchers explore the research trends and topics in machine learning research communities.
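The final step of the abstract, comparing the research interests of communities across decades with cosine similarity, can be illustrated with a minimal sketch. It assumes each community is summarized as a probability vector over a shared set of topics; the vectors, their values, and the names `community_1990s`, `community_2000s`, and `community_2010s` are hypothetical and are not taken from the paper's data or code.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Convention chosen for this sketch: a zero vector is similar to nothing.
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical topic distributions over four shared topics (made-up values).
community_1990s = [0.70, 0.20, 0.10, 0.00]
community_2000s = [0.60, 0.30, 0.10, 0.00]
community_2010s = [0.00, 0.10, 0.20, 0.70]

# Higher values indicate more similar research interests.
sim_close = cosine_similarity(community_1990s, community_2000s)
sim_far = cosine_similarity(community_1990s, community_2010s)
print(f"1990s vs 2000s: {sim_close:.3f}")
print(f"1990s vs 2010s: {sim_far:.3f}")
```

For non-negative topic weights, the score lies between 0 (no shared topics) and 1 (identical topic mix), so a low score between two decades suggests a shift in a community's research interests.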

Author information

Corresponding author

Correspondence to Deepak Sharma.

About this article

Cite this article

Sharma, D., Kumar, B., Chand, S. et al. Uncovering research trends and topics of communities in machine learning. Multimed Tools Appl 80, 9281–9314 (2021). https://doi.org/10.1007/s11042-020-10072-8

  • DOI: https://doi.org/10.1007/s11042-020-10072-8
