How we collaborate: characterizing, modeling and predicting scientific collaborations

Scientometrics

Abstract

The large number of publicly available bibliographic repositories on the web provides great opportunities to study the scientific behavior of scholars. This paper studies how we collaborate, models the dynamics of collaborations, and predicts future collaborations among authors. We investigate collaborations in three disciplines (physics, computer science and information science) and examine different kinds of features that may influence the creation of collaborations. Path-based features are found to be particularly useful in predicting collaborations. Moreover, the combination of path-based and attribute-based features achieves almost the same performance as the combination of all features considered. Inspired by these findings, we propose an agent-based model to simulate the dynamics of collaborations. The model merges the ideas of network structure and node attributes by leveraging a random walk mechanism and interest similarity. Empirical results show that the model can reproduce a number of realistic and critical network statistics and patterns. We further apply the model to predict collaborations in an unsupervised manner and compare it with several state-of-the-art approaches. The proposed model achieves the best predictive performance among the random baseline and the other approaches considered. The results suggest that both network structure and node attributes may play an important role in shaping the evolution of collaboration networks.
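
To make the idea of combining network structure with node attributes concrete, the following is a minimal sketch, not the model proposed in the paper. It assumes that structural proximity can be approximated by a random walk with restart on the co-authorship graph and that interests can be represented as keyword-weight dictionaries compared by cosine similarity. The function names, the mixing weight `alpha`, the restart probability, and the toy data are all hypothetical choices made for illustration.

```python
import math


def random_walk_with_restart(adj, source, restart=0.3, iters=100):
    """Approximate visiting probabilities of a walker that jumps back to
    `source` with probability `restart` at every step (power iteration)."""
    prob = {n: 0.0 for n in adj}
    prob[source] = 1.0
    for _ in range(iters):
        nxt = {n: 0.0 for n in adj}
        for node, neighbors in adj.items():
            moving = (1.0 - restart) * prob[node]
            if not neighbors:
                # Isolated node: return its mass to the source.
                nxt[source] += moving
                continue
            share = moving / len(neighbors)
            for m in neighbors:
                nxt[m] += share
        nxt[source] += restart  # total mass is 1, so restart mass is `restart`
        prob = nxt
    return prob


def cosine(u, v):
    """Cosine similarity between two sparse keyword-weight dictionaries."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_candidates(adj, interests, author, alpha=0.5, restart=0.3):
    """Rank non-coauthors of `author` by a weighted mix of random-walk
    proximity (structure) and interest similarity (attributes).
    `alpha` is an assumed mixing weight, not a value from the paper."""
    proximity = random_walk_with_restart(adj, author, restart)
    scores = {}
    for other in adj:
        if other == author or other in adj[author]:
            continue  # skip the author and existing collaborators
        similarity = cosine(interests[author], interests[other])
        scores[other] = alpha * proximity[other] + (1.0 - alpha) * similarity
    return sorted(scores, key=scores.get, reverse=True)


# Toy co-authorship graph (undirected, stored as adjacency sets).
coauthors = {
    "a": {"b"},
    "b": {"a", "c", "d"},
    "c": {"b"},
    "d": {"b", "e"},
    "e": {"d"},
}
# Toy interest profiles (keyword -> weight).
interests = {
    "a": {"networks": 1.0},
    "b": {"networks": 0.5, "optics": 0.5},
    "c": {"networks": 1.0},
    "d": {"optics": 1.0},
    "e": {"optics": 1.0},
}

ranking = rank_candidates(coauthors, interests, "a")
print(ranking)
```

In this toy setting the score is unsupervised: no labeled collaborations are needed, only the current graph and the interest profiles. Setting `alpha` to 1 reduces the ranking to pure structural proximity, and setting it to 0 reduces it to pure interest similarity.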

Acknowledgments

This work is partially supported by grants from the Natural Science Foundation of China (Nos. 61277370, 61402075), the Natural Science Foundation of Liaoning Province, China (No. 201202031), and the Fundamental Research Funds for the Central Universities.

Author information

Corresponding author

Correspondence to Xiaoling Sun.

About this article

Cite this article

Sun, X., Lin, H., Xu, K. et al. How we collaborate: characterizing, modeling and predicting scientific collaborations. Scientometrics 104, 43–60 (2015). https://doi.org/10.1007/s11192-015-1597-3

  • DOI: https://doi.org/10.1007/s11192-015-1597-3
