The taxonomy of research collaboration in science and technology: evidence from mechanical research through probabilistic clustering analysis

Scientometrics

Abstract

This paper proposes an empirical framework for classifying research collaboration activities. It develops indicators that build on an earlier theoretical framework (Wagner [Science and Technology Policy for Development, Dialogues at the Interface, 2006]; Wagner et al. [Linking effectively: Learning lessons from successful collaboration in science and technology. DB-345-OSTP, 2002]) and applies the Gaussian mixture model, a probabilistic clustering method. Grounding the method in evidence from actual collaboration, the paper also proposes an exploratory analysis for managing and evaluating research projects according to their classification, viewed from a preceding perspective of research collaboration and R&D management. The results show that international collaboration tends to be associated with more evenly committed collaboration, and that collaboration with a higher degree of funding or more dispersed commitments generally yields larger outcomes than research clustered on the opposite side of the framework.

Notes

  1. Wagner (2006) lists the major motivations that drive international collaboration: increasing visibility among peers, exploiting complementary capabilities, sharing the costs of projects that are large in scale or scope, obtaining access to or sharing expensive physical resources, achieving greater leverage by sharing data, and exchanging ideas in order to encourage greater creativity.

  2. European Organization for Nuclear Research.

  3. International project to design and build an experimental fusion reactor.

  4. International Space Station project.

  5. Human Frontier Science Program.

  6. Human Genome Project.

  7. Intergovernmental Panel on Climate Change.

  8. Arctic research.

  9. Ocean Drilling Program.

  10. Wagner (2006) uses this frame to set out a taxonomy of international collaboration.

  11. The Akaike information criterion (AIC), the Bayesian information criterion (BIC), and the minimum description length (MDL) criterion are commonly used as model selection criteria.

  12. The MDL criterion used in this study, the two-part MDL code suggested by Rissanen (1983), is formally identical to the BIC (Schwarz 1978), although the two criteria are conceptually different.

  13. In Korea, these megascience programs are performed primarily by other institutes: the Korea Aerospace Research Institute, the National Fusion Research Institute, or the Pohang Accelerator Laboratory.

  14. To meet ordinary national agendas, the research topics of the department to which the co-authors belong (the department of system dynamics) focus mainly on designs that reduce the noise created by industrial equipment, rather than on such an academic topic.

  15. The impact factor is gathered from SCI papers only. This may be considered a source of bias, but no critical difference in ranking relative to SCI publications can be found between the upper-half and lower-half groups. If anything, this reinforces the stratification between clusters.

  16. Number of observations × number of dimensions.

References

  • Abramo, G., D’Angelo, C. A., & Solazzi, M. (2011). The relationship between scientists’ research performance and the degree of internationalization of their research. Scientometrics, 86(3), 629–643.

  • Acedo, F. J., Barroso, C., Casanueva, C., & Galán, J. L. (2006). Co-authorship in management and organizational studies: An empirical and network analysis. Journal of Management Studies, 43(5), 957–983.

  • Bouman, C. A., Shapiro, M., Cook, G. W., Atkins, C. B., Cheng, H., Jennifer, G., et al. (2005). Cluster: An unsupervised algorithm for modeling Gaussian mixtures. West Lafayette: School of Electrical Engineering, Purdue University.

  • Breschi, S., & Malerba, F. (2011). Assessing the scientific and technological outcome of EU framework programmes: Evidence from the FP6 projects in the ICT field. Scientometrics, 88(1), 239–257.

  • Cheng, J., Yang, J., & Zhou, Y. (2005). A novel adaptive Gaussian mixture model for background subtraction. Lecture Notes in Computer Science, 3522, 587–593.

  • Cohen, W. M., Nelson, R. R., & Walsh, J. P. (2002). Links and impacts: The influence of public research on industrial R&D. Management Science, 48(1), 1–23.

  • Crane, D. (1972). Invisible colleges. Chicago: University of Chicago Press.

  • Crowston, K. (1994). A taxonomy of organisational dependencies and coordination mechanisms. MIT Center for Coordination Science Working Paper. Massachusetts Institute of Technology. Retrieved from http://ccs.mit.edu/ccsmainhtml. Accessed 1 June 2011.

  • Dempster, A., Laird, N., & Rubin, D. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39(1), 1–38.

  • Edge, D. (1979). Quantitative measures of communication in science: A critical review. History of Science, 17, 102–134.

  • Frame, J. D. (1987). Managing projects in organizations. How to make best use of time, techniques, and people. San Francisco: Jossey-Bass.

  • Goffman, W., & Warren, K. S. (1980). Scientific information systems and the principle of selectivity. New York: Praeger.

  • Hagstrom, W. O. (1965). The scientific community. New York: Basic Books.

  • Han, D. S., Jang, D. H., Han, S. H., & Yang, J. M. (2008). An empirical study on the impacts of public funding on the research performance of academic faculties. Korean Public Administration Review, 42(4), 265–290.

  • Hansen, M. H., & Yu, B. (2001). Model selection and the principle of minimum description length. Journal of the American Statistical Association, 96(454), 746–774.

  • Hinings, C. R., & Greenwood, R. (1996). Working together. In P. J. Frost & S. M. Taylor (Eds.), Rhythms of academic life: Personal accounts of careers in Academia (pp. 225–237). Thousand Oaks: Sage.

  • Hoegl, M., & Gemuenden, H. G. (2001). Teamwork quality and the success of innovative projects: A theoretical concept and empirical evidence. Organization Science, 12(4), 435–449.

  • Kashyap, R. L. (1980). Inconsistency of the AIC rule for estimating the order of autoregressive models. IEEE Transactions on Automatic Control, 25(5), 996–998.

  • Katz, J. S., & Martin, B. R. (1997). What is research collaboration? Research Policy, 26, 1–18.

  • Laband, D. N., & Tollison, R. D. (2000). Intellectual collaboration. Journal of Political Economy, 108, 632–662.

  • Laudel, G. (2002). What do we measure by co-authorships? Research Evaluation, 11, 3–15.

  • Liao, C. H. (2011). How to improve research quality? Examining the impacts of collaboration intensity and member diversity in collaboration networks. Scientometrics, 86(3), 747–761.

  • Lundberg, J., Tomson, G., Lundkvist, I., Skar, J., & Brommels, M. (2006). Collaboration uncovered: Exploring the adequacy of measuring university-industry collaboration through co-authorship and funding. Scientometrics, 69, 575–589.

  • Martin, B., & Salter, A. (1996). The relationship between publicly funded basic research and economic performance. Report of the science and policy research Unit. East Sussex: University of Sussex.

  • Melin, G., & Persson, O. (1996). Studying research collaboration using co-authorships. Scientometrics, 36, 363–377.

  • Narin, F., & Whitlow, E. S. (1990). Measurement of scientific cooperation and coauthorship in CEC-related areas of science (report EUR 12900). Luxembourg: Office for Official Publications of the European Communities.

  • Newman, M. E. J. (2001). Scientific collaboration networks. Physical Review E, 64. doi:10.1103/PhysRevE.64.016131.

  • Payne, J. (1995). Management of multiple simultaneous projects: A state-of-the-art review. International Journal of Project Management, 13(3), 163–168.

  • Pfeffer, J., & Salancik, G. R. (1978). The design and management of externally controlled organization. New York: Harper and Row.

  • Piette, M. J., & Ross, K. L. (1992). An analysis of the determinants of co-authorship in economics. The Journal of Economic Education, 23, 277–283.

  • Rissanen, J. (1983). A universal prior for integers and estimation by minimum description length. Annals of Statistics, 11(2), 417–431.

  • Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics, 6(2), 461–464.

  • Smith, D., & Katz, J. S. (2000). Collaborative approaches to research, HEFCE fund review of research policy and funding. East Sussex: University of Sussex.

  • Solla Price, D., & Beaver, D. (1966). Collaboration in an invisible college. American Psychologist, 21, 1011–1018.

  • Thompson, J. D. (1967). Organizations in action. New York: McGraw-Hill.

  • Traore, N., & Landry, R. (1997). On the determinants of scientists’ collaboration. Science Communication, 19, 124–140.

  • Um, I. (2011). The determinants of R&D budget. Doctoral dissertation, Kookmin University, Seoul.

  • Vafeas, N. (2010). Determinants of single authorship. EuroMed Journal of Business, 5, 332–344.

  • van Raan, A. F. J. (1998). The influence of international collaboration on the impact of research result. Scientometrics, 42, 423–428.

  • Wagner, C. S. (2005). Six case studies of international collaboration in science. Scientometrics, 62, 3–26.

  • Wagner, C. S. (2006). International collaboration in science and technology: Promises and pitfalls. In B. Louk & E. Rutger (Eds.), Science and technology policy for development, dialogues at the interface (pp. 165–176). London: Anthem Press.

  • Wagner, C. S., & Leydesdorff, L. (2005). Network structure, self-organization, and the growth of international collaboration in science. Research Policy, 34, 1608–1618.

  • Wagner, C. S., Staheli, L., Silberglitt, R., Wong, A., & Kadtke, J. (2002). Linking effectively: Learning lessons from successful collaboration in science and technology. DB-345-OSTP. Santa Monica: RAND.

  • Wagner, C., Yezril, A., & Hassell, S. (2000). International cooperation in research and development: An update to an inventory of US government spending, MR-1248. Santa Monica: RAND.

  • Woolley, A. W., Chabris, C. F., Pentland, A., Hashmi, N., & Malone, T. W. (2010). Evidence for a collective intelligence factor in the performance of human groups. Science, 330(6004), 686–688. doi:10.1126/science.1193147.

Author information

Corresponding author

Correspondence to Jae Young Choi.

Appendix

The maximum likelihood estimates of the mixture density parameters can be found via the EM algorithm as follows. AIC and MDL differ in their penalty terms. The MDL penalty reflects the total number of data values, NM (see Note 16), whereas the AIC penalty does not depend on the amount of data and therefore tends to over-fit the model. Thus, under AIC, the estimated model order C does not converge to the true value as the number of observations tends to infinity (Kashyap 1980). The MDL criterion, by contrast, finds the model order C by minimizing the MDL value, whose code length reflects both the number of data samples and the parameter vector. Consequently, unlike AIC, the MDL criterion generally avoids over-fitting and yields a consistent estimator (Bouman et al. 2005). This study adopts the MDL criterion proposed by Rissanen (1983) for order identification.
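The difference between the two penalty terms can be illustrated with a minimal sketch. It assumes a full-covariance Gaussian mixture with C clusters in M dimensions, the standard AIC form, and a two-part MDL of the form −log L + (k/2) log(NM), which is one way of writing a penalty that reflects the NM data values. The function names are hypothetical and are not taken from the original study.

```python
import math

def n_free_params(C: int, M: int) -> int:
    """Free parameters of a C-cluster, M-dimensional full-covariance mixture."""
    per_cluster = 1 + M + M * (M + 1) // 2  # weight + mean + symmetric covariance
    return C * per_cluster - 1              # weights sum to 1, so one is not free

def aic(loglik: float, k: int) -> float:
    """Standard AIC: the penalty 2k ignores the amount of data."""
    return -2.0 * loglik + 2.0 * k

def mdl(loglik: float, k: int, N: int, M: int) -> float:
    """Assumed two-part MDL form: -log L + (k / 2) * log(N * M)."""
    return -loglik + 0.5 * k * math.log(N * M)

k = n_free_params(C=3, M=2)
for N in (50, 5000):
    # Per-parameter penalty on the -log L scale: AIC stays at 1, MDL grows with NM.
    print(N, 1.0, 0.5 * math.log(N * 2), mdl(-100.0, k, N, 2))
```

Because the MDL penalty per parameter grows with log(NM) while the AIC penalty stays fixed, extra clusters become harder to justify under MDL as the data set grows.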

Direct minimization of the MDL criterion is difficult, but if the cluster membership, \( x_{n} , \) of each observation were known, then estimating the parameters \( \theta = \left( {\pi ,\mu ,W} \right) \) would be quite simple (Bouman et al. 2005).

The EM algorithm is a general method for finding the maximum likelihood estimate (MLE) of the parameters, especially when the data are incomplete or have missing values (Dempster et al. 1977). Although the EM algorithm can be applied where the data actually have missing values, it is generally applied where optimizing the likelihood function is analytically intractable but the likelihood function can be simplified by assuming the existence of, and values for, additional but missing or hidden parameters.
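As a concrete illustration of treating the cluster labels as hidden data, the following minimal sketch runs EM on a one-dimensional Gaussian mixture. It is not the implementation used in the study: the data, the starting values, the variance floor, and all function names are assumptions chosen for illustration.

```python
import math

def normal_pdf(y, mu, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (y - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def e_step(ys, pis, mus, vars_):
    """Posterior membership probabilities of each observation in each cluster."""
    resp = []
    for y in ys:
        joint = [p * normal_pdf(y, m, v) for p, m, v in zip(pis, mus, vars_)]
        total = sum(joint)
        resp.append([j / total for j in joint])
    return resp

def m_step(ys, resp, min_var=1e-6):
    """Re-estimate weights, means, and variances from the soft memberships."""
    n, n_clusters = len(ys), len(resp[0])
    pis, mus, vars_ = [], [], []
    for c in range(n_clusters):
        nc = sum(r[c] for r in resp)  # effective number of points in cluster c
        mu = sum(r[c] * y for r, y in zip(resp, ys)) / nc
        var = sum(r[c] * (y - mu) ** 2 for r, y in zip(resp, ys)) / nc
        pis.append(nc / n)
        mus.append(mu)
        vars_.append(max(var, min_var))  # floor keeps the variance positive
    return pis, mus, vars_

def fit_em(ys, pis, mus, vars_, iters=50):
    """Alternate E and M steps for a fixed number of iterations."""
    for _ in range(iters):
        resp = e_step(ys, pis, mus, vars_)
        pis, mus, vars_ = m_step(ys, resp)
    return pis, mus, vars_, e_step(ys, pis, mus, vars_)

ys = [0.9, 1.0, 1.1, 1.2, 4.8, 5.0, 5.1, 5.2]  # made-up observations
pis, mus, vars_, resp = fit_em(ys, [0.5, 0.5], [0.0, 6.0], [1.0, 1.0])
print(pis, mus, vars_)
```

With the labels unknown, each M step is still a simple weighted average, which is the simplification the hidden-data view provides.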

The EM algorithm first finds the expected value of the complete-data log-likelihood, \( Q(\theta ;\theta^{(i)} ) , \) with respect to the unknown data, \( x_{n} , \) given \( Y_{n} \) and \( \theta^{(i)} , \) which are the observations and the current parameter estimates of the clusters, respectively. Membership is represented by a probability function. This study adopts the merging approach taken by Rissanen (1983), which constrains the parameters of two clusters to be equal so as to reduce the number of clusters from C to C − 1. In other words, if two clusters, a and b, are merged into a single cluster, their mean and covariance parameters are constrained to be equal, as denoted in Eqs. A.1 and A.2.

$$ \mu_{a} = \mu_{b} = \mu_{(a,b)} , \tag{A.1} $$
$$ W_{a} = W_{b} = W_{(a,b)} . \tag{A.2} $$

The pair to be merged is selected using the distance function of Eq. A.3 and the rule of Eq. A.4:

$$ d(a,b) = Q(\theta^{*} ;\theta^{(i)} ) - Q(\theta^{*}_{(a,b)} ;\theta^{(i)} ), \tag{A.3} $$
$$ \left( {a^{*} ,b^{*} } \right) = \mathop {\arg \min }\limits_{(a,b)} d(a,b), \tag{A.4} $$

where \( \mu_{(a,b)} \) and \( W_{(a,b)} \) denote the mean and covariance of the new cluster and \( d(a,b) \) denotes the distance function. In addition, \( \theta^{*} \) and \( \theta^{*}_{(a,b)} \) are the unconstrained and constrained optima. In particular, if the EM algorithm has been run to convergence for a fixed order \( C , \) then \( \theta^{*} \) equals \( \theta^{(i)} , \) which satisfies \( Q(\theta^{(i)} ;\theta^{(i)} ) - Q(\theta^{*} ;\theta^{(i)} ) = 0 \). The value of \( \theta^{*}_{(a,b)} \) is obtained by maximizing \( Q(\theta_{(a,b)} ;\theta^{(i)} ) \) as a function of \( \theta_{(a,b)} , \) subject to the constraints.

The cluster pair \( (a^{*} ,b^{*} ) \) that minimizes the distance function, and hence an upper bound on the change in the MDL criterion over all pairs, is then merged. The parameters of this merged cluster are calculated and used as the initial value for another EM optimization process with C − 1 clusters (Bouman et al. 2005).
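The merge step can be sketched for the one-dimensional case. The sketch assumes Gaussian clusters with the membership probabilities held fixed, so that the constrained optimum is the pooled mean and variance and the loss in Q reduces to a weighted sum of log variance ratios. This closed form, the example parameters, and the function names are assumptions for illustration and are not code from the study.

```python
import math
from itertools import combinations

def merge_params(pa, mua, va, pb, mub, vb):
    """Weight, mean, and variance of the cluster formed by merging a and b."""
    p = pa + pb
    mu = (pa * mua + pb * mub) / p
    # Pooled variance: within-cluster variance plus spread of the two means.
    var = (pa * (va + (mua - mu) ** 2) + pb * (vb + (mub - mu) ** 2)) / p
    return p, mu, var

def merge_distance(N, pa, mua, va, pb, mub, vb):
    """Assumed closed form of d(a,b): loss in Q when a and b share parameters."""
    _, _, var = merge_params(pa, mua, va, pb, mub, vb)
    return 0.5 * N * (pa * math.log(var / va) + pb * math.log(var / vb))

def best_pair(N, pis, mus, vars_):
    """Pair (a*, b*) with the smallest merge distance, as in Eq. A.4."""
    def dist(ab):
        a, b = ab
        return merge_distance(N, pis[a], mus[a], vars_[a], pis[b], mus[b], vars_[b])
    return min(combinations(range(len(pis)), 2), key=dist)

# Made-up converged parameters for C = 3 clusters fitted to N = 100 observations.
pis, mus, vars_ = [0.4, 0.35, 0.25], [0.0, 0.3, 5.0], [1.0, 1.2, 0.8]
a, b = best_pair(100, pis, mus, vars_)
merged = merge_params(pis[a], mus[a], vars_[a], pis[b], mus[b], vars_[b])
print((a, b), merged)
```

The merged parameters would then seed the next EM run with C − 1 clusters, and the order with the smallest MDL value across runs would be retained.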

About this article

Cite this article

Jeong, S., Choi, J.Y. The taxonomy of research collaboration in science and technology: evidence from mechanical research through probabilistic clustering analysis. Scientometrics 91, 719–735 (2012). https://doi.org/10.1007/s11192-012-0686-9

  • DOI: https://doi.org/10.1007/s11192-012-0686-9
