Collaboration networks are elegant representations for studying the dynamical processes that shape the scientific community. In this paper, we are particularly interested in the local context of a node in a collaboration network, which can help explain the behavior of an author both as an individual within a group and as a member acting along with the group. Such local contextual substructures in a collaboration network are best represented by “network motifs”. In particular, we propose two fundamental goodness measures of a group represented by a motif: productivity and longevity. We observe that while the 4-semi clique motif, quite strikingly, shows the highest longevity, the 4-star and 4-clique motifs are the most productive among all the motifs. Based on the productivity distribution of the motifs, we propose a predictive model that successfully separates highly cited authors from the rest. Further, we study the characteristic features of motifs and show how they relate to the two goodness measures. Building on these observations, we finally propose two supervised classification models to predict, early in a researcher’s career, how long the group to which she belongs will persist (longevity) and how productive that group will be. This empirical study thus lays the foundation for a recommendation system that would forecast how long-lasting and productive a given collaboration could be in the future.

The code is publicly available at https://github.com/remenberl/KDDCup2013.
Note that we refer to the rare class as the negative class and the frequently observed class as the positive class in the rest of the paper.
T. Chakraborty was financially supported by a Google India Ph.D. Fellowship.
