Abstract
The Korean National Science & Technology Information Service (NTIS) evaluates national R&D projects and provides information on the evaluated projects together with their participating researchers. It also recommends and selects evaluation committees for these projects. The evaluation process for national R&D projects must be transparent, and one important way to ensure this is to recommend committee members who are not acquainted with the participants of the project under evaluation. In this paper, we present an evaluation-committee recommendation system that uses an online method for detecting researcher connections, based on a partitioning-based clustering algorithm and random walks. The clustering algorithm partitions the researcher network into a number of small graphs that can be processed via random walks. We then rank each suspicious researcher by connection weight relative to the researcher in charge of an R&D project, and exclude researchers with higher connection weights from that project's evaluation committee. In addition, we present a text-data refinement and entity identification method that uses the Jaro–Winkler distance algorithm to construct a more precise researcher network.
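As a rough illustration of the entity identification step mentioned above, the following sketch computes the standard Jaro–Winkler similarity between two name strings (the corresponding distance is commonly taken as one minus this value). It is not the system's actual implementation: the function names, the prefix scale `p = 0.1`, the four-character prefix cap, and the `same_researcher` threshold of 0.9 are assumptions chosen for the example.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]; 1.0 means the strings are identical."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters match only if they are equal and no farther apart than this window.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order in the two strings.
    out_of_order = 0
    k = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order / 2  # number of transpositions
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1: str, s2: str, p: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted by the length of the common prefix."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return base + prefix * p * (1.0 - base)


def same_researcher(name_a: str, name_b: str, threshold: float = 0.9) -> bool:
    """Hypothetical rule: treat two name strings as one entity above a threshold."""
    return jaro_winkler(name_a.lower(), name_b.lower()) >= threshold
```

In a refinement pipeline of this kind, a check such as `same_researcher` would typically be combined with other fields (affiliation, e-mail address) before two records are merged, since name similarity alone cannot separate distinct researchers who share a name.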








Acknowledgments
This research was supported by "Maximize the Value of National Science and Technology by Strengthen Sharing/Collaboration of National R&D Information", funded by the Korea Institute of Science and Technology Information (KISTI).
About this article
Cite this article
Jeong, H., Kim, YK. & Kim, J. An evaluation-committee recommendation system for national R&D projects using social network analysis. Cluster Comput 19, 921–930 (2016). https://doi.org/10.1007/s10586-016-0545-1
DOI: https://doi.org/10.1007/s10586-016-0545-1