Abstract
Research on understanding and strengthening the capabilities of researchers is actively ongoing. Such work requires accurate diagnosis and analysis of individual researchers, so data on each researcher must be collected and attributed to the correct person in a big-data environment. Consequently, researcher-name identification has emerged as an important issue. This paper proposes a framework for collecting, refining, identifying, and publicly offering researcher data. To identify authors' names, the proposed framework extracts timeline-based patterns that help identify authors who share the same name, using representative attributes such as emails and affiliations. Using these timeline patterns, the proposed framework achieves an average author-identification rate of 69.5% for a group of otherwise unidentified authors.
Ethics declarations
Conflict of Interest
The authors declare that there is no conflict of interest regarding the publication of this paper.
About this article
Cite this article
Gim, J., Jang, Y., Jung, H. et al. Feature-Based Researcher Identification Framework Using Timeline Data. Wireless Pers Commun 91, 1653–1667 (2016). https://doi.org/10.1007/s11277-016-3662-5
DOI: https://doi.org/10.1007/s11277-016-3662-5