Abstract
This article describes how to use data-driven artificial intelligence (AI) to automate researcher assessment using data from profiling systems. We consider that a researcher assessment is always done for a purpose and cannot be divorced from a specific target placement. We formulate researcher assessment as a binary classification task: a candidate researcher is classified as either fit or unfit for a given placement. To classify researchers, we adopt case-based reasoning, a transparent AI methodology that implements analogical reasoning and supports adaptation, machine learning, and explainability. This work addresses a human limitation through AI. Given a small number of candidates for a job or award and a clear job description, human decision makers may be capable of selecting the best-fit candidate, yet their decisions may be neither transparent nor reproducible. The approach in this article uses AI methods to select the best-fit candidate from a job description while considering career trajectories, providing explanations, and remaining reproducible. We describe the implementation of the methodology for a hypothetical placement in a real research institute, using real but anonymized curricula vitae from the Brazilian Lattes Database. We describe an experiment demonstrating that the purpose-oriented approach is more accurate than purpose-independent classifiers. The proposed methodology meets various principles of the Leiden Manifesto.
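To make the formulation above concrete, the following is a minimal sketch of case-based classification of a candidate as fit or unfit for a placement. It is not the authors' implementation, which is not publicly available. The feature names, the weights (standing in for the purpose of the placement), the case base, and the functions `similarity` and `classify` are all hypothetical, and feature values are assumed to be scaled to [0, 1]. The retrieved cases are returned together with the label so that they can serve as an explanation of the decision.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Case:
    """A past candidate: hypothetical features scaled to [0, 1] and a known label."""
    name: str
    features: Dict[str, float]
    fit: bool


def similarity(a: Dict[str, float], b: Dict[str, float],
               weights: Dict[str, float]) -> float:
    """Weighted similarity in [0, 1]: one minus the weighted mean absolute difference."""
    total = sum(weights.values())
    distance = sum(w * abs(a[f] - b[f]) for f, w in weights.items())
    return 1.0 - distance / total


def classify(query: Dict[str, float], case_base: List[Case],
             weights: Dict[str, float], k: int = 3
             ) -> Tuple[bool, List[Tuple[float, Case]]]:
    """Retrieve the k most similar cases and reuse their majority label.

    Returns the predicted label and the retrieved (similarity, case) pairs.
    """
    ranked = sorted(
        ((similarity(query, c.features, weights), c) for c in case_base),
        key=lambda pair: pair[0],
        reverse=True,
    )
    retrieved = ranked[:k]
    votes_fit = sum(1 for _, c in retrieved if c.fit)
    return votes_fit * 2 > len(retrieved), retrieved


# Hypothetical placement: the weights encode what matters for this purpose.
weights = {"publications": 0.5, "projects": 0.3, "supervision": 0.2}

# Hypothetical case base of previously assessed candidates.
case_base = [
    Case("A", {"publications": 0.9, "projects": 0.8, "supervision": 0.7}, True),
    Case("B", {"publications": 0.8, "projects": 0.9, "supervision": 0.6}, True),
    Case("C", {"publications": 0.2, "projects": 0.1, "supervision": 0.3}, False),
    Case("D", {"publications": 0.3, "projects": 0.2, "supervision": 0.1}, False),
]

query = {"publications": 0.85, "projects": 0.7, "supervision": 0.8}
label, retrieved = classify(query, case_base, weights, k=3)

print("fit" if label else "unfit")
for score, case in retrieved:
    # Each retrieved case is part of the explanation for the decision.
    print(f"{case.name}: similarity={score:.3f}, fit={case.fit}")
```

Changing the weights changes which past cases count as similar, which is one simple way to picture how a classification can depend on the purpose of the assessment. Because the decision is a deterministic function of the case base and the weights, repeating the query reproduces the same label and the same supporting cases.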
Availability of data and material
The data are not made available because they are subject to intellectual property restrictions. They were obtained through a partnership for this work.
Code availability
Authors are not making code available because the implementation is not generic enough for reuse.
References
Aamodt, A., & Plaza, E. (1994). Case-based reasoning: Foundational issues, methodological variations, and system approaches. AI Communications, 7(1), 39–59.
Arrieta, A. B., Díaz-Rodríguez, N., Del Ser, J., Bennetot, A., Tabik, S., Barbado, A., et al. (2020). Explainable artificial intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI. Information Fusion, 58, 82–115.
Bennett, B. M., & Underwood, R. E. (1970). 283. Note: On McNemar’s test for the 2 × 2 table and its power function. Biometrics, 26(2), 339–343. https://doi.org/10.2307/2529083.
Byrne, R. M. J. (2019). Counterfactuals in explainable artificial intelligence (XAI): Evidence from human reasoning. In Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19) (pp. 6276–6282).
CNPq (2020). National Council for Scientific and Technological Development. Plataforma Lattes. Accessed 28 Nov 2020 http://lattes.cnpq.br
DGP (2020). Directory of research groups in Brazil. Microcephaly Epidemic Research Group. Accessed 28 Nov 2020 http://dgp.cnpq.br/dgp/espelhogrupo/2723404431935999
Doyle, D., Cunningham, P., & Walsh, P. (2006). An evaluation of the usefulness of explanation in a CBR system for decision support in bronchiolitis treatment. Computational Intelligence Journal, 22(3–4), 269–281.
Duarte, K. B., Weber, R. O., & Pacheco, R. C. S. (2016). Case-based comparison of career trajectories. In ICCBR 2016 Workshops (Vol. 1815, pp. 152–161). Atlanta, GA, USA.
Duarte, K. B. (2017). Assessing researcher quality for collaborative purposes. Doctoral dissertation, Federal University of Santa Catarina (UFSC), Florianópolis, Brazil. Retrieved from http://btd.egc.ufsc.br/?p=2495
Microcephaly Epidemic Research Group. (2016). Microcephaly in infants, Pernambuco state, Brazil, 2015. Emerging Infectious Diseases, 22(6), 1090–1093.
Hall, M. A. (1999). Correlation-based Feature Selection for Machine Learning. Doctoral dissertation, University of Waikato, Hamilton, New Zealand. https://doi.org/10.1.1.37.4643
Hicks, D., Wouters, P., Waltman, L., de Rijcke, S., & Rafols, I. (2015). The Leiden Manifesto for research metrics. Nature, 520(7548), 429–431.
Johs, A., Lutts, M., & Weber, R. (2018). Measuring explanation quality in XCBR. In Proceedings of ICCBR 2018 (pp. 75–78).
Kenny, E. M., & Keane, M. T. (2019). Twin-systems to explain artificial neural networks using case-based reasoning: Comparative tests of feature-weighting methods in ANN-CBR twins for XAI. In Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19) (pp. 2708–2715).
Kim, B., Rudin, C., & Shah, J. (2014). The Bayesian case model: A generative approach for case-based reasoning and prototype classification. In Advances in neural information processing systems (pp. 1952–1960). arXiv:1503.01161v1 [stat.ML]. https://proceedings.neurips.cc/paper/2014/file/390e982518a50e280d8e2b535462ec1f-Paper.pdf.
Kohavi, R., & John, G. H. (1997). Wrappers for feature subset selection. Artificial Intelligence, 97(1–2), 273–324.
Leake, D. (1992). Evaluating Explanations: A Content Theory. Hillsdale, NJ: Lawrence Erlbaum.
Liu, B., Dai, Y., Li, X., Lee, W. S., & Yu, P. S. (2003). Building Text Classifiers Using Positive and Unlabeled Examples. In Proceedings of the Third IEEE International Conference on Data Mining (p. 179). USA: IEEE Computer Society.
Michalski, R. S., Carbonell, J. G., & Mitchell, T. M. (Eds.). (2013). Machine learning: An artificial intelligence approach. Berlin: Springer.
Moor, J. (2006). The Dartmouth College artificial intelligence conference: The next fifty years. AI Magazine, 27(4), 87. https://doi.org/10.1609/aimag.v27i4.1911.
European Parliament (2016). Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation).
Richter, M. M., & Weber, R. O. (2013). Case-based reasoning: A textbook. Berlin: Springer-Verlag.
Sørmo, F., Cassens, J., & Aamodt, A. (2005). Explanation in case-based reasoning–perspectives and goals. Artificial Intelligence Review, 24(2), 109–143.
Wettschereck, D., Aha, D. W., & Mohri, T. (1997). A review and empirical evaluation of feature weighting methods for a class of lazy learning algorithms. Artificial Intelligence Review, 11, 273–314. https://doi.org/10.1023/A:1006593614256.
Acknowledgements
The authors thank and acknowledge Professor Roberto C. S. Pacheco for his contributions to earlier versions of this work and for the use of anonymized data from the Lattes Database through the agreement between the Stela Institute and the Goiás State Research Support Foundation (FAPEG), Brazil, where Kedma B. Duarte held a position at the time of this research.
Funding
The authors were not funded during the period this research was conducted.
Author information
Contributions
Both authors contributed equally. The first author was primarily responsible for the conception of the article, and the second author was primarily responsible for conducting the experiments.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Weber, R.O., Duarte, K.B. Data-driven artificial intelligence to automate researcher assessment. Scientometrics 126, 3265–3281 (2021). https://doi.org/10.1007/s11192-020-03859-x
DOI: https://doi.org/10.1007/s11192-020-03859-x
Keywords
- Researcher assessment
- Profiling systems
- Case-based reasoning
- Machine learning
- Leiden manifesto
- Career trajectory
Mathematics Subject Classification
- 03–04 Explicit machine computation and programs
- 01–08 Computational methods
- 68T05 Learning and adaptive systems
- 91E40 Memory and learning