Abstract
This paper explores the gender gap in the actuarial research community using data science tools. Web scraping tools were used to build a database of publications from six major actuarial journals. The database records article titles, authors’ names, publication year, volume, and number of citations for the period 2005–2018. Gender classification based on each author’s name was performed with tools available in the R software. We then carried out a social network analysis by gender to examine the collaborative structure and other forms of interaction within the actuarial research community. A Poisson mixture model was used to identify major clusters with respect to the frequency of citations by gender across the six journals. The analysis showed that women’s publishing and citation networks are more isolated and have fewer ties than men’s networks. The paper contributes to the broader literature on the “Matthew effect” in academia. We hope that our study will improve understanding of the gender gap within the actuarial research community and initiate a discussion that leads to strategies for a more diverse, inclusive, and equitable community.
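To make the clustering step concrete, the following is a minimal sketch of how a finite Poisson mixture can group citation counts, fitted with the EM algorithm. It is not the authors' code: the paper's analysis was carried out in R, whereas this sketch uses plain Python, and the function names (`poisson_pmf`, `fit_poisson_mixture`), the initialisation scheme, and the citation counts are all hypothetical and chosen only for illustration.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability of count k at rate lam, computed in log space."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def fit_poisson_mixture(counts, n_components=2, n_iter=200, tol=1e-8):
    """Fit a finite mixture of Poisson distributions by EM (illustrative only).

    Returns the mixing weights, the component rates, and the posterior
    membership probabilities (one row per observation).
    """
    counts = list(counts)
    n = len(counts)
    lo, hi = min(counts), max(counts)
    # Assumed initialisation: spread the starting rates over the data range.
    rates = [lo + (hi - lo) * (j + 0.5) / n_components + 1e-3
             for j in range(n_components)]
    weights = [1.0 / n_components] * n_components
    prev_loglik = -math.inf
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each count.
        resp = []
        loglik = 0.0
        for k in counts:
            joint = [w * poisson_pmf(k, lam) for w, lam in zip(weights, rates)]
            total = sum(joint)
            loglik += math.log(total)
            resp.append([p / total for p in joint])
        # M-step: update weights and rates from the soft assignments.
        for j in range(n_components):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            if nj > 0:
                weighted_sum = sum(r[j] * k for r, k in zip(resp, counts))
                rates[j] = max(weighted_sum / nj, 1e-9)
        # Stop once the log-likelihood no longer improves.
        if abs(loglik - prev_loglik) < tol:
            break
        prev_loglik = loglik
    return weights, rates, resp


# Made-up citation counts: a low-citation group and a high-citation group.
citations = [0, 1, 2, 1, 3, 0, 2, 1, 18, 22, 25, 20, 19, 24]
weights, rates, resp = fit_poisson_mixture(citations)
# Assign each article to the component with the highest posterior probability.
labels = [max(range(len(r)), key=r.__getitem__) for r in resp]
print(weights, rates, labels)
```

Each article is assigned to the component with the highest posterior probability, so the fitted rates can be read as the typical citation frequency of each cluster. A fuller analysis would also compare different numbers of components, for example with an information criterion.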
Acknowledgements
The authors are grateful to Dr. Bettina Grün for her valuable feedback and discussion that improved the content of this paper. We also appreciate the guidance received on using the flexmix package. Additional appreciation is extended to those participants in the 2018 Women in Statistics and Data Science Conference and the 54th Actuarial Research Conference whose support reassured us about the importance of this topic. Finally, we would like to thank the Editor and the anonymous reviewers for their suggestions and generous support.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Yu, M., Krehbiel, M., Thompson, S. et al. An exploration of gender gap using advanced data science tools: actuarial research community. Scientometrics 123, 767–789 (2020). https://doi.org/10.1007/s11192-020-03412-w
DOI: https://doi.org/10.1007/s11192-020-03412-w