Social scientists’ data sharing behaviors: Investigating the roles of individual motivations, institutional pressures, and data repositories

https://doi.org/10.1016/j.ijinfomgt.2015.04.007

Highlights

  • We propose a data sharing model based on motivational, institutional, and resource factors.

  • The research model was validated by the results of a survey of 361 social scientists.

  • Career benefit and risk significantly affect social scientists’ data sharing attitude.

  • Attitude, effort, and norm significantly influence social scientists’ data sharing.

  • Pressure from funding agencies and journals, and the availability of data repositories, need to be further encouraged.

Abstract

The purpose of this study is to identify individual, institutional, and resource factors that influence data sharing behaviors among social scientists. Given the benefits to the social science disciplines in the advancement of scholarship, and the recent data sharing policy changes of funding agencies, it is necessary to determine the factors that support and impede data sharing behaviors. A research model was developed and validated based on the results of a survey of 361 social scientists. The model is informed by the theory of planned behavior and institutional theory to map the underlying individual motivations, institutional pressures, and availability of resources facilitating social scientists’ data sharing. It was found that social scientists’ data sharing behaviors are mainly driven by personal motivations (i.e., perceived career benefit and risk, perceived effort, and attitude toward data sharing) and perceived normative pressure. Funding agencies’ pressure, journals’ pressure, and the availability of data repositories were not found to be significant factors in influencing social scientists’ data sharing. This research suggests that personal motivations and the norm of data sharing currently support social scientists’ data sharing; however, institutional pressures from funding agencies and journals, and the availability of data repositories, need to be further encouraged to better facilitate social scientists’ data sharing behaviors.

Introduction

Raw data sets have become important “information currency” for scholarly communication (Davis & Vickery, 2007). Not only do data sets add value to traditional journal publications, they also increase transparency and facilitate high-quality, continued research. Existing studies confirm, however, that data sharing in the social sciences is not widely practiced due to a variety of factors, including infrastructural and institutional barriers, ethical concerns, and personal reasons. The objective of the present study is to investigate the institutional and individual factors that influence social scientists’ data sharing behaviors and to provide a model, based on institutional theory and the theory of planned behavior, for understanding and predicting that behavior. Examining both individual motivation and institutional contexts provides a holistic view of data sharing practices across diverse social science disciplines and can inform policy-making decisions regarding the design of data archives and the practices surrounding them. As data sharing is not a norm, this study can help to identify ways to encourage and support data sharing in the social sciences.

Reasons for encouraging data sharing in the social sciences are many (Fienberg, 1994; King, 1995). Data sharing increases the transparency of quantitative analytic work, thereby lending more credibility to research findings, providing evidence to support analytic frameworks and decisions, and offering a source for researchers to consult when considering how to build upon existing studies. Having data openly available means that replication and verification are made immediately possible. Shared data allows testing different hypotheses and building better research studies. Furthermore, openly shared data facilitates participation from multiple perspectives, allowing access to the data for more disciplines and for researchers from different backgrounds. It reduces costs by avoiding the duplication of data collection efforts. Additionally, data made available through sharing contributes to the education of students. It is important to note that national scientific organizations and funding agencies have increasingly issued data archiving policies, and agencies including the National Science Foundation (NSF), National Institutes of Health (NIH), and the Institute for Museum and Library Services (IMLS) now require data sharing and management plans as part of grant applications. Given the benefits to the social science disciplines in the advancement of scholarship, and the recent data sharing policy changes of funding agencies, there is a particularly pressing need to determine the factors that impede and support data sharing practices.

Data sharing has been defined somewhat differently in various studies. For the purpose of this research, data sharing is broadly defined as an individual scientist's behavior in providing the raw (or preprocessed) data of his/her published work to other scientists, either by making the data accessible through central/local data repositories or by sending the data via personal communication methods upon request.

The first sections of this paper present the literature relevant to the study and an overview of the theoretical frameworks that support our model for data sharing behavior. Section 4 describes the development of the model and hypotheses, and Section 5 provides a description of the research method, including how the survey data were collected and used. In Section 6 we present the results and analysis of the data, and Section 7 provides a discussion, which includes an account of qualitative data derived from participants’ comments. The paper concludes by addressing limitations and considerations for further study.

Section snippets

Literature review

Researchers have examined a variety of dimensions related to data sharing. Across this research there is consensus that, although there is increasing awareness of the benefits of openly shared data, policies and standards are inconsistent across the social science disciplines and institutions, and data sharing is not a common practice among social scientists. Freese (2007) has argued that disciplines within the social sciences have different norms for data sharing, and suggests that much of the

Theoretical framework

The present study makes strides toward building a model to explain and predict data sharing behavior by building upon new institutional theory (DiMaggio and Powell, 1983; Scott, 2001) and the theory of planned behavior (Ajzen, 1991; Fishbein and Ajzen, 1975). New institutional theory accounts for the context in which individual social scientists are acting, whereas the theory of planned behavior helps to explain the underlying motivations and availability of resources facilitating social

Research model

The research model below provides a specific map of social scientists’ data sharing behaviors. The research model is designed to understand and distinguish the individual, institutional, and resource factors influencing social scientists’ data sharing behaviors. The theory of planned behavior can provide insights into how social actors’ behaviors are influenced by their attitudinal beliefs and attitude toward their behaviors and perceived behavioral controls or perceived availabilities of

Research method

A survey was used to systematically investigate the extent to which the data sharing factors identified from the theory of planned behavior and institutional theory influence social scientists’ actual data sharing behaviors. The survey data were analyzed using descriptive statistics, reliability and validity analysis, and structural analysis. Participants were allowed to provide comments, an account of which is included in Section 7 of this paper.
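
As a rough illustration of what a reliability analysis of multi-item survey constructs can involve, the sketch below computes Cronbach's alpha for one construct. This is an assumption for illustration only: the text above does not state which reliability statistic was used, and the function name `cronbach_alpha`, the variable `attitude_items`, and the response values are all hypothetical and not taken from the study.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for one construct.

    items: list of columns, one list of respondent scores per survey item.
    All columns must cover the same respondents in the same order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    # Sum of the sample variances of the individual items.
    item_variance_sum = sum(variance(column) for column in items)
    # Sample variance of each respondent's total score across items.
    totals = [sum(scores) for scores in zip(*items)]
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


# Hypothetical 5-point Likert responses (made up, not study data):
# three items measuring one construct, six respondents.
attitude_items = [
    [4, 5, 3, 4, 2, 5],
    [4, 4, 3, 5, 2, 5],
    [5, 5, 2, 4, 1, 4],
]
alpha = cronbach_alpha(attitude_items)
print(f"alpha = {alpha:.3f}")
```

Values closer to 1 indicate that the items of a construct vary together; a common rule of thumb treats roughly 0.7 and above as acceptable, though the appropriate threshold depends on the study.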

Data analysis and results

A Structural Equation Modeling (SEM) approach was employed as the primary data analysis method in order to evaluate the proposed research model and hypothesized relationships. This research used a variance-based Partial Least Squares (PLS) method rather than a covariance-based SEM method because PLS-SEM is appropriate for exploratory studies and has fewer limitations (Chin, 1998). PLS does not require normality (Hair, Black, Babin, Anderson, & Tatham, 2006), and it can be done with a small

Discussion and conclusions

This final section describes the findings and includes written comments provided by survey participants. This study found that social scientists’ data sharing behaviors are significantly driven by personal motivations (i.e., perceived career benefit and risk, perceived effort, and attitude toward data sharing) and perceived normative pressure. As data sharing in the social sciences is not a widely established research practice, and as social science researchers are more apt to engage in

Limitations and future research

Continuing research in scientific data sharing should expand upon the relationships examined in this research, as well as include closer investigations into some of the research constructs employed in this research. Future research will also need to consider organizational-level factors influencing social scientists’ data sharing behaviors, such as organizational supports and resources involved in scientists’ data sharing behaviors.

Where the research found that there were not significant

Acknowledgements

We would like to acknowledge the ProQuest Pivot for allowing us to use its Community of Scientists (CoS) Scholar Database in recruiting the survey participants.

References (99)

  • H.-D. Yang et al.

    It's all about attitude: Revisiting the technology acceptance model

    Decision Support Systems

    (2004)
  • A. Abbott

    Notes on replication

    Sociological Methods & Research

    (2007)
  • I. Ajzen et al.

    Understanding attitudes and predicting social behavior

    (1980)
  • I. Ajzen et al.

    The influence of attitudes on behavior

  • J.C. Anderson et al.

    Structural equation modeling in practice: A review and recommended two-step approach

    Psychological Bulletin

    (1988)
  • P. Arzberger et al.

    Promoting access to public research data for scientific, economic, and social development

    Data Science Journal

    (2004)
  • N.F. Awad et al.

    The personalization privacy paradox: An empirical evaluation of information transparency and the willingness to be profiled online for personalization

    MIS Quarterly

    (2006)
  • M.J. Bebeau et al.

    Authorship and publication practices in the social sciences: Historical reflections on current practices

    Science and Engineering Ethics

    (2011)
  • S. Bell

    Institutionalism

  • D. Blumenthal et al.

    Data withholding in genetics and the other life sciences: Prevalences and predictors

    Academic Medicine

    (2006)
  • G.-W. Bock et al.

    Breaking the myths of rewards: An exploratory study of attitudes about knowledge sharing

    Information Resources Management Journal

    (2002)
  • G.-W. Bock et al.

    Behavioral intention formation in knowledge sharing: Examining the roles of extrinsic motivators, social-psychological forces, and organizational climate

    MIS Quarterly

    (2005)
  • C.L. Borgman

    The digital future is now: A call to action for the humanities

    Digital Humanities Quarterly

    (2009)
  • C.L. Borgman et al.

    Little science confronts the data deluge: Habitat ecology, embedded sensor networks, and digital libraries

    International Journal on Digital Libraries

    (2007)
  • A. Burton-Jones et al.

    Reconceptualizing system usage: An approach and empirical test

    Information Systems Research

    (2006)
  • E.G. Campbell et al.

    Data-sharing and data-withholding in genetics and the life sciences: results of a national survey of technology transfer officers

    Journal of Health Care Law and Policy

    (2003)
  • E.G. Campbell et al.

    Data withholding in academic genetics – Evidence from a national survey

    Journal of the American Medical Association

    (2002)
  • S.J. Ceci

    Scientists' attitudes toward data sharing

    Science Technology & Human Values

    (1988)
  • W.W. Chin

    The partial least squares approach to structural equation modeling

    Modern Methods for Business Research

    (1998)
  • S.Y. Choi et al.

    The effects of socio-technical enablers on knowledge sharing: An exploratory examination

    Journal of Information Science

    (2008)
  • L. Cliggett

    Qualitative data archiving in the digital age: Strategies for data preservation and sharing

    The Qualitative Report

    (2013)
  • M.H. Cragin et al.

    Data sharing, small science and institutional repositories

    Philosophical Transactions of the Royal Society A – Mathematical Physical and Engineering Sciences

    (2010)
  • J.R. Craig et al.

    Psychology in action – Retention of raw data – Problem revisited

    American Psychologist

    (1973)
  • F.D. Davis

    Perceived usefulness, perceived ease of use, and user acceptance of information technology

    MIS Quarterly

    (1989)
  • F.D. Davis et al.

    User acceptance of computer technology: A comparison of two theoretical models

    Management Science

    (1989)
  • A. Dickinger et al.

    The role of perceived enjoyment and social norm in the adoption of technology with network externalities

    European Journal of Information Systems

    (2008)
  • P.J. DiMaggio et al.

    The iron cage revisited: Institutional isomorphism and collective rationality in organizational fields

    American Sociological Review

    (1983)
  • C. Emmelhainz

    Controlled vocabulary standards for anthropological datasets

    International Journal of Digital Curation

    (2014)
  • A. Field

    Discovering statistics using SPSS

    (2009)
  • S.E. Fienberg

    Sharing statistical data in the biomedical and health sciences: Ethical, institutional, legal, and professional dimensions

    Annual Review of Public Health

    (1994)
  • M. Fishbein et al.

    Belief, attitude, intention, and behavior

    (1975)
  • C. Fornell et al.

    Two structural equation models: LISREL and PLS applied to consumer exit-voice theory

    Journal of Marketing Research (JMR)

    (1982)
  • C. Fornell et al.

    Structural equation models with unobservable variables and measurement error: Algebra and statistics

    Journal of Marketing Research

    (1981)
  • N.F. Foster et al.

    Understanding faculty to improve content recruitment for institutional repositories

    D-Lib Magazine

    (2005)
  • J. Freese

    Replication standards for quantitative social science: Why not sociology?

    Sociological Methods & Research

    (2007)
  • A.H. Gold et al.

    Knowledge management: An organizational capabilities perspective

    Journal of Management Information Systems

    (2001)
  • J.F. Hair et al.

    Multivariate data analysis

    (2006)
  • P.P.M.A.R. Heugens et al.

    Structure! agency! (and other quarrels): A meta-analysis of institutional theories of organization

    Academy of Management Journal

    (2009)
  • J. Jasperson et al.

    A comprehensive conceptualization of post-adoptive behaviors associated with information technology enabled work systems

    MIS Quarterly

    (2005)

Cited by (76)

    • Incentive or disincentive for research data disclosure? A large-scale empirical analysis and implications for open science policy

      2021, International Journal of Information Management
      Citation Excerpt :

      As extensively discussed by information management and policy scholars, the information-sharing behavior of scientists may be associated with not only personal motives but also the institutional environments of the researcher (McCullough et al., 2008; Piwowar, 2011). In the context of research data sharing, borrowing the institutional theory that explains that an individual’s behavior is interactively configured by institutions (DiMaggio & Powell, 1983; North, 1990; Williamson, 2000)—the norms and rules that individuals formally or informally comply with— Kim and Adler (2015) argued that institutions in multiple venues affect research data sharing behavior of scientists. If the researcher is in a field where the norm of science is well complied by other researchers, the researcher may be more willing to disclose their data than others affected by nomadic or peer researcher pressure (Campbell et al., 2000; Kim & Adler, 2015).

    • Implications of the use of artificial intelligence in public governance: A systematic literature review and a research agenda

      2021, Government Information Quarterly
      Citation Excerpt :

      To boost research concerning public governance concerning AI, opening up underlying research data should become standard practice, a practice that was barely existent in the articles we systematically reviewed. On a general note, data reuse can lead to more findings from the same dataset (Joo, Kim, & Kim, 2017), to asking new questions (Wallis, Rolando, & Borgman, 2013), to testing different hypotheses (Kim & Adler, 2015), and to increasing the knowledge in the field (Joo et al., 2017). Both scholarly societies and funding organizations active in the domains of public governance are advised to incentivize and trigger more research, focusing on openness, rigor, and transparency in the diverse areas of AI and public governance.


    Youngseek Kim is an assistant professor in the School of Library and Information Science at the University of Kentucky. He completed his Ph.D. at Syracuse University, and he received the Eugene Garfield Doctoral Dissertation Award from the Association for Library and Information Science Education. His current research efforts in eScience try to understand the nature of scientists’ data practices, focusing on data sharing and reuse, the education of eScience professionals, and adoption and use of cyberinfrastructure. He has published articles in International Journal of Digital Curation, Journal of Education for Library and Information Science, and Journal of Computational Science Education.

    Melissa Adler is Assistant Professor in the School of Library and Information Science and Faculty Affiliate with the Committee on Social Theory at the University of Kentucky. She received her PhD from the School of Library and Information Studies with a PhD minor in Gender and Women's Studies from University of Wisconsin–Madison. She is currently serving as Chair-elect for the Classification Research Group of the Association for Information Science & Technology. Her book manuscript, Perverse Subjects: Becoming Bodies in the Library, is currently under review.

    1 Tel.: +1 859 218 2294; fax: +1 859 257 4205.
