Social scientists’ data sharing behaviors: Investigating the roles of individual motivations, institutional pressures, and data repositories
Introduction
Raw data sets have become important “information currency” for scholarly communication (Davis & Vickery, 2007). Not only do data sets add value to traditional journal publications, they also increase transparency and facilitate high-quality, continued research. Existing studies confirm, however, that data sharing in the social sciences is not widely practiced due to a variety of factors, including infrastructural and institutional barriers, ethical concerns, and personal reasons. The objective of the present study is to investigate the institutional and individual factors that influence social scientists’ data sharing behaviors and to provide a model, based on institutional theory and the theory of planned behavior, for understanding and predicting those behaviors. Examining both individual motivation and institutional contexts provides a holistic view of data sharing practices across diverse social science disciplines and can inform policy-making decisions regarding the design of, and practices surrounding, data archives. Because data sharing is not yet a norm, this study can help to identify ways to encourage and support data sharing in the social sciences.
Reasons for encouraging data sharing in the social sciences are many (Fienberg, 1994; King, 1995). Data sharing increases the transparency of quantitative analytic work, thereby lending more credibility to research findings, providing evidence to support analytic frameworks and decisions, and giving researchers a source to consult when considering how to build upon existing studies. Having data openly available means that replication and verification are made immediately possible. Shared data allow researchers to test different hypotheses and build better research studies. Furthermore, openly shared data facilitate participation from multiple perspectives, giving more disciplines and researchers from different backgrounds access to the data. Data sharing also reduces costs by avoiding the duplication of data collection efforts. Additionally, data made available through sharing contribute to the education of students. National scientific organizations and funding agencies have increasingly issued data archiving policies, and agencies including the National Science Foundation (NSF), National Institutes of Health (NIH), and the Institute of Museum and Library Services (IMLS) now require data sharing and management plans as part of grant applications. Given the benefits to the social science disciplines in the advancement of scholarship, and the recent data sharing policy changes of funding agencies, there is a particularly pressing need to determine the factors that impede and support data sharing practices.
Data sharing has been defined somewhat differently in various studies. For the purpose of this research, data sharing is broadly defined as an individual scientist's provision of the raw (or preprocessed) data underlying his or her published work to other scientists, either by making the data accessible through central or local data repositories or by sending the data via personal communication methods upon request.
The first sections of this paper present the literature relevant to the study and an overview of the theoretical frameworks that support our model for data sharing behavior. Section 4 describes the development of the model and hypotheses, and Section 5 provides a description of the research method, including how the survey data were collected and used. In Section 6 we present the results and analysis of the data, and Section 7 provides a discussion, which includes an account of qualitative data derived from participants’ comments. The paper concludes by addressing limitations and considerations for further study.
Section snippets
Literature review
Researchers have examined a variety of dimensions related to data sharing. Across this research there is consensus that, although there is increasing awareness of the benefits of openly shared data, policies and standards are inconsistent across the social science disciplines and institutions, and data sharing is not a common practice among social scientists. Freese (2007) has argued that disciplines within the social sciences have different norms for data sharing, and suggests that much of the
Theoretical framework
The present study makes strides toward building a model to explain and predict data sharing behavior by building upon new institutional theory (DiMaggio and Powell, 1983; Scott, 2001) and the theory of planned behavior (Ajzen, 1991; Fishbein and Ajzen, 1975). New institutional theory accounts for the context in which individual social scientists are acting, whereas the theory of planned behavior helps to explain the underlying motivations and availability of resources facilitating social
Research model
The research model below provides a specific map of social scientists’ data sharing behaviors. The research model is designed to identify and distinguish the individual, institutional, and resource factors influencing social scientists’ data sharing behaviors. The theory of planned behavior can provide insights into how social actors’ behaviors are influenced by their attitudinal beliefs and attitude toward their behaviors and perceived behavioral controls or perceived availabilities of
Research method
A survey was used to systematically investigate the extent to which the data sharing factors identified from the theory of planned behavior and institutional theory influence social scientists’ actual data sharing behaviors. The survey data were analyzed using descriptive statistics, reliability and validity analysis, and structural analysis. Participants were allowed to provide comments, an account of which is included in Section 7 of this paper.
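As an illustration of the reliability step mentioned above, the following is a minimal sketch of Cronbach's alpha, one widely used internal-consistency statistic for multi-item survey scales. The source does not state which reliability statistic was computed, so the choice of alpha, the function name `cronbach_alpha`, and the sample responses are all assumptions made for illustration only.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for one multi-item scale.

    `responses` is a list of rows, one row per respondent, each row
    holding that respondent's scores on the k items of the scale.
    (Hypothetical helper; not taken from the study.)
    """
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")

    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in responses])
                 for i in range(n_items)]
    # Sample variance of each respondent's summed scale score.
    total_var = variance([sum(row) for row in responses])
    if total_var == 0:
        raise ValueError("summed scores have zero variance")

    # alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)


if __name__ == "__main__":
    # Made-up 5-point Likert answers to a three-item scale (illustrative only).
    sample = [[4, 5, 4], [2, 2, 3], [5, 5, 5], [3, 4, 3], [1, 2, 2]]
    print(round(cronbach_alpha(sample), 3))
```

Values closer to 1 indicate that the items of a scale vary together; which threshold counts as acceptable is a convention that depends on the field and is not specified here.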
Data analysis and results
A Structural Equation Modeling (SEM) approach was employed as the primary data analysis method in order to evaluate the proposed research model and hypothesized relationships. This research used a variance-based Partial Least Squares (PLS) method rather than a covariance-based SEM method because PLS-SEM is appropriate for exploratory studies and has fewer limitations (Chin, 1998). PLS does not require normality (Hair, Black, Babin, Anderson, & Tatham, 2006), and it can be done with a small
Discussion and conclusions
This final section describes the findings and includes written comments provided by survey participants. This study found that social scientists’ data sharing behaviors are significantly driven by personal motivations (i.e., perceived career benefit and risk, perceived effort, and attitude toward data sharing) and perceived normative pressure. As data sharing in the social sciences is not a widely established research practice, and as social science researchers are more apt to engage in
Limitations and future research
Continuing research in scientific data sharing should expand upon the relationships examined in this research, as well as include closer investigations into some of the research constructs employed in this research. Future research will also need to consider organizational-level factors influencing social scientists’ data sharing behaviors, such as organizational supports and resources involved in scientists’ data sharing behaviors.
Where the research found that there were not significant
Acknowledgements
We would like to acknowledge ProQuest Pivot for allowing us to use its Community of Scientists (CoS) Scholar Database in recruiting the survey participants.
References (99)
The theory of planned behavior. Organizational Behavior and Human Decision Processes (1991).
Open access to data: An ideal professed but not practised. Research Policy (2014).
A feedback model to understand information system usage. Information & Management (1998).
Understanding Web-based learning continuance intention: The role of subjective task value. Information & Management (2008).
Datasets, a shift in the currency of scholarly communication: Implications for library collections and acquisitions. Serials Review (2007).
Predicting e-services adoption: A perceived risk facets perspective. International Journal of Human Computer Studies (2003).
Beliefs and attitudes affecting intentions to share information in an organizational setting. Information & Management (2003).
Personal innovativeness, social influences and adoption of wireless Internet services via mobile technology. The Journal of Strategic Information Systems (2005).
“It is what one does”: Why people participate and help others in electronic communities of practice. The Journal of Strategic Information Systems (2000).
Knowledge sharing behavior of physicians in hospitals. Expert Systems with Applications (2003).
It's all about attitude: Revisiting the technology acceptance model. Decision Support Systems.
Notes on replication. Sociological Methods & Research.
Understanding attitudes and predicting social behavior.
The influence of attitudes on behavior.
Structural equation modeling in practice: A review and recommended two-step approach. Psychological Bulletin.
Promoting access to public research data for scientific, economic, and social development. Data Science Journal.
The personalization privacy paradox: An empirical evaluation of information transparency and the willingness to be profiled online for personalization. MIS Quarterly.
Authorship and publication practices in the social sciences: Historical reflections on current practices. Science and Engineering Ethics.
Institutionalism.
Data withholding in genetics and the other life sciences: Prevalences and predictors. Academic Medicine.
Breaking the myths of rewards: An exploratory study of attitudes about knowledge sharing. Information Resources Management Journal.
Behavioral intention formation in knowledge sharing: Examining the roles of extrinsic motivators, social-psychological forces, and organizational climate. MIS Quarterly.
The digital future is now: A call to action for the humanities. Digital Humanities Quarterly.
Little science confronts the data deluge: Habitat ecology, embedded sensor networks, and digital libraries. International Journal on Digital Libraries.
Reconceptualizing system usage: An approach and empirical test. Information Systems Research.
Data-sharing and data-withholding in genetics and the life sciences: Results of a national survey of technology transfer officers. Journal of Health Care Law and Policy.
Data withholding in academic genetics – Evidence from a national survey. Journal of the American Medical Association.
Scientists' attitudes toward data sharing. Science Technology & Human Values.
The partial least squares approach to structural equation modeling. Modern Methods for Business Research.
The effects of socio-technical enablers on knowledge sharing: An exploratory examination. Journal of Information Science.
Qualitative data archiving in the digital age: Strategies for data preservation and sharing. The Qualitative Report.
Data sharing, small science and institutional repositories. Philosophical Transactions of the Royal Society A – Mathematical Physical and Engineering Sciences.
Psychology in action – Retention of raw data – Problem revisited. American Psychologist.
Perceived usefulness, perceived ease of use, and user acceptance of information technology. MIS Quarterly.
User acceptance of computer technology: A comparison of two theoretical models. Management Science.
The role of perceived enjoyment and social norm in the adoption of technology with network externalities. European Journal of Information Systems.
The iron cage revisited: Institutional isomorphism and collective rationality in organizational fields. American Sociological Review.
Controlled vocabulary standards for anthropological datasets. International Journal of Digital Curation.
Discovering statistics using SPSS.
Sharing statistical data in the biomedical and health sciences: Ethical, institutional, legal, and professional dimensions. Annual Review of Public Health.
Belief, attitude, intention, and behavior.
Two structural equation models: LISREL and PLS applied to consumer exit-voice theory. Journal of Marketing Research.
Structural equation models with unobservable variables and measurement error: Algebra and statistics. Journal of Marketing Research.
Understanding faculty to improve content recruitment for institutional repositories. D-Lib Magazine.
Replication standards for quantitative social science: Why not sociology? Sociological Methods & Research.
Knowledge management: An organizational capabilities perspective. Journal of Management Information Systems.
Multivariate data analysis.
Structure! agency! (and other quarrels): A meta-analysis of institutional theories of organization. Academy of Management Journal.
A comprehensive conceptualization of post-adoptive behaviors associated with information technology enabled work systems. MIS Quarterly.
Youngseek Kim is an assistant professor in the School of Library and Information Science at the University of Kentucky. He completed his Ph.D. at Syracuse University, and he received the Eugene Garfield Doctoral Dissertation Award from the Association for Library and Information Science Education. His current research efforts in eScience try to understand the nature of scientists’ data practices, focusing on data sharing and reuse, the education of eScience professionals, and adoption and use of cyberinfrastructure. He has published articles in International Journal of Digital Curation, Journal of Education for Library and Information Science, and Journal of Computational Science Education.
Melissa Adler is an assistant professor in the School of Library and Information Science and Faculty Affiliate with the Committee on Social Theory at the University of Kentucky. She received her Ph.D. from the School of Library and Information Studies, with a Ph.D. minor in Gender and Women's Studies, at the University of Wisconsin–Madison. She is currently serving as Chair-elect for the Classification Research Group of the Association for Information Science & Technology. Her book manuscript, Perverse Subjects: Becoming Bodies in the Library, is currently under review.
1 Tel.: +1 859 218 2294; fax: +1 859 257 4205.