What we talk about when we talk about (big) data
Introduction
Data and their effects on individuals, organisations, business models and society have, rightly, attracted growing attention in the Journal of Strategic Information Systems (Newell and Marabelli, 2015; Loebbecke and Picot, 2015; Günther et al., 2017; Markus, 2017). The immediate prompt for this attention has been the “widespread diffusion of digital devices that have the ability to monitor our everyday lives” (Newell and Marabelli, 2015: 3), a process that is referred to as “datafication”. Discussions of this phenomenon, however, have largely taken the data themselves for granted and have focused on how data “are being used, and by whom and with what consequences” (Newell and Marabelli, 2015: 3). While, as Galliers et al. (2017) argue, the uses of data raise important questions that deserve the attention of scholars in the IS field (and more widely), in this Viewpoint paper I would like to switch the focus around and consider what constitutes these data, the effects of whose accumulation we have begun to explore. What actually is it that is having these effects?
This enquiry will critically examine a number of commonly held, and often implicit, assumptions about the nature of data. In doing so I hope to extend the discussion of the datafication phenomenon beyond “its issues, impacts and implications” (Galliers et al., 2017: 188) to include an awareness of the particular character of the ‘material’ on which this phenomenon is based. A better appreciation of this character, it will be argued, may inform a richer understanding of the effects of datafication and open them to more rigorous scrutiny.
What has given questions about the nature of data a particular relevance, of course, is not just the increasing datafication of contemporary life. Rather, it is the accumulation of these data in repositories, the analysis of which, often by “pre-determined algorithms that lead to decisions that follow on directly without further human intervention” (Galliers et al., 2017: 185), is seen as transforming work, organisations and society, a development commonly referred to as “big data”.
Although, as will be discussed, “big data” are not necessarily a product of datafication and the meaning of the term itself is highly contested, there is little doubt both that the volume of data being created has greatly expanded in recent years and that techniques to analyse data at this scale have significantly advanced. The paper will therefore also examine assumptions that relate specifically to these accumulated data, not just to data themselves.
Assumptions about data
Discussions of data are bedevilled by inconsistencies in the way in which the term is defined in the literature. A Delphi study of Information Scientists by Zins (2007), for example, yielded more than 40 different definitions, while Checkland and Holwell (1998) list seven different definitions from IS textbooks. Although it is certainly beyond the scope of this paper to propose a definitive analysis of these definitions, it would nevertheless seem important to clarify some of the main
Assumptions about big data
The starting point for much discussion of big data is typically a reference to the increasing volume and velocity of data “flowing into every area of the global economy” (Manyika et al., 2011). This is often illustrated by quoting very large numbers with exa, peta and tera prefixes describing how many Facebook posts or Google searches are undertaken every minute, or comparing the volume of data produced between the dawn of civilisation and the early 2000s and the amount of data now
Questioning data
To provide a common reference point for the examination of data, the discussion will draw on examples from ongoing research on the implementation and use of electronic medical records in acute hospitals, and particularly in critical care. Electronic medical records are also widely considered to be prime targets for “big data” initiatives (Groves et al., 2013; Raghupathi and Raghupathi, 2014). This is not to suggest that these examples will be representative of all data, but that they highlight
How data come to be
Looking first at how data are produced, it would seem reasonable to maintain the assumption that data are generally intended to be referential. That is, with maybe a few exceptions, data are collected and used on the basis that they tell us something (although perhaps not everything) about the world. The initial stage in the creation of data, therefore, involves a decision on the phenomenon that they are considered to be a representation of. This decision does not necessarily have a single
How data come to be used
Even the presence of data in the record, however, does not necessarily equate to what actually gets used as data about a phenomenon and there is a further process that mediates between the two. This may also be broken down into a number of stages as shown in Fig. 3.
A necessary starting point for the use of data would seem to be some demand that they are perceived to fulfil. A clinician treating a patient, for example, seeks data that will help them in their task. The specific data they look for
Discussion and conclusion
If we consider data not as givens that are out there in the world, waiting to be gathered, but as contingent representations that are brought into being through situated practices of conceptualization, recording and use, then what might this mean for our understanding of datafication and of big data more generally? While a complete answer to this question is clearly beyond the scope of this initial account of data in practice, four broad domains of implications may be identified.
The first of
Acknowledgements
The ideas presented in this paper were developed as part of the ReCliC project on the repurposing of clinical data for quality improvement in critical care, a collaboration between the Judge Business School and the Computer Laboratory at the University of Cambridge and Royal Papworth Hospital, funded by the Health Foundation, an independent charity working to improve the quality of healthcare in the UK.
References (64)
Data Wrangling: Making data useful again. IFAC-PapersOnLine (2015).
Datification and its human, organizational and societal effects: the strategic opportunities and challenges of algorithmic decision-making. J. Strateg. Inf. Syst. (2017).
Debating big data: a literature review on realizing value from big data. J. Strateg. Inf. Syst. (2017).
Reflections on societal and business model transformation arising from digitization and big data analytics: a research agenda. J. Strateg. Inf. Syst. (2015).
Datification, organizational strategy, and IS research: what’s the score? J. Strateg. Inf. Syst. (2017).
Strategic opportunities (and challenges) of algorithmic decision-making: a call for action on the long-term societal effects of “datification”. J. Strateg. Inf. Syst. (2015).
Everything counts in large amounts: a critical realist case study on data-based production. J. Inf. Technol. (2014).
From data to wisdom. J. Appl. Syst. Anal. (1989).
The Petabyte age: because more isn’t just more – more is different. Wired (2008).
The end of theory: the data deluge makes the scientific method obsolete. Wired (2008).
Business Information Systems: Technology, Development and Management for the e-Business.
Critical questions for big data: Provocations for a cultural, technological, and scholarly phenomenon. Inf. Commun. Soc.
Information, Systems and Information Systems: Making Sense of the Field.
New games, new rules: big data and the changing context of strategy. J. Inf. Technol.
The rise of big data: how it’s changing the way we think about the world. Foreign Aff.
Pork to Performance: Open Government and Program Performance Tracking in the Philippines – Phase two.
Information Ecology.
Working within a black box: transparency in the collection and production of big twitter data. Int. J. Commun.
The coming of the new organisation. Harvard Bus. Rev.
Theorizing practice and practicing theory. Org. Sci.
Genesis and Development of a Scientific Fact.
Discipline and Punish: The Birth of the Prison.
The knowledge pyramid: a critique of the DIKW hierarchy. J. Inf. Sci.
Big data and its epistemology. J. Assoc. Inf. Sci. Technol.
Introduction.
The ‘big data’ revolution in healthcare. McKinsey Quart.
Management Information Systems for the Information Age.
Beyond “New Scientific Management?” Critical reflections on the epistemology of Evidence-based Management.
We need transparency in algorithms, but too much can backfire. Harvard Bus. Rev. Digital Art.
The Data Revolution: Big Data, Open Data, Data Infrastructures and Their Consequences.
Big data, new epistemologies and paradigm shifts. Big Data Soc.