Understanding human-data interaction: Literature review and recommendations for design
Introduction
In recent years, the world has seen a revolution in the tracking of information generated by a wide range of human activities. People's actions and preferences are being monitored in a myriad of ways. Collected data are used to influence decisions in almost all areas of individual and social life.
The understanding and analysis of large amounts of data play a critical role in everyone's empowerment. In this context, one of the greatest challenges is the complexity of the ecosystem within which data are produced, collected, edited, and used (Hornung et al., 2015). This varies from scenarios in which a single person is the producer, collector, editor, and user of data to scenarios demanding the involvement of many people. This complexity has given rise to questions such as: how to facilitate the understanding, manipulation, analysis, and sensemaking of large amounts of information?
Different fields of study are concerned with issues related to the interaction between people and data. Recently, the area of “Human-Data Interaction” (HDI) has begun to investigate how people interact with data, by analogy with how Human-Computer Interaction (HCI) investigates the relationship between people and computers (Hornung et al., 2015).
Several relevant studies on HDI have attempted to explain the objectives and frontiers of the area (Cafaro, 2012; Elmqvist, 2011; Hornung et al., 2015; Mortier et al., 2013, 2014). They introduced the problem, defined concepts, and presented some open challenges related to the subject. However, at the time of writing, we are not aware of an updated and thorough literature review.
As a first contribution of this article, we provide an overview of the HDI research area based on a thorough literature review. We aggregate background information by analyzing publications and identifying key research topics. We use the guidelines proposed by Kitchenham et al. (2007) to support a systematic investigation into HDI. A systematic literature review is an organized approach to evaluate and interpret available research relevant to a particular research question, topic area, or phenomenon of interest by using a trustworthy, rigorous, and auditable methodology (Kitchenham et al., 2007). Our research has identified primary studies in the area by determining what has been published on the subject, where, when, and by whom. We detected the most frequently addressed aspects and classified the publications by well-defined criteria.
As a second contribution, we use the lessons learnt from the literature review to propose a set of HDI recommendations for information systems. These recommendations are based on the needs described by the studied publications. We organized them according to key aspects of interactivity with a pervasive computing environment, including data representation, interaction with data, and data processing logic. We validated the applicability of the proposed set by evaluating online systems that demand intensive human-data interaction and have a large amount of data openly available. The evaluation focused on three websites containing data provided by the administration of the city of São Paulo to its citizens. According to the results, the evaluated websites do not fully satisfy the proposed recommendations.
The remainder of this article is organized as follows: in Section 2 we define the HDI concept; in Section 3 we detail the results obtained from the systematic literature review; in Section 4 we propose and evaluate a set of recommendations for HDI in information systems; in Section 5 we compile a series of open research challenges to be addressed in the area; in Section 6 we discuss the findings; and in Section 7 we summarize the conclusions.
Fundamental concepts of human-data interaction
The increasing production, collection, and use of data has motivated several discussions about ways of enabling people to interact with data and understand them. This gave rise to several definitions of the term “Human-Data Interaction” (HDI).
Elmqvist (2011) proposes the term “HDI” by referring to the “human manipulation, analysis, and sensemaking of large, unstructured, and complex datasets.” He suggests an embodied approach to HDI as a way to support these tasks by creating physical
Literature review
In this section we describe the systematic literature review carried out to survey HDI publications. After explaining the protocol defined to guide the procedure (Subsection 3.1), we present the results (Subsection 3.2), followed by a visual analysis that offers a complementary perspective on the HDI area and validates the findings (Subsection 3.3). In addition, we list the major themes that constitute HDI (Subsection 3.4).
Evaluation of HDI recommendations
During the literature review, we identified a lack of practical approaches, requirements, guidelines, or recommendations that could guide the design of information systems focused on HDI. We therefore sought to establish a set of good practices from the literature.
With the aim of practically contributing to design based on previous experiences, we consolidate a set of recommendations derived from experiences reported in the analyzed literature. In order to bring these recommendations to a real-world
Open research challenges of HDI
Supporting humans in their interaction with data still presents complex research challenges, which several investigations in the literature have enumerated. Relying on our thorough survey of the literature and our understanding of the HDI area, we consolidated a series of open research challenges. From our viewpoint, these are research topics that still require further investigation and appropriate methods to address them.
Discussion
In this section we analyze the obtained results, followed by considerations regarding the limitations of our study.
Conclusion
Nowadays, data are collected, tracked, and used to influence decisions. The ability to understand, extract, and update the information contained in such data is important to enable the generation of knowledge in an ethical way. We conceive HDI as the area addressing human manipulation, analysis, and sensemaking of large, unstructured, and complex datasets, understanding their meaning as well as considering stakeholders and the data life cycle phases. The HDI field aims to investigate how
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Annex A – sample references
Alnusair, A., Zhong, C., Rawashdeh, M., Hossain, M.S., Alamri, A., 2017. Context-aware multimodal recommendations of multimedia data in cyber situational awareness. Multimed. Tools Appl. 1–21. https://doi.org/10.1007/s11042-017-4681-2
Babaee, M., Yu, X., Merget, D., Babaeian, A., Rigoll, G., Datcu, M., 2015. Interactive Feature Learning from SAR Image Patches, in: IEEE International Geoscience and Remote Sensing Symposium. pp. 541–54
Becker, T., Curry, E., Jentzsch, A., Palmetshofer, W., 2016.
Acknowledgements
We would like to thank the São Paulo Research Foundation (FAPESP) (Grants #2017/02325-5, #2015/24300-9 and #2015/16528-0).
References (53)
- et al. Static and interactive infographics in daily tasks: a value-in-use and quality of interaction user study. Comput. Hum. Behav. (2017)
- et al. Semi-automatic mapping of geological structures using UAV-based photogrammetric data: an image analysis approach. Comput. Geosci. (2014)
- Graphs in statistical analysis. Am. Stat. (1973)
- Enactive systems & computing mapping the terrain for human-computer interaction research. SEMISH - Semin. Integr. Softw. e Hardware XXXVI Congr. da Soc. Bras. Comput. (2016)
- et al. Interactive feature learning from SAR image patches, in
- et al. Evidence relating to object-oriented software design: a survey. Proc. - 1st Int. Symp. Empir. Softw. Eng. Meas. ESEM (2007)
- Becker, T., Curry, E., Jentzsch, A., Palmetshofer, W., 2016. Cross-sectorial requirements analysis for big data...
- Brasil, 2011. Lei 12.527 de 18 de novembro de...
- et al. Valuable visualization of healthcare information
- et al. Probing interactivity in open data for general practice. An evidence-based approach. VVH@ AVI (2016)
- Using embodied allegories to design gesture suites for human-data interaction
- RFID localization for tangible and embodied multi-user interaction with museum exhibits
- Cognitive cities
- Big Data and Citizen Participation: The Essentials of Privacy and Security
- Searching for music: understanding the discovery, acquisition, processing and organization of music in a domestic setting for design. Pers. Ubiquitous Comput.
- Designing data practices. Interactions
- Regression by eye: estimating trends in bivariate visualizations
- Enabling the new economic actor: personal data regulation and the digital economy
- Conceptual and methodological issues in evaluating multidimensional visualizations for decision support. IEEE Trans. Vis. Comput. Graph.
- Virtual reality: beyond visualisation [accepted for publication]. J. Mol. Biol.
- Big data curation. New Horizons for a Data-Driven Economy - Springer International Publishing
Eliane Zambon Victorelli is a Lecturer at the São Paulo State Technology College, Brazil, and a Ph.D. student in Computer Science at the Institute of Computing of the University of Campinas (UNICAMP), Brazil. She holds a degree in Chemical Engineering and an M.Sc. in Computer Science from UNICAMP. She has been working in the areas of IT Governance, Requirements, Development, and Systems Integration. Her current research interests are focused on Human-Data Interaction.
Julio Cesar dos Reis is an Assistant Professor at the Institute of Computing, University of Campinas (UNICAMP), Brazil. He received a Ph.D. degree in Computer Science in 2014 from the University of Paris-Sud XI (France). Julio holds an M.Sc. degree in Computer Science (2011) and a B.Tech. in Informatics (2008) from UNICAMP. Julio's research interests are focused on Semantic Web, Computational Ontologies, and Human-Computer Interaction.
Heiko Hornung is an Assistant Professor at the Institute of Computing of the University of Campinas (UNICAMP), Brazil. He holds a degree in Business Informatics from the Darmstadt University of Technology, and an M.Sc. and Ph.D. in Computer Science from UNICAMP. His research interests comprise topics such as digital inclusion, interaction design, pragmatic aspects of electronically mediated human-human interaction, universal access to information and knowledge, and participatory design.
Alysson Bolognesi Prado is a Computer Engineer, M.Sc. and Ph.D. in Computer Science, all degrees from the University of Campinas, Brazil. He has been working for the last 15 years on software development. His current research interests are focused on human-computer interaction and sociotechnological interplay, including the instantiation of Actor-Network Theory concepts in pragmatic aspects of design of digital artefacts.