Understanding human-data interaction: Literature review and recommendations for design

https://doi.org/10.1016/j.ijhcs.2019.09.004

Highlights

  • An in-depth literature review of the Human-Data Interaction (HDI) area.

  • Description of key research topics and open research challenges.

  • Synthesis of HDI recommendations for design to meet good interactivity.

  • Evaluation of HDI recommendations in Web portals with large amounts of data.

Abstract

The trend of collecting information about human activities to inform and influence actions and decisions poses a series of challenges for analyzing this data deluge. The inability to understand and interact with this amount of data prevents people and organizations from making the most of this information. To investigate how people interact with data, a new area of study called “Human-Data Interaction” (HDI) is emerging. In this article, we conduct a thorough literature review to build a big picture of the subject. We carry out a variety of analyses and visual examinations to understand the characteristics of existing publications, detecting the most frequently addressed research topics and consolidating the research challenges. Based on the HDI needs we found in the analyzed publications, we organize a set of recommendations and evaluate online systems that demand intensive human-data interaction. The results indicate that there are still many open questions in this area, which is maturing with an increasing number of publications in recent years, and that systems with large amounts of openly available data poorly meet the proposed recommendations.

Introduction

In recent years, the world has seen a revolution in the tracking of information generated by the most diverse human activities. People's actions and preferences are being monitored in a myriad of ways. Collected data are used to influence decisions in almost all areas of individual and social life.

The understanding and analysis of large amounts of data play a critical role in everyone's empowerment. In this context, one of the greatest challenges is the complexity of the ecosystem within which data are produced, collected, edited, and used (Hornung et al., 2015). This varies from scenarios in which a single person is the producer, collector, editor, and user of data to scenarios demanding the involvement of many people. This complexity has given rise to questions such as: how to facilitate the understanding, manipulation, analysis, and sensemaking of large amounts of information?

Different fields of study are concerned with issues related to the interaction between people and data. Recently, the area of “Human-Data Interaction” (HDI) has begun to investigate how people interact with data, by analogy with how Human-Computer Interaction (HCI) investigates the relationship between people and computers (Hornung et al., 2015).

Several relevant studies on HDI have attempted to explain the objectives and frontiers of the area (Cafaro, 2012; Elmqvist, 2011; Hornung et al., 2015; Mortier et al., 2013, 2014). They introduced the problem, defined concepts, and presented some open challenges related to the subject. However, at the time of writing this article, we are not aware of an updated and thorough literature review.

As a first contribution of this article, we provide an overview of the HDI research area based on a thorough literature review. We aggregate background information by analyzing publications and identifying key research topics. We use the guidelines proposed by Kitchenham et al. (2007) to support a systematic investigation into HDI. A systematic literature review is an organized approach to evaluate and interpret available research relevant to a particular research question, topic area, or phenomenon of interest by using a trustworthy, rigorous, and auditable methodology (Kitchenham et al., 2007). Our research has identified primary studies on the area by determining what, where, when, and who has published any material regarding the subject. We detected the most frequently approached aspects and classified the publications by well-defined criteria.

As a second contribution, we use the lessons learnt from the literature review to propose a set of HDI recommendations for information systems. These recommendations are based on the needs described by the studied publications. We organized them according to key aspects of interactivity with a pervasive computing environment including data representation, interaction with data, and data processing logic. We validated the applicability of the proposed set by evaluating online systems that demand intensive human-data interaction and have a large amount of data openly available. The evaluation focused on three websites containing data provided by the administration of the city of São Paulo to its citizens. According to the results, the proposed requirements are not fully satisfied.

The remainder of this article is organized as follows: in Section 2 we define the HDI concept; in Section 3 we detail the results obtained from the systematic literature review; in Section 4 we propose and evaluate a set of recommendations for HDI in information systems; in Section 5 we compile a series of open research challenges to be addressed in the area; in Section 6 we discuss the findings; and in Section 7 we summarize the conclusions.

Section snippets

Fundamental concepts of human-data interaction

The increasing production, collection, and use of data has motivated several discussions about ways of enabling interaction with data and the understanding of data. This gave rise to several definitions of the term “Human-Data Interaction” (HDI).

Elmqvist (2011) proposes the term “HDI” by referring to the “human manipulation, analysis, and sensemaking of large, unstructured, and complex datasets.” He suggests an embodied approach to HDI as a way to support these tasks by creating physical

Literature review

In this section we describe the systematic literature review carried out to survey HDI publications. After explaining the protocol defined to guide the procedure (Subsection 3.1), we present the results (Subsection 3.2), followed by a visual analysis that gives a complementary perspective on the HDI area and validates the findings (Subsection 3.3). In addition, we list the major themes that constitute HDI (Subsection 3.4).

Evaluation of HDI recommendations

During the literature review, we identified a lack of practical approaches, requirements, guidelines, or recommendations that could guide the design of information systems focused on HDI. We therefore sought to establish a set of good practices from the literature.

With the aim of practically contributing to design based on previous experiences, we consolidate a set of recommendations derived from experiences reported in the analyzed literature. In order to bring these recommendations to a real-world

Open research challenges of HDI

Supporting humans in their interaction with data still presents complex research challenges, which several investigations in the literature have enumerated. Relying on our thorough survey of the literature and our understanding of the HDI area, we consolidated a series of open research challenges. From our viewpoint, these are research topics that still require further investigation and appropriate methods to overcome them.

Discussion

In this section we analyze the obtained results, followed by considerations regarding the limitations of our study.

Conclusion

Nowadays, data are collected, tracked, and used to influence decisions. The ability to understand, extract, and update the information contained in such data is important to enable the generation of knowledge in an ethical way. We conceive HDI as the area addressing human manipulation, analysis, and sensemaking of large, unstructured, and complex datasets, understanding their meaning as well as considering stakeholders and the data life cycle phases. The HDI field aims to investigate how

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Annex A – sample references

  • Alnusair, A., Zhong, C., Rawashdeh, M., Hossain, M.S., Alamri, A., 2017. Context-aware multimodal recommendations of multimedia data in cyber situational awareness. Multimed. Tools Appl. 1–21. https://doi.org/10.1007/s11042-017-4681-2

  • Babaee, M., Yu, X., Merget, D., Babaeian, A., Rigoll, G., Datcu, M., 2015. Interactive Feature Learning from SAR Image Patches, in: IEEE International Geoscience and Remote Sensing Symposium. pp. 541–54

  • Becker, T., Curry, E., Jentzsch, A., Palmetshofer, W., 2016.

Acknowledgements

We would like to thank the São Paulo Research Foundation (FAPESP) (Grants #2017/02325-5, #2015/24300-9 and #2015/16528-0).

Eliane Zambon Victorelli is a Lecturer at the São Paulo State Technology College, Brazil, and a Ph.D. student in Computer Science at the Institute of Computing of the University of Campinas (UNICAMP), Brazil. She holds a degree in Chemical Engineering and an M.Sc. in Computer Science from UNICAMP. She has been working in the areas of IT Governance, Requirements, Development, and Systems Integration. Her current research interests are focused on Human-Data Interaction.

References (53)

  • A. Locoro et al. Static and interactive infographics in daily tasks: a value-in-use and quality of interaction user study. Comput. Hum. Behav. (2017)
  • Y. Vasuki et al. Semi-automatic mapping of geological structures using UAV-based photogrammetric data: an image analysis approach. Comput. Geosci. (2014)
  • F.J. Anscombe. Graphs in statistical analysis. Am. Stat. (1973)
  • A. Arpetti. Enactive systems & computing mapping the terrain for human-computer interaction research. SEMISH - Semin. Integr. Softw. e Hardware XXXVI Congr. da Soc. Bras. Comput. (2016)
  • M. Babaee et al. Interactive feature learning from SAR image patches, in...
  • J. Bailey et al. Evidence relating to object-oriented software design: a survey. Proc. - 1st Int. Symp. Empir. Softw. Eng. Meas. ESEM (2007)
  • Becker, T., Curry, E., Jentzsch, A., Palmetshofer, W., 2016. Cross-sectorial requirements analysis for big data...
  • Brasil, 2011. Lei 12.527 de 18 de novembro de...
  • F. Cabitza et al. Valuable visualization of healthcare information
  • F. Cabitza et al. Probing interactivity in open data for general practice. An evidence-based approach. VVH@ AVI (2016)
  • Caceffo, R., Moreira, E.A., Bonacin, R., Cesar, J., Carbajal, M.L., Abreu, V.V.D., Camilla, V.L.T., Lombello, L., 2019....
  • F. Cafaro. Using embodied allegories to design gesture suites for human-data interaction
  • F. Cafaro et al. RFID localization for tangible and embodied multi-user interaction with museum exhibits
  • A. Cavoukian et al. Cognitive cities. Big Data and Citizen Participation: The Essentials of Privacy and Security (2016)
  • A. Chamberlain et al. Searching for music: understanding the discovery, acquisition, processing and organization of music in a domestic setting for design. Pers. Ubiquitous Comput. (2016)
  • Chowdhury, S.N., Dhawan, S., 2016. HDI based data ownership model for smart cities, in: international conference on...
  • E.F. Churchill. Designing data practices. Interactions (2016)
  • M. Correll et al. Regression by eye: estimating trends in bivariate visualizations
  • A. Crabtree. Enabling the new economic actor: personal data regulation and the digital economy
  • Crabtree, A., Mortier, R., 2015a. Human data interaction: historical lessons from social studies and CSCW....
  • Crabtree, A., Mortier, R., 2015b. Human data interaction: historical lessons from social studies and CSCW 19–23....
  • Crabtree, A., Mortier, R., 2015c. Human data interaction: historical lessons from social studies and CSCW 19–23....
  • E. Dimara et al. Conceptual and methodological issues in evaluating multidimensional visualizations for decision support. IEEE Trans. Vis. Comput. Graph. (2018)
  • M. El Beheiry et al. Virtual reality: beyond visualisation [accepted for publication]. J. Mol. Biol. (2019)
  • Elmqvist, N., 2011. Embodied human-data interaction. ACM CHI 2011 work. “Embodied interact. Theory Pract. HCI...
  • A. Freitas et al. Big data curation. New Horizons for a Data-Driven Economy - Springer International Publishing (2016)
Julio Cesar dos Reis is an Assistant Professor at the Institute of Computing, University of Campinas (UNICAMP), Brazil. He received a Ph.D. degree in Computer Science in 2014 from the University of Paris-Sud XI (France). Julio holds an M.Sc. degree in Computer Science (2011) and a B.Tech. in Informatics (2008) from UNICAMP. Julio's research interests are focused on Semantic Web, Computational Ontologies, and Human-Computer Interaction.

Heiko Hornung is an Assistant Professor at the Institute of Computing of the University of Campinas (UNICAMP), Brazil. He holds a degree in Business Informatics from the Darmstadt University of Technology, and an M.Sc. and a Ph.D. in Computer Science from UNICAMP. His research interests comprise topics such as digital inclusion, interaction design, pragmatic aspects of electronically mediated human-human interaction, universal access to information and knowledge, and participatory design.

Alysson Bolognesi Prado is a Computer Engineer with an M.Sc. and a Ph.D. in Computer Science, all degrees from the University of Campinas, Brazil. He has been working on software development for the last 15 years. His current research interests are focused on human-computer interaction and sociotechnological interplay, including the instantiation of Actor-Network Theory concepts in pragmatic aspects of the design of digital artefacts.
