Data science ethical considerations: a systematic literature review and proposed project framework

  • Original Paper
  • Ethics and Information Technology

Abstract

Data science, and the related field of big data, is an emerging discipline involving the analysis of data to solve problems and develop insights. This rapidly growing domain promises many benefits to both consumers and businesses. However, the use of big data analytics can also raise many ethical concerns, stemming from, for example, the possible loss of privacy or harm to a subgroup of the population caused by a classification algorithm. To help address these potential ethical challenges, this paper maps and describes the main ethical themes identified via a systematic literature review. It then proposes a possible structure for integrating these themes within a data science project, thus helping to organize the ongoing debate about the ethical situations that can arise when using data science analytics.

Author information

Corresponding author

Correspondence to Jeffrey S. Saltz.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Saltz, J.S., Dewar, N. Data science ethical considerations: a systematic literature review and proposed project framework. Ethics Inf Technol 21, 197–208 (2019). https://doi.org/10.1007/s10676-019-09502-5

  • DOI: https://doi.org/10.1007/s10676-019-09502-5
