Barriers to academic data science research in the new realm of algorithmic behaviour modification by digital platforms

Greene, Travis; Martens, David; Shmueli, Galit

doi:10.1038/s42256-022-00475-7

Perspective
Published: 18 April 2022

Nature Machine Intelligence volume 4, pages 323–330 (2022)

Abstract

The era of behavioural big data has created new avenues for data science research, with many new contributions stemming from academic researchers. Yet data controlled by platforms have become increasingly difficult for academics to access. Platforms now routinely use algorithmic behaviour modification techniques to manipulate users’ behaviour, leaving academic researchers further isolated in conducting important data science and computational social science research. This isolation results from researchers’ lack of access to human behavioural data and, crucially, to both the data on machine behaviour that triggers and learns from the human data and the platform’s behaviour modification mechanisms. Given the impact of behaviour modification on individual and societal well-being, we discuss the consequences for data science knowledge creation, and encourage academic data scientists to take on new roles in producing research to promote (1) platform transparency and (2) informed public debate around the social purpose and function of digital platforms.

Fig. 1: Human users’ behavioural data and related machine data used for BMOD and prediction.

Acknowledgements

We thank C. Rudin, F. Provost and T. Evgeniou for their valuable feedback and suggestions.

Author information

Authors and Affiliations

National Tsing Hua University, Institute of Service Science, Hsinchu, Taiwan
Travis Greene & Galit Shmueli
University of Antwerp, Department of Engineering Management, Antwerp, Belgium
David Martens

Authors

Travis Greene
David Martens
Galit Shmueli

Corresponding author

Correspondence to Galit Shmueli.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Machine Intelligence thanks Maytal Saar-Tsechansky and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Greene, T., Martens, D. & Shmueli, G. Barriers to academic data science research in the new realm of algorithmic behaviour modification by digital platforms. Nat Mach Intell 4, 323–330 (2022). https://doi.org/10.1038/s42256-022-00475-7

Received: 22 January 2021
Accepted: 02 March 2022
Published: 18 April 2022
Issue Date: April 2022
DOI: https://doi.org/10.1038/s42256-022-00475-7
