
Epistemology Goes AI: A Study of GPT-3’s Capacity to Generate Consistent and Coherent Ordered Sets of Propositions on a Single-Input-Multiple-Outputs Basis

Minds and Machines

Abstract

The more we rely on digital assistants, online search engines, and AI systems to revise our system of beliefs and increase our body of knowledge, the less we are able to resort to some independent criterion, unrelated to further digital tools, in order to assess the epistemic reliability of the outputs they deliver. This raises important questions for epistemology in general and pressing questions for applied epistemology in particular. In this paper, we propose an experimental method for the assessment of GPT-3’s capacity to generate consistent and coherent sets of outputs. When several outputs to one and the same input are very repetitive, they tend to be consistent with each other, that is, they do not contradict each other. But consistency does not make the set of outputs as a whole more informative than the outputs considered individually. We argue that the less informative a set of outputs is, the less coherent it is. We establish a conceptual distinction between consistency and coherence in the light of what some epistemologists refer to as coherence theories of truth and justification. While much attention has been given to GPT-3’s capacity to produce internally coherent individual outputs, we argue, instead, that more attention should be given to its capacity to produce consistent and coherent outputs generated on a single-input-multiple-outputs basis.


Notes

  1. See also Erik Olsson (Olsson, 2011, pp. 258–61). For critical discussion of BonJour’s coherentism, see The Current State of the Coherence Theory: Critical Essays on the Epistemic Theories of Keith Lehrer and Laurence BonJour, with Replies (Bender, 2012).

  2. Some philosophers, such as Paul Grice, might prefer to describe this kind of mismatch between P’ and P not so much in terms of lack of coherence, but as a case in which P’, as an answer to P, is “conversationally unsuitable” (Grice, 1991, p. 26). In this article, we use the concept of coherence (or lack of coherence) to cover also these particular cases in which an output is conversationally unsuitable to the input. This understanding of coherence (or lack of coherence) is well-known in the literature on textlinguistics (Toolan, 2009, pp. 44, 53).

  3. John Austin’s well-known account of the distinction between statements (which constitute a particular use of sentences) and sentences in general appears right at the beginning of How To Do Things With Words (Austin, 2009, pp. 1–2). As early as 1913, though, Adolf Reinach offered a sophisticated (and often underappreciated) account of the distinction between sentences that function as statements and sentences that function, for instance, as questions, or requests, or commands, or promises etc. (Reinach, 1983, pp. 18–28).

  4. We are not claiming that it is fair to assess the performance of the lecturer by assessing the performance of the class as a whole on a single-input-multiple-outputs basis. This issue has been recently addressed, for instance, by Cathy O’Neil (O’Neil, 2016, pp. 3–5).

  5. One reason for using this question in our experiment is that one of the authors has included basically the same question in examination papers in introductory courses to epistemology for both philosophy and law students for more than twenty years, and thus has a clear grasp of the most frequent acceptable and non-acceptable answers to this question. The three authors discussed among themselves the standards of correctness for the answers before proceeding to the individual assessment of the outputs delivered by GPT-3 (please see footnote 6). Another reason for using this question in our experiment is that it addresses the very concepts of coherence and truth, as discussed in this paper.

  6. For each set of outputs, we generated one corresponding MS-Excel spreadsheet named respectively N10T1 (N stands for cardinality = 10; T1 stands for temperature = 1.0, T125 for temperature = 1.25, T15 for temperature = 1.5), N10T125, N10T15, N15T1, N15T125, N15T15, N5T1, N5T125, N5T15. All spreadsheets and the code used to generate and analyse the data are available on the OSF (Open Science Framework, created: 13 June 2022) platform at: https://osf.io/kvswp/?view_only=ef3cd135ed4b4e218b7707f42b251caa. Please see the folder named “outputs” to access the 90 unique outputs GPT-3 generated for the proposed query.

  7. Each author made a personal assessment of the grade. In order to ensure intercoder reliability, the authors initially coded a single set and held a meeting to standardize the coding parameters, including grading and repetitiveness. The authors established that a perfect answer (i.e. one attributed grade 5) should correctly establish that (1) the correspondence theory of truth holds that truth is assessed based on correspondence between reality and a statement; (2) the coherence theory of truth holds that a statement is true if it coheres with other statements accepted as true. After that, the authors did not discuss their grading until all outputs had been assessed individually, so as to avoid influencing each other’s assessment. The authors agreed on a 1-point grade decrease for answers that did not account correctly for the conceptual distinction between consistency and coherence. In this process, the authors did not consider sentences that remained incomplete due to being cut to fit the preestablished maximum number of tokens.

  8. We used the all-distilroberta-v1 model offered by HuggingFace and available at: https://huggingface.co/sentence-transformers/all-distilroberta-v1. This model was trained on a diverse corpus of English text available online, which included Wikipedia, news articles, and the dataset used to train GPT-2 (GPT-3’s predecessor).

  9. This is defined as: \(1-\cos(\text{output}_{1}, \text{output}_{2})\). Thus, the cosine distance is 0 when the two outputs are equal (cosine equals 1), and 2, the maximum, when the angle between them is 180 degrees (cosine equals −1).

  10. The authors would like to thank Dario Teixeira (Federal University of the State of Rio de Janeiro) and Fabio Shecaira (Federal University of Rio de Janeiro) for comments on a previous draft. The authors would also like to thank two anonymous reviewers for their insightful comments and invaluable suggestions on previous versions of the manuscript.
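Two of the notes above lend themselves to a short illustration: the spreadsheet naming scheme in note 6 and the cosine distance defined in note 9. The following is only a minimal sketch under stated assumptions, not the authors' code (which is hosted on OSF). The function names `sheet_name` and `cosine_distance` are hypothetical, and the toy two-dimensional vectors stand in for the sentence embeddings that the model in note 8 would produce.

```python
import math
from itertools import product

# Values taken from note 6: three cardinalities and three temperatures.
CARDINALITIES = (5, 10, 15)
TEMPERATURES = (1.0, 1.25, 1.5)


def sheet_name(n, t):
    """Build a name such as N10T125 (hypothetical helper mirroring note 6)."""
    # Drop the decimal point and trailing zeros: 1.0 -> "1", 1.25 -> "125".
    label = str(t).replace(".", "").rstrip("0")
    return f"N{n}T{label}"


def cosine_distance(u, v):
    """Return 1 - cos(u, v) for two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


# One spreadsheet per (cardinality, temperature) pair.
names = [sheet_name(n, t) for n, t in product(CARDINALITIES, TEMPERATURES)]
# Each set contributes as many outputs as its cardinality.
total_outputs = sum(n for n, _ in product(CARDINALITIES, TEMPERATURES))

# Toy vectors (assumptions, not real embeddings).
same = cosine_distance([1.0, 2.0], [1.0, 2.0])          # identical direction
opposite = cosine_distance([1.0, 2.0], [-1.0, -2.0])    # 180 degrees apart
orthogonal = cosine_distance([1.0, 0.0], [0.0, 1.0])    # 90 degrees apart

print(names, total_outputs)
print(same, opposite, orthogonal)
```

On this reading, the nine sets account for the 90 outputs mentioned in note 6, and the three toy distances fall at the bottom, top, and midpoint of the range described in note 9, up to floating-point rounding.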

References

  • Assael, Y. (2022). ‘Restoring and attributing ancient texts using deep neural networks’, Nature, 603(7900), pp. 280–283. https://doi.org/10.1038/s41586-022-04448-z

  • Austin, J. (2009). How to Do Things with Words. Edited by J.O. Urmson. Cambridge, Mass: Harvard Univ. Press.

  • Bellert, I. (1970). ‘On a condition of the coherence of texts’, Semiotica, 2(4). https://doi.org/10.1515/semi.1970.2.4.335

  • Bender, J. (Ed.). (2012). The Current State of the Coherence Theory: Critical Essays on the Epistemic Theories of Keith Lehrer and Laurence BonJour, with Replies. Dordrecht: Kluwer. https://search.ebscohost.com/login.aspx?direct=true&scope=site&db=nlebk&db=nlabk&AN=2849508 (Accessed: 12 June 2022).

  • Beta Writer, & Schoenenberger, H. (2019). Lithium-ion batteries: A machine-generated Summary of Current Research. Springer Berlin Heidelberg.

  • BonJour, L. (1985). The structure of empirical knowledge. Harvard University Press.

  • ChrisHMSFT and mrbullwinkle (2022). How to generate text with Azure OpenAI - Azure OpenAI. https://docs.microsoft.com/en-us/azure/cognitive-services/openai/how-to/completions (Accessed: 26 August 2022).

  • Coady, D. (2016). ‘Applied Epistemology’, in K. Lippert-Rasmussen, K. Brownlee, and D. Coady (Eds.) A Companion to Applied Philosophy. Chichester: John Wiley & Sons, pp. 49–60. https://doi.org/10.1002/9781118869109.ch4

  • Coady, D., & Fricker, M. (2017). ‘Introduction to special issue on applied epistemology’, Journal of Applied Philosophy, 34(2), pp. 153–156. https://doi.org/10.1111/japp.12207

  • Copi, I. M. (1986). Introduction to Logic (7th ed.). London: Macmillan.

  • de Araujo, M. (2006). Descartes on mathematical truths: Coherence and correspondence in the refutation of scepticism. History of Philosophy Quarterly, 23, 319–337.

  • Dehouche, N. (2021). ‘Plagiarism in the age of massive Generative Pre-trained Transformers (GPT-3)’, Ethics in Science and Environmental Politics, 21, pp. 17–23. https://doi.org/10.3354/esep00195

  • Elgin, C. Z. (2020). Epistemic virtues in understanding. In H. Battaly (Ed.), The Routledge Handbook of Virtue Epistemology (pp. 330–399). Routledge.

  • Elkins, K., & Chun, J. (2020). ‘Can GPT-3 pass a writer’s Turing Test?’, Journal of Cultural Analytics [Preprint]. https://doi.org/10.22148/001c.17212

  • Floridi, L., & Chiriatti, M. (2020). ‘GPT-3: Its Nature, Scope, Limits, and Consequences’, Minds and Machines, 30(4), pp. 681–694. https://doi.org/10.1007/s11023-020-09548-1

  • George, A., & Walsh, T. (2022). ‘Artificial intelligence is breaking patent law’, Nature, 605(7911), pp. 616–618. https://doi.org/10.1038/d41586-022-01391-x

  • Grice, H. P. (1991). Logic and conversation. Studies in the way of words (pp. 22–40). Harvard University Press.

  • Guarino, S. (2020). ‘Beyond fact-checking: network analysis tools for monitoring disinformation in social media’, in Complex Networks and Their Applications VIII. Cham: Springer International Publishing (Studies in Computational Intelligence), pp. 436–447. https://doi.org/10.1007/978-3-030-36687-2_36

  • Gunn, H. K., & Lynch, M. P. (2019). Googling. In D. Coady, & J. Chase (Eds.), The Routledge Handbook of Applied Epistemology (pp. 41–53). Routledge, Taylor & Francis Group (Routledge handbooks in philosophy).

  • Jumper, J. (2021). ‘Highly accurate protein structure prediction with AlphaFold’, Nature, 596(7873), pp. 583–589. https://doi.org/10.1038/s41586-021-03819-2

  • Kahneman, D., Sibony, O., & Sunstein, C. R. (2021). Noise: A flaw in Human Judgment. William Collins.

  • Lackey, J. (Ed.). (2021). Applied Epistemology. Oxford University Press.

  • Leibig, C. (2022). ‘Combining the strengths of radiologists and AI for breast cancer screening: a retrospective analysis’, The Lancet Digital Health, 4(7), pp. e507–e519. https://doi.org/10.1016/S2589-7500(22)00070-X

  • Lemmon, E. J. (1994). Beginning logic. Chapman & Hall.

  • Litjens, G. (2017). ‘A survey on deep learning in medical image analysis’, Medical Image Analysis, 42, pp. 60–88. https://doi.org/10.1016/j.media.2017.07.005

  • Liu, Y. (2019). ‘RoBERTa: A Robustly Optimized BERT Pretraining Approach’. https://doi.org/10.48550/ARXIV.1907.11692

  • Maynez, J. (2020). ‘On faithfulness and factuality in abstractive summarization’, arXiv, p. 14. https://doi.org/10.48550/ARXIV.2005.00661

  • O’Neil, C. (2016). Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. First edition. New York: Crown.

  • Olsson, E. J. (2011). Coherentism. In S. Bernecker, & D. Pritchard (Eds.), The Routledge Companion to Epistemology (pp. 257–267). Routledge (Routledge philosophy companions).

  • Plisson, J., Lavrac, N., & Mladenic, D. (2004). ‘A rule based approach to word lemmatization’, in Proceedings of IS04.

  • Quine, W. V. (1987). Quiddities: An intermittently philosophical Dictionary. Belknap Press of Harvard Univ.

  • Reinach, A. (1983). The apriori Foundations of the Civil Law. Along with the lecture ‘Concerning Phenomenology’. Translated by J. Crosby. Texas: Aletheia.

  • Rescher, N. (1973). The coherence theory of Truth. University Press of America.

  • Sanh, V. (2020). ‘DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter’. arXiv. http://arxiv.org/abs/1910.01108 (Accessed: 11 June 2022).

  • Toolan, M. (2009). ‘Coherence’, in P. Hühn et al. (Eds.) Handbook of Narratology. Walter de Gruyter, pp. 44–62. https://doi.org/10.1515/9783110217445

  • van Dijk, T. A. (1973). ‘Text grammar and text logic’, in J.S. Petöfi and H. Rieser (Eds.) Studies in Text Grammar. Dordrecht: Springer Netherlands. https://doi.org/10.1007/978-94-010-2636-9

  • Yule, G. (1985). The study of Language: An introduction. Cambridge.

  • Zagzebski, L. T. (1996). Virtues of the mind: An Inquiry into the nature of Virtue and the ethical foundations of knowledge. Cambridge University Press.

  • Ziem, A., & Schwerin, C. (2014). Frames of understanding in text and discourse: Theoretical foundations and descriptive applications. John Benjamins Publishing Company. (volume 48).

Funding

The corresponding author benefited from financial support provided by FAPERJ (Research Support Foundation of the State of Rio de Janeiro, Brazil, Grant No. 202.643/2019) and CNPq (National Council for Scientific and Technological Development, Brazil, Grant No. 305,050/2018–4).

Author information

Correspondence to Marcelo de Araujo.

Ethics declarations

Conflict of Interest

The authors declare no conflict of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

de Araujo, M., de Almeida, G. & Nunes, J.L. Epistemology Goes AI: A Study of GPT-3’s Capacity to Generate Consistent and Coherent Ordered Sets of Propositions on a Single-Input-Multiple-Outputs Basis. Minds & Machines 34, 2 (2024). https://doi.org/10.1007/s11023-024-09660-6

  • DOI: https://doi.org/10.1007/s11023-024-09660-6
