Automatic deception detection in Italian court cases

Fornaciari, Tommaso; Poesio, Massimo

doi:10.1007/s10506-013-9140-4

Published: 21 February 2013

Volume 21, pages 303–340, (2013)

Artificial Intelligence and Law

Tommaso Fornaciari¹ &
Massimo Poesio²

Abstract

Effective methods for evaluating the reliability of statements issued by witnesses and defendants in hearings would be an extremely valuable support to decision-making in court and other legal settings. In recent years, methods relying on stylometric techniques have proven most successful for this task; but few such methods have been tested with language collected in real-life situations of high-stakes deception, and therefore their usefulness outside lab conditions still has to be properly assessed. In this study we report the results obtained by using stylometric techniques to identify deceptive statements in a corpus of hearings collected in Italian courts. The defendants at these hearings were condemned for calumny or false testimony, so the falsity of (some of) their statements is fairly certain. In our experiments we replicated the methods used in previous studies but never before applied to high-stakes data, and tested new methods. We also considered the effect of a number of variables including in particular the homogeneity of the dataset. Our results suggest that accuracy at deception detection clearly above chance level can be obtained with real-life data as well.

Notes

To be precise, Art. 372 reads:

Chiunque, deponendo come testimone innanzi all’Autorità Giudiziaria, afferma il falso o nega il vero, ovvero tace, in tutto o in parte, ciò che sa intorno ai fatti sui quali è interrogato, è punito con la reclusione da due a sei anni.

I.e., this article punishes, with two to six years' imprisonment, anyone who, testifying as a witness before the Judicial Authority, states a falsehood, denies the truth, or withholds all or part of what they know about the facts on which they are questioned.
Specifically, Art. 368 reads:

Chiunque, con denunzia, querela, richiesta o istanza, anche se anonima o sotto falso nome, diretta all’Autorità Giudiziaria o ad altra Autorità che a quella abbia obbligo di riferirne, incolpa di un reato taluno che egli sa innocente, ovvero simula a carico di lui le tracce di un reato, è punito con la reclusione da due a sei anni.

I.e., this article is violated whenever someone tries to shift the blame for a crime onto a person they know to be innocent.
When in doubt, side with the accused.
In particular, until 2005 the hearings were mainly recorded on tapes, which were reused several times once the transcription had been completed. The audio tracks of the earliest hearings are therefore permanently lost. Since 2006 the audio tracks have instead been recorded on CD-ROM, and an attempt to obtain them is in progress.
http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/DecisionTreeTagger.html.
Because our utterances are transcriptions of spoken language, the punctuation marks were inserted by the transcriber. They seemed nevertheless essential to understand the meaning of many utterances, hence their inclusion.
The LIWC for several languages can be obtained from http://www.liwc.net.
“They”, “Passive” and “Formal”, respectively.
Here and in the rest of the paper we indicate the highest accuracy achieved in bold.
“xxxxx” replaces an anonymized token, such as a first name, a surname, or a place name.
http://paleo.di.unipi.it/it/parse.

Acknowledgments

Creating DeCour was a very complex task, and it would not have been possible without the kind collaboration of many people. Many thanks to Dr. Francesco Scutellari, President of the Court of Bologna, to Dr. Heinrich Zanon, President of the Court of Bolzano, to Dr. Francesco Antonio Genovese, President of the Court of Prato, and to Dr. Sabino Giarrusso, President of the Court of Trento.

Author information

Authors and Affiliations

Center for Mind/Brain Sciences, University of Trento, Trento, Italy
Tommaso Fornaciari
School for Computer Science and Electronic Engineering, University of Essex, Colchester, UK
Massimo Poesio

Corresponding author

Correspondence to Tommaso Fornaciari.

About this article

Cite this article

Fornaciari, T., Poesio, M. Automatic deception detection in Italian court cases. Artif Intell Law 21, 303–340 (2013). https://doi.org/10.1007/s10506-013-9140-4

Issue Date: September 2013
DOI: https://doi.org/10.1007/s10506-013-9140-4
