Biased Language Detection in Court Decisions

  • Conference paper
  • Published in: Intelligent Data Engineering and Automated Learning – IDEAL 2020 (IDEAL 2020)

Abstract

The Portuguese Commission for Citizenship and Gender Equality holds that equality between men and women is a fundamental principle of the Portuguese Constitution. While court decisions should reflect this principle, a preliminary analysis of gender violence cases reveals that this is not always so. Drawing on the extensive literature on subjectivity, modality and bias in Linguistics, in tandem with AI and Natural Language Processing (NLP) techniques, the research proposed in this paper studies the linguistic formulations that convey bias in court decisions. The goal is to develop a linguistic model and, subsequently, a tool that automatically detects gender bias in this text genre. A corpus of court decisions on gender violence has been extracted from the public-access database of the Portuguese Ministry of Justice (IGFEJ); it can then be manually annotated according to a typology of biased categories and structures. By exploiting the annotated corpus in a supervised machine learning approach that follows the most recent advances in NLP, we aim to deliver the aforementioned tool for biased language detection.
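The supervised setting described in the abstract can be illustrated with a minimal sketch. This is not the authors' model: it assumes a plain bag-of-words Naive Bayes classifier, two hypothetical labels (`biased`, `neutral`), and a handful of invented English stand-in sentences in place of the annotated Portuguese corpus. The names `TRAIN`, `tokenize`, `train` and `predict` are all illustrative.

```python
import math
from collections import Counter, defaultdict

# Invented toy examples (NOT from the paper's corpus). A real system would use
# manually annotated Portuguese court decisions and a richer label typology.
TRAIN = [
    ("the victim provoked the defendant with her behaviour", "biased"),
    ("her conduct as a wife explains the aggression", "biased"),
    ("the defendant struck the victim twice", "neutral"),
    ("the court heard three witnesses on monday", "neutral"),
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def train(examples):
    """Count labels and per-label word frequencies for multinomial Naive Bayes."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-posterior (add-one smoothing)."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # ignore words never seen in training
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAIN)
print(predict(model, "her behaviour provoked the aggression"))
print(predict(model, "the court heard the witnesses"))
```

A classifier this simple only picks up surface vocabulary; the linguistic typology and recent NLP models mentioned in the abstract would replace both the features and the learner.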

Notes

  1. https://www.cig.gov.pt/a-cig/missao/

  2. Diário da República, Resolution No. 61/2018, of 21 May.

  3. https://www.cig.gov.pt/documentacao-de-referencia/doc/portugal-mais-igual/

  4. http://www.dgsi.pt/

  5. https://genderbiasnlp.talp.cat/gebnlp2020/shared-task/

  6. https://pan.webis.de/semeval19/semeval19-web/

  7. http://doraemon.iis.sinica.edu.tw/emotionlines/challenge.html

Acknowledgements.

This research is partially supported by CLUP (FCT/UID/LIN/00022/2019), by LIACC (FCT/UID/CEC/0027/2020) and by project DAR-GMINTS (POCI/01/0145/FEDER/031460), funded by Fundação para a Ciência e a Tecnologia (FCT).

Author information

Correspondence to Henrique Lopes Cardoso.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Pinto, A.G., Cardoso, H.L., Duarte, I.M., Warrot, C.V., Sousa-Silva, R. (2020). Biased Language Detection in Court Decisions. In: Analide, C., Novais, P., Camacho, D., Yin, H. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2020. IDEAL 2020. Lecture Notes in Computer Science, vol 12490. Springer, Cham. https://doi.org/10.1007/978-3-030-62365-4_38

  • DOI: https://doi.org/10.1007/978-3-030-62365-4_38

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-62364-7

  • Online ISBN: 978-3-030-62365-4

  • eBook Packages: Computer Science, Computer Science (R0)
