Abstract
The Portuguese Commission for Citizenship and Gender Equality holds that equality between men and women is a fundamental principle of the Portuguese Constitution. While court decisions should reflect this principle, a preliminary analysis of decisions in gender violence cases reveals that this is not always the case. Drawing on the extensive literature on subjectivity, modality and bias in Linguistics, in tandem with AI and Natural Language Processing (NLP) techniques, the research proposed in this paper aims to study the linguistic formulations that convey bias in court decisions. The goal is to develop a linguistic model and, subsequently, a tool to automatically detect gender bias in this text genre. A corpus of court sentences on gender violence has been extracted from the public access database of the Portuguese Ministry of Justice (IGFEJ). The corpus can then be manually annotated according to a typology of bias categories and structures. By exploiting the annotated corpus in a supervised machine learning approach that follows the most recent advances in NLP, we aim to deliver the aforementioned tool for biased language detection.
Notes
- 2. Diário da República, Resolution No. 61/2018, of 21 May.
Acknowledgements.
This research is partially supported by CLUP (FCT/UID/LIN/00022/2019), by LIACC (FCT/UID/CEC/0027/2020) and by project DAR-GMINTS (POCI/01/0145/FEDER/031460), funded by Fundação para a Ciência e a Tecnologia (FCT).
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Pinto, A.G., Cardoso, H.L., Duarte, I.M., Warrot, C.V., Sousa-Silva, R. (2020). Biased Language Detection in Court Decisions. In: Analide, C., Novais, P., Camacho, D., Yin, H. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2020. Lecture Notes in Computer Science, vol 12490. Springer, Cham. https://doi.org/10.1007/978-3-030-62365-4_38
DOI: https://doi.org/10.1007/978-3-030-62365-4_38
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-62364-7
Online ISBN: 978-3-030-62365-4
eBook Packages: Computer Science, Computer Science (R0)