An Annotation Schema for the Detection of Social Bias in Legal Text Corpora

Gumusel, Ece; Malic, Vincent Quirante; Donaldson, Devan Ray; Ashley, Kevin; Liu, Xiaozhong

doi:10.1007/978-3-030-96957-8_17

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 13192)

Included in the following conference series:

International Conference on Information

Abstract

The rapid advancement of artificial intelligence in recent years has led to an increase in its use in legal contexts. At the same time, a growing body of research has expressed concerns that AI trained on large datasets may learn and model undesirable social biases. In this paper, we investigate the extent to which such social biases are inherent in a real-world legal corpus. We train a word2vec word embedding model on case law data and find evidence that NLP methods make undesirable distinctions between legally equivalent entities that vary only by race. Since legal AI applications that model such distinctions risk perpetuating these inequalities when used, we argue that the development of such applications must incorporate a means to detect and mitigate such biases. To this end, we propose an annotation schema that identifies and categorizes deviations from legal equivalence, so that debiasing may be more systematically incorporated into legal AI development. Future directions for research are discussed.
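
As a rough illustration of what probing an embedding for such distinctions might look like, the sketch below compares how strongly two terms that ought to be treated as equivalent associate with a set of attribute words, using cosine similarity. It is not the procedure used in the paper, which this abstract does not specify. The function names, the placeholder terms (`party_a`, `party_b`, `attribute_x`, `attribute_y`), and the three-dimensional vectors are all hypothetical and invented for demonstration; in practice the vectors would come from a model trained on the corpus.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association_gap(embeddings, term_a, term_b, attribute_words):
    """Mean similarity of term_a to the attribute words minus that of term_b.

    A value near zero means both terms relate to the attribute words about
    equally; a large magnitude means the embedding separates them.
    """
    def mean_similarity(term):
        sims = [cosine(embeddings[term], embeddings[w]) for w in attribute_words]
        return sum(sims) / len(sims)

    return mean_similarity(term_a) - mean_similarity(term_b)


# Hypothetical toy vectors, made up for this example only.
# They are not derived from any corpus or from the paper's model.
toy_embeddings = {
    "party_a": [1.0, 0.0, 0.0],
    "party_b": [0.0, 1.0, 0.0],
    "attribute_x": [1.0, 0.0, 0.0],
    "attribute_y": [1.0, 1.0, 0.0],
}

gap = association_gap(
    toy_embeddings, "party_a", "party_b", ["attribute_x", "attribute_y"]
)
print(f"association gap: {gap:.3f}")
```

With real embeddings, a gap that stays well away from zero across many attribute sets would be one possible signal that two nominally equivalent entities are being modeled differently, and so a candidate for closer annotation.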

Acknowledgments

This research is supported by a grant from the Indiana University Racial Justice Research Fund.

Author information

Authors and Affiliations

Indiana University Bloomington, Bloomington, USA
Ece Gumusel, Vincent Quirante Malic, Devan Ray Donaldson & Xiaozhong Liu
University of Pittsburgh, Pittsburgh, USA
Kevin Ashley

Corresponding author

Correspondence to Ece Gumusel.

Editor information

Editors and Affiliations

Humboldt-Universität zu Berlin, Berlin, Germany
Malte Smits

About this paper

Cite this paper

Gumusel, E., Malic, V.Q., Donaldson, D.R., Ashley, K., Liu, X. (2022). An Annotation Schema for the Detection of Social Bias in Legal Text Corpora. In: Smits, M. (eds) Information for a Better World: Shaping the Global Future. iConference 2022. Lecture Notes in Computer Science, vol 13192. Springer, Cham. https://doi.org/10.1007/978-3-030-96957-8_17

DOI: https://doi.org/10.1007/978-3-030-96957-8_17
Published: 23 February 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-96956-1
Online ISBN: 978-3-030-96957-8
eBook Packages: Computer Science, Computer Science (R0)
