Abstract
Online terms of use (ToS) create an imbalance in the contractual relationship, as supplier companies impose a series of clauses on their consumers. Previous studies have shown that it is possible to detect clauses that are potentially non-compliant with European consumer legislation. However, the work carried out to date has largely focused on European legislation and English-language documents. In this work, we present an annotation guideline that maps Brazilian consumer legislation into 10 categories and 3 levels of potential compliance. We also introduce a corpus in Portuguese, with clauses annotated according to the guideline. We analyze the performance of a classifier trained on our corpus and obtain results similar to those of initial studies in English for two tasks: detecting potentially non-compliant clauses and categorizing them. The results of our work highlight a promising path toward methods capable of analyzing ToS in Portuguese, which can be replicated in other fields of Consumer Law.
Notes
- 1. Corpus available at https://doi.org/10.5281/zenodo.12702424.
- 2.
Acknowledgement
This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001. We would like to thank Instituto Lawgorithm for partially supporting our research. We also thank the USP-AWS agreement for the cloud environment that supported the development of our model.
Ethics declarations
Disclosure of Interests
The authors have no competing interests to declare that are relevant to the content of this article.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Tocchini, M. et al. (2025). Detection of Potentially Non-compliant Clauses in Online ToS in Portuguese. In: Santos, M.F., Machado, J., Novais, P., Cortez, P., Moreira, P.M. (eds) Progress in Artificial Intelligence. EPIA 2024. Lecture Notes in Computer Science(), vol 14967. Springer, Cham. https://doi.org/10.1007/978-3-031-73497-7_23
DOI: https://doi.org/10.1007/978-3-031-73497-7_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-73496-0
Online ISBN: 978-3-031-73497-7
eBook Packages: Computer Science, Computer Science (R0)