Detection of Potentially Non-compliant Clauses in Online ToS in Portuguese

  • Conference paper
Progress in Artificial Intelligence (EPIA 2024)

Abstract

Online terms of service (ToS) create an imbalance in the contractual relationship, as supplier companies impose a series of clauses on their consumers. Previous studies have shown that it is possible to detect clauses that are potentially non-compliant with European consumer legislation. However, the work carried out to date has largely focused on European legislation and English-language documents. In this work, we present an annotation guideline that maps Brazilian consumer legislation into 10 categories and 3 levels of potential compliance. We also introduce a corpus in Portuguese, with clauses annotated according to the guideline. We analyze the performance of a classifier trained on our corpus and obtain results similar to those of initial studies in English for two tasks: detecting potentially non-compliant clauses and categorizing them. The results highlight a promising path toward methods capable of analyzing ToS in Portuguese, which can be replicated in other fields of Consumer Law.

Notes

  1. Corpus available at https://doi.org/10.5281/zenodo.12702424.

  2. https://spacy.io.

Acknowledgement

This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001. We would like to thank Instituto Lawgorithm for partially supporting our research. We also thank the USP-AWS agreement for the cloud environment support in developing our model.

Author information

Corresponding author

Correspondence to Matheus Tocchini.

Ethics declarations

Disclosure of Interests

The authors have no competing interests to declare that are relevant to the content of this article.

Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Tocchini, M. et al. (2025). Detection of Potentially Non-compliant Clauses in Online ToS in Portuguese. In: Santos, M.F., Machado, J., Novais, P., Cortez, P., Moreira, P.M. (eds) Progress in Artificial Intelligence. EPIA 2024. Lecture Notes in Computer Science, vol 14967. Springer, Cham. https://doi.org/10.1007/978-3-031-73497-7_23

  • DOI: https://doi.org/10.1007/978-3-031-73497-7_23

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-73496-0

  • Online ISBN: 978-3-031-73497-7

  • eBook Packages: Computer Science, Computer Science (R0)
