Banking Regulation Classification in Portuguese

  • Conference paper
Computational Processing of the Portuguese Language (PROPOR 2022)

Abstract

Many products, services, and other things in life meet a quality standard, are inclusive, or avoid harming customers because regulations imposed on their manufacturers or providers require it. The same kind of requirement exists in the finance sector. Governments, international agencies, or civil institutions are responsible for creating, applying, and inspecting these regulations. Regulators at every level (federal, state, and municipal) constantly demand changes in the finance sector so that it adequately meets current needs. This paper presents the ongoing evolution of a banking compliance application in Brazil. The application classifies regulatory documents published by more than 100 Brazilian regulators as relevant or irrelevant to the business of each of more than 40 departments of Banco do Brasil. It uses a hybrid strategy that combines machine learning and rules, treating the task as a separate binary classification problem for each department. This work also presents a particular type of corpus imbalance called The Imbalance Within Class.
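The hybrid strategy described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the tiny naive Bayes model stands in for whatever learner the application uses, and the department name `credit`, the training sentences, and the keyword rules are hypothetical. The sketch only shows the general shape: one binary classifier per department, with hand-written rules that take precedence over the model.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


class TinyNaiveBayes:
    """Multinomial naive Bayes with add-one smoothing; a stand-in model."""

    def fit(self, docs, labels):
        self.counts = {True: Counter(), False: Counter()}
        self.n_docs = Counter(labels)
        for doc, label in zip(docs, labels):
            self.counts[label].update(tokenize(doc))
        self.vocab = set(self.counts[True]) | set(self.counts[False])
        return self

    def _log_score(self, doc, label):
        total = sum(self.counts[label].values())
        score = math.log(self.n_docs[label] / sum(self.n_docs.values()))
        for tok in tokenize(doc):
            if tok in self.vocab:  # ignore words never seen in training
                smoothed = (self.counts[label][tok] + 1) / (total + len(self.vocab))
                score += math.log(smoothed)
        return score

    def predict(self, doc):
        """Return True when 'relevant' scores higher than 'irrelevant'."""
        return self._log_score(doc, True) > self._log_score(doc, False)


# Hypothetical labelled documents for one hypothetical department.
TRAINING = {
    "credit": [
        ("resolução altera limites de crédito rural", True),
        ("circular define taxa de juros do crédito consignado", True),
        ("portaria nomeia servidor municipal", False),
        ("edital de licitação para obras municipais", False),
    ],
}

# Hypothetical keyword rules; a match overrides the model's decision.
RULES = {
    "credit": {"relevant": {"consignado"}, "irrelevant": {"licitação"}},
}

# One binary model per department.
MODELS = {
    dept: TinyNaiveBayes().fit([d for d, _ in rows], [y for _, y in rows])
    for dept, rows in TRAINING.items()
}


def classify(dept, doc):
    """Apply the department's rules first, then fall back to its model."""
    tokens = set(tokenize(doc))
    if tokens & RULES[dept]["relevant"]:
        return True
    if tokens & RULES[dept]["irrelevant"]:
        return False
    return MODELS[dept].predict(doc)
```

In this sketch, `classify("credit", "licitação de crédito rural")` returns `False` because the hypothetical `licitação` rule fires, even though the model alone would label the same text relevant. Checking rules before the model is a design choice of the sketch; the abstract does not say how the application orders or combines the two.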

Notes

  1. http://www.nltk.org/howto/portuguese_en.html

  2. https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html

  3. https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html

  4. https://simpletransformers.ai/

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

de Azevedo, R.F. et al. (2022). Banking Regulation Classification in Portuguese. In: Pinheiro, V., et al. Computational Processing of the Portuguese Language. PROPOR 2022. Lecture Notes in Computer Science, vol 13208. Springer, Cham. https://doi.org/10.1007/978-3-030-98305-5_13

  • DOI: https://doi.org/10.1007/978-3-030-98305-5_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-98304-8

  • Online ISBN: 978-3-030-98305-5

  • eBook Packages: Computer Science, Computer Science (R0)
