DOI: 10.1145/3570991.3571067
short-paper

FLUEnT: Financial Language Understandability Enhancement Toolkit

Published: 04 January 2023

Abstract

Over the years, promising returns have enticed the masses to invest in the stock markets. However, most people lack the financial knowledge needed to make investment decisions, and even seasoned investors find it difficult to grasp all the available information. This is primarily due to ever-changing market dynamics and information overload. Automated systems based on Natural Language Processing can help with such problems. In this paper, we present the Financial Language Understandability Enhancement Toolkit (FLUEnT) for processing financial text. It consists of eight tools for tasks such as hypernym detection, numeral claim analysis, readability assessment, and sustainability assessment. The objective of the toolkit is to empower the masses and enable investors to make data-driven decisions. It is open-source under the MIT license and is openly accessible from Colab and HuggingFace.


Cited By

  • (2024) Demystifying Financial Texts Using Natural Language Processing. Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, pp. 5451-5454. DOI: 10.1145/3627673.3680258. Online publication date: 21-Oct-2024

Published In

CODS-COMAD '23: Proceedings of the 6th Joint International Conference on Data Science & Management of Data (10th ACM IKDD CODS and 28th COMAD)
January 2023
357 pages
ISBN: 9781450397971
DOI: 10.1145/3570991
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. financial text processing
  2. natural language processing
  3. toolkit

Qualifiers

  • Short-paper
  • Research
  • Refereed limited

Conference

CODS-COMAD 2023

Acceptance Rates

Overall Acceptance Rate 197 of 680 submissions, 29%

Article Metrics

  • Downloads (Last 12 months): 41
  • Downloads (Last 6 weeks): 1
Reflects downloads up to 19 Feb 2025
