DOI: 10.1145/3543873.3587578
research-article

A Chinese Fine-grained Financial Event Extraction Dataset

Published: 30 April 2023

ABSTRACT

Existing datasets are mostly composed of official documents, statements, news articles, and similar sources, and little attention has so far been paid to the numerals in financial social comments. This paper therefore presents CFinNumAttr, a Chinese financial numeral attribute dataset built by annotating stock reviews and comments collected from a social networking platform. We also conduct several experiments on CFinNumAttr with state-of-the-art methods to assess the importance of financial numeral attributes. The experimental results show that the numeral attributes in social reviews and comments carry rich semantic information, and that the numeral clue extraction and attribute classification tasks can substantially improve financial text understanding.



    • Published in

      WWW '23 Companion: Companion Proceedings of the ACM Web Conference 2023
      April 2023
      1567 pages
      ISBN: 9781450394192
      DOI: 10.1145/3543873

      Copyright © 2023 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 30 April 2023


      Qualifiers

      • research-article
      • Research
      • Refereed limited

      Acceptance Rates

      Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%
