DOI: 10.1145/3594315.3594362
research-article

Research on enterprise text classification methods of BiLSTM and CNN based on BERT

Published: 02 August 2023

Abstract

Traditional enterprise text classification methods ignore the context of the text: each word is treated as independent of the others, so semantic information is not represented. Text representation and classification performance are therefore poor, and because feature engineering requires human intervention, generalization is weak. To address the low efficiency and accuracy of enterprise text classification, this paper proposes BBLC-ATT, an enterprise text classification model that combines Bidirectional Encoder Representations from Transformers (BERT), convolutional neural networks (CNN), bi-directional long short-term memory (BiLSTM) networks, and an attention mechanism (Attention). The model first uses BERT to train word vectors and combines the features of CNN and BiLSTM to capture local latent features and contextual information. Second, the feature vectors extracted by the hybrid network layer are fed into a self-attention layer to extract the syntactic and semantic features between words in enterprise text sentences. Finally, the paper compares the BBLC-ATT model with traditional deep learning models in terms of accuracy, precision, recall, and F1 score. The experimental results show that BBLC-ATT outperforms the other models on all evaluation metrics, with accuracy improved by 3.28% to 15.86%.
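
The evaluation metrics named in the abstract can be made concrete with a short sketch. It reads the abstract's metric list as accuracy, precision, recall, and F1, and it assumes macro-averaging over classes; the abstract does not say which averaging the authors used. The function name `classification_metrics` and the class labels in the usage note are hypothetical and are not taken from the paper.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Assumption: macro-averaging (unweighted mean over classes).
    The paper's abstract does not state the averaging scheme.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()

    # Count per-class true positives, false positives, and false negatives.
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    def safe_div(num, den):
        # Define the ratio as 0.0 when the denominator is zero.
        return num / den if den else 0.0

    precisions, recalls, f1s = [], [], []
    for label in labels:
        precision = safe_div(tp[label], tp[label] + fp[label])
        recall = safe_div(tp[label], tp[label] + fn[label])
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(safe_div(2 * precision * recall, precision + recall))

    n_classes = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n_classes,
        "recall": sum(recalls) / n_classes,
        "f1": sum(f1s) / n_classes,
    }
```

Calling `classification_metrics(["finance", "retail"], ["finance", "finance"])` returns a dictionary with the four scores. With imbalanced enterprise categories, macro-averaged scores can differ noticeably from accuracy, which is one reason to report all four.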

Published In

ICCAI '23: Proceedings of the 2023 9th International Conference on Computing and Artificial Intelligence
March 2023
824 pages
ISBN:9781450399029
DOI:10.1145/3594315
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Attention model
  2. BERT model
  3. BiLSTM model
  4. CNN model
  5. Natural language processing
  6. Text content classification

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICCAI 2023
