Multi-class Short Text Classification Using Ensemble of Deep Learning Classifier

  • Conference paper
Intelligent Computing & Optimization (ICO 2022)

Abstract

With the substantial growth of e-commerce, social media, and online news portals, there has been a great surge in views expressed through short text. Most of this textual content is unstructured and messy, which makes it impractical and cumbersome for human experts to organize or manipulate. Therefore, developing an automatic short text classification model for low-resource languages, including Bengali, is critical. Moreover, the crucial barriers to classifying short text in Bengali are the unavailability of text corpora, the scarcity of linguistic tools, the limited number of words in each text, and the lack of dependencies between the words. This paper presents a short text classification model using an ensemble of four base deep learning classifiers: Neural Network (NN), Convolutional Neural Network (CNN), Bidirectional Long Short-Term Memory (BiLSTM), and Bidirectional Gated Recurrent Unit (BiGRU). Additionally, a corpus of around 0.13 million Bengali texts is developed for short text classification into six categories (international, national, sports, amusement, technology, and politics). The evaluation results on the developed corpus demonstrate that the proposed method outperforms all the baseline machine learning and deep learning models, obtaining the highest weighted F1-score of \(84.4\%\).
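
The abstract does not state how the four base classifiers are combined, so the following is only a minimal sketch under an assumption: soft voting, where per-class probabilities are averaged across models and the highest average wins. It also shows how a weighted F1-score (per-class F1 weighted by class support) can be computed. The names `soft_vote`, `weighted_f1`, and the probability values are hypothetical and are not taken from the paper.

```python
from collections import Counter

# The six categories named in the abstract.
LABELS = ["international", "national", "sports",
          "amusement", "technology", "politics"]


def soft_vote(prob_rows):
    """Average class probabilities over models; return the winning class index.

    ASSUMPTION: soft voting is used here for illustration only; the paper's
    actual ensemble rule is not given in the abstract.
    """
    n_models = len(prob_rows)
    n_classes = len(prob_rows[0])
    avg = [sum(row[c] for row in prob_rows) / n_models
           for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: avg[c])


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights proportional to class support."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        score += f1 * n / total
    return score


# Hypothetical softmax outputs of the four base models for one headline.
example_probs = [
    [0.50, 0.30, 0.05, 0.05, 0.05, 0.05],  # NN
    [0.20, 0.60, 0.05, 0.05, 0.05, 0.05],  # CNN
    [0.30, 0.50, 0.05, 0.05, 0.05, 0.05],  # BiLSTM
    [0.25, 0.55, 0.05, 0.05, 0.05, 0.05],  # BiGRU
]
predicted_label = LABELS[soft_vote(example_probs)]
print(predicted_label)
```

In this toy input the NN alone would pick the first class, but the averaged probabilities favour the second, which illustrates why an ensemble can correct a single model's error.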

Author information

Correspondence to Mohammed Moshiul Hoque.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this paper

Jannat, M., Hossain, E., Hoque, M.M., Rahaman, M.A. (2023). Multi-class Short Text Classification Using Ensemble of Deep Learning Classifier. In: Vasant, P., Weber, GW., Marmolejo-Saucedo, J.A., Munapo, E., Thomas, J.J. (eds) Intelligent Computing & Optimization. ICO 2022. Lecture Notes in Networks and Systems, vol 569. Springer, Cham. https://doi.org/10.1007/978-3-031-19958-5_45
