BSIL: A Brain Storm-Based Framework for Imbalanced Text Classification

  • Conference paper
Natural Language Processing and Chinese Computing (NLPCC 2019)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11839)

Abstract

Neural networks are not always effective on imbalanced datasets in text classification, because most of them are designed under the assumption of balanced classes. In this paper, we present BSIL, a novel framework built on brain storm optimization (BSO) that improves the capability of neural networks in imbalanced text classification. In BSIL, BSO's simulation of the human brainstorming process is used to sample imbalanced datasets in a reasonable way. First, we present an approach that generates multiple relatively balanced subsets of an imbalanced dataset by applying scrambling segmentation and global random sampling. Second, we introduce a parallel method to train a classifier for each subset efficiently. Finally, we propose a decision-making layer that accepts the “suggestions” of all classifiers in order to achieve the most reliable prediction result. The experimental results show that CNN, RNN and self-attention models combined with BSIL perform better than the same models alone in imbalanced text classification.
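The three steps named in the abstract (balanced subsets, one classifier per subset, a decision layer) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify how scrambling segmentation, global random sampling, or the decision-making layer work, so the chunking rule, the plain majority vote, and the toy word-count classifier standing in for CNN/RNN models are all assumptions, and every function name below is hypothetical.

```python
import random
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def balanced_subsets(samples, labels, seed=0):
    """Build relatively balanced subsets (assumed scheme, not from the paper).

    Each class is shuffled; classes large enough are cut into disjoint
    chunks, smaller classes are randomly sampled for every subset.
    """
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    smallest = min(len(xs) for xs in by_class.values())
    largest = max(len(xs) for xs in by_class.values())
    n_subsets = max(1, largest // smallest)
    subsets = [[] for _ in range(n_subsets)]
    for y, xs in by_class.items():
        xs = list(xs)
        rng.shuffle(xs)
        for i, subset in enumerate(subsets):
            if len(xs) >= n_subsets * smallest:
                chunk = xs[i * smallest:(i + 1) * smallest]
            else:
                chunk = rng.sample(xs, smallest)
            subset.extend((x, y) for x in chunk)
    return subsets


def train_word_vote(subset):
    """Toy stand-in for a neural classifier: count labels per word."""
    counts = {}
    for text, y in subset:
        for word in text.split():
            counts.setdefault(word, Counter())[y] += 1
    default = Counter(y for _, y in subset).most_common(1)[0][0]

    def predict(text):
        votes = Counter()
        for word in text.split():
            if word in counts:
                votes.update(counts[word])
        return votes.most_common(1)[0][0] if votes else default

    return predict


def train_all(subsets, train):
    """Train one classifier per subset concurrently."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(train, subsets))


def decide(classifiers, text):
    """Assumed decision layer: simple majority vote over all classifiers."""
    votes = Counter(clf(text) for clf in classifiers)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    texts = [f"good film {i}" for i in range(9)] + [f"bad film {i}" for i in range(3)]
    labels = ["pos"] * 9 + ["neg"] * 3
    classifiers = train_all(balanced_subsets(texts, labels), train_word_vote)
    print(decide(classifiers, "bad plot"))
```

With a 9:3 class ratio, the sketch produces three subsets that each hold three samples per class, so every classifier sees a balanced view of the data while the majority class is still fully covered across the ensemble.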

Acknowledgments

This work is supported by the National Key Research and Development Program of China (2017YFB1401200, 2017YFC0908401) and the National Natural Science Foundation of China (61672377). Xiaowang Zhang is supported by the Peiyang Young Scholars in Tianjin University (2019XRX-0032).

Author information

Corresponding author

Correspondence to Xiaowang Zhang.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Tian, J., Chen, S., Zhang, X., Feng, Z. (2019). BSIL: A Brain Storm-Based Framework for Imbalanced Text Classification. In: Tang, J., Kan, MY., Zhao, D., Li, S., Zan, H. (eds) Natural Language Processing and Chinese Computing. NLPCC 2019. Lecture Notes in Computer Science, vol 11839. Springer, Cham. https://doi.org/10.1007/978-3-030-32236-6_5

  • DOI: https://doi.org/10.1007/978-3-030-32236-6_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-32235-9

  • Online ISBN: 978-3-030-32236-6

  • eBook Packages: Computer Science, Computer Science (R0)
