Text Classification Feature Extraction Method Based on Deep Learning for Unbalanced Data Sets

  • Conference paper
Advanced Hybrid Information Processing (ADHIP 2020)

Abstract

To support category-based search over text data, this paper proposes a deep-learning feature extraction method for text classification on imbalanced data sets. A stacked autoencoder and a deep belief network are used to give a preliminary definition of the conditions for text semantic categories, which enables semantic text classification with a deep learning algorithm. On this basis, the text parameters are preprocessed and tuned, and dimensionality-reduction criteria for the text features to be extracted are established through the expression of feature behavior. The experimental results show that, with the new feature extraction method, the number of correctly classified documents increases substantially, which meets the practical requirements of classifying and searching text data.
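As a rough illustration of the encoder-based dimensionality reduction on an imbalanced data set that the abstract describes, the following minimal sketch trains a single-layer autoencoder on toy term-presence vectors and weights each document by the inverse frequency of its class. It is not the authors' implementation: the toy data, the `TinyAutoencoder` class, the `balanced_class_weights` helper, the layer sizes, and the learning rate are all assumptions made for illustration, and the paper's stacked encoder and deep belief network are not reproduced here.

```python
import math
import random
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * m) for c, m in counts.items()}


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyAutoencoder:
    """One sigmoid encoder layer and one linear decoder layer (hypothetical)."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.W1 = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.W2 = [[rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
                   for _ in range(n_in)]
        self.b2 = [0.0] * n_in

    def encode(self, x):
        return [sigmoid(sum(w * v for w, v in zip(row, x)) + b)
                for row, b in zip(self.W1, self.b1)]

    def decode(self, h):
        return [sum(w * v for w, v in zip(row, h)) + b
                for row, b in zip(self.W2, self.b2)]

    def loss(self, X, weights):
        """Mean class-weighted squared reconstruction error."""
        total = 0.0
        for x, w in zip(X, weights):
            x_hat = self.decode(self.encode(x))
            total += w * 0.5 * sum((a - b) ** 2 for a, b in zip(x_hat, x))
        return total / len(X)

    def train_step(self, x, w, lr):
        """One stochastic gradient step on a single weighted sample."""
        h = self.encode(x)
        x_hat = self.decode(h)
        d = [w * (a - b) for a, b in zip(x_hat, x)]        # dLoss/dx_hat
        # Backpropagate through the decoder before its weights are updated.
        dh = [sum(d[i] * self.W2[i][j] for i in range(len(d)))
              for j in range(len(h))]
        dz = [dh[j] * h[j] * (1.0 - h[j]) for j in range(len(h))]
        for i in range(len(d)):
            for j in range(len(h)):
                self.W2[i][j] -= lr * d[i] * h[j]
            self.b2[i] -= lr * d[i]
        for j in range(len(h)):
            for k in range(len(x)):
                self.W1[j][k] -= lr * dz[j] * x[k]
            self.b1[j] -= lr * dz[j]


# Toy imbalanced corpus: 6 documents of class 0, 2 of class 1 (made-up data).
X = [
    [1, 1, 0, 0, 0, 0], [1, 0, 1, 0, 0, 0], [1, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0], [1, 1, 0, 1, 0, 0], [1, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1], [0, 0, 0, 0, 1, 1],
]
y = [0, 0, 0, 0, 0, 0, 1, 1]

class_w = balanced_class_weights(y)
sample_w = [class_w[label] for label in y]

model = TinyAutoencoder(n_in=6, n_hidden=2)
initial_loss = model.loss(X, sample_w)
for _ in range(300):
    for x, w in zip(X, sample_w):
        model.train_step(x, w, lr=0.05)
final_loss = model.loss(X, sample_w)

# Two-dimensional features that a downstream classifier could consume.
features = [model.encode(x) for x in X]
print(class_w, initial_loss, final_loss)
```

Weighting the reconstruction error by class is only one way to keep minority-class documents from being ignored during feature learning; resampling the minority class is a common alternative.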

Author information

Corresponding author

Correspondence to Li Lin.

Copyright information

© 2021 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Lin, L., Guo, Sx. (2021). Text Classification Feature Extraction Method Based on Deep Learning for Unbalanced Data Sets. In: Liu, S., Xia, L. (eds) Advanced Hybrid Information Processing. ADHIP 2020. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 347. Springer, Cham. https://doi.org/10.1007/978-3-030-67871-5_29

  • DOI: https://doi.org/10.1007/978-3-030-67871-5_29

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-67870-8

  • Online ISBN: 978-3-030-67871-5

  • eBook Packages: Computer Science, Computer Science (R0)
