Text Classification Feature Extraction Method Based on Deep Learning for Unbalanced Data Sets

Lin, Li; Guo, Shu-xin

doi:10.1007/978-3-030-67871-5_29


Part of the book series: Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering (LNICST, volume 347)

Included in the following conference series:

International Conference on Advanced Hybrid Information Processing


Abstract

To support category-based search over text data, this paper proposes a deep-learning-based feature extraction method for text classification on imbalanced data sets. A stacked autoencoder and a deep belief network are used to give a preliminary definition of the text semantic category conditions, and semantic classification of the text is then carried out with a deep learning algorithm. On this basis, the text parameters are preprocessed and tuned, and a dimensionality-reduction standard for the text features of the data set is established by expressing the feature behavior. Experimental results show that, with the new feature extraction method, the number of correctly classified documents increases substantially, which meets the practical requirements of classifying and searching text data.
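
The abstract names a stacked autoencoder as one building block but gives no architecture or training details. As a rough illustration only, and not the authors' method, the sketch below shows the general idea of greedy layer-wise pretraining: one small autoencoder is trained on the input vectors, a second one is trained on the first one's codes, and the final codes serve as reduced-dimension features. The class name `TinyAutoencoder`, the layer sizes, the learning rate, and the toy bag-of-words vectors are all assumptions made for this example. The toy data is imbalanced only in its class counts; the sketch does nothing to correct for that.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyAutoencoder:
    """Hypothetical one-hidden-layer autoencoder with tied weights, trained by SGD."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.n_in, self.n_hidden = n_in, n_hidden
        self.W = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                  for _ in range(n_hidden)]
        self.b_hid = [0.0] * n_hidden
        self.b_out = [0.0] * n_in

    def encode(self, x):
        return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.W, self.b_hid)]

    def decode(self, h):
        return [sigmoid(sum(self.W[j][i] * h[j] for j in range(self.n_hidden))
                        + self.b_out[i])
                for i in range(self.n_in)]

    def loss(self, data):
        """Squared reconstruction error, averaged over samples."""
        total = 0.0
        for x in data:
            r = self.decode(self.encode(x))
            total += sum((ri - xi) ** 2 for ri, xi in zip(r, x))
        return total / len(data)

    def train(self, data, epochs=300, lr=0.5):
        for _ in range(epochs):
            for x in data:
                h = self.encode(x)
                r = self.decode(h)
                # Backpropagate 0.5 * squared error through both sigmoid layers.
                d_out = [(ri - xi) * ri * (1.0 - ri) for ri, xi in zip(r, x)]
                d_hid = [sum(self.W[j][i] * d_out[i] for i in range(self.n_in))
                         * h[j] * (1.0 - h[j]) for j in range(self.n_hidden)]
                for j in range(self.n_hidden):
                    for i in range(self.n_in):
                        # Tied weights: decoder and encoder gradients add up.
                        self.W[j][i] -= lr * (d_out[i] * h[j] + d_hid[j] * x[i])
                    self.b_hid[j] -= lr * d_hid[j]
                for i in range(self.n_in):
                    self.b_out[i] -= lr * d_out[i]


# Made-up binary bag-of-words vectors: six majority-class documents
# followed by two minority-class documents.
docs = [
    [1, 1, 1, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 0, 0, 0, 0],
    [1, 0, 1, 1, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 1, 1, 1],
]

# Layer 1: compress 8 term indicators to 4 hidden units.
layer1 = TinyAutoencoder(n_in=8, n_hidden=4, seed=1)
loss_before = layer1.loss(docs)
layer1.train(docs)
loss_after = layer1.loss(docs)

# Layer 2: trained greedily on the codes produced by layer 1.
codes1 = [layer1.encode(x) for x in docs]
layer2 = TinyAutoencoder(n_in=4, n_hidden=2, seed=2)
layer2.train(codes1)

# Final 2-dimensional features, one vector per document.
features = [layer2.encode(c) for c in codes1]
print(f"layer-1 reconstruction loss: {loss_before:.3f} -> {loss_after:.3f}")
```

In a real pipeline the `features` vectors would be passed to a classifier, and class imbalance would need separate handling (for example resampling or class weighting); the abstract does not say which approach the chapter takes.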



Author information

Authors and Affiliations

School of Computer Engineering, Jimei University, Xiamen, 361021, China
Li Lin
Jilin University of Finance and Economics, Changchun, 130117, China
Shu-xin Guo


Corresponding author

Correspondence to Li Lin.

Editor information

Editors and Affiliations

Hunan Normal University, Changsha, China
Shuai Liu
Hunan Normal University, Changsha, China
Liyun Xia


About this paper

Cite this paper

Lin, L., Guo, Sx. (2021). Text Classification Feature Extraction Method Based on Deep Learning for Unbalanced Data Sets. In: Liu, S., Xia, L. (eds) Advanced Hybrid Information Processing. ADHIP 2020. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 347. Springer, Cham. https://doi.org/10.1007/978-3-030-67871-5_29


DOI: https://doi.org/10.1007/978-3-030-67871-5_29
Published: 03 February 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-67870-8
Online ISBN: 978-3-030-67871-5
eBook Packages: Computer Science, Computer Science (R0)
