Chinese Text Clustering Algorithm Based on Multi-agent Optimization System

  • Conference paper
Exploration of Novel Intelligent Optimization Algorithms (ISICA 2021)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1590)

Abstract

In this paper, we propose a global-to-local searching-based Binary Particle Swarm Optimization (GSBPSO) algorithm, built on Binary Particle Swarm Optimization (BPSO). GSBPSO gives the particle swarm strong global search capability in the early stage of a run and strong local search capability in the late stage. For text clustering, we first use document frequency for coarse feature selection, then apply GSBPSO for feature reselection to further reduce feature redundancy, and finally cluster the texts with the Spherical K-means (SKM) algorithm. Simulation experiments on a Chinese dataset from Fudan University, using the Chinese text clustering algorithm based on GSBPSO and SKM, show that GSBPSO can compress the high-dimensional, sparse text feature matrix with a compression ratio of 47%. Clustering the text matrices before and after feature selection separately shows that the F-value and NMI of the clustering algorithm improve to varying degrees on the dataset after feature reselection by GSBPSO.
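
A minimal sketch of two of the stages named in the abstract, assuming a dense term-count matrix: document-frequency coarse selection followed by spherical k-means (cosine-similarity clustering with unit-length centroids). The function names `df_filter` and `spherical_kmeans`, the thresholds `min_df` and `max_df_ratio`, and the toy matrix are illustrative assumptions, not the authors' implementation. The GSBPSO reselection stage is left out because its update rule is not stated in the abstract.

```python
import numpy as np


def df_filter(X, min_df=2, max_df_ratio=0.9):
    """Coarse selection: keep terms whose document frequency lies in
    [min_df, max_df_ratio * n_docs]. Thresholds are assumed values."""
    X = np.asarray(X, dtype=float)
    df = (X > 0).sum(axis=0)  # number of documents containing each term
    keep = (df >= min_df) & (df <= max_df_ratio * X.shape[0])
    return X[:, keep], keep


def spherical_kmeans(X, k, n_iter=50, seed=0):
    """Cluster rows of X by cosine similarity with unit-length centroids."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    X = X / np.where(norms == 0, 1.0, norms)  # project documents onto the unit sphere
    centroids = X[rng.choice(X.shape[0], size=k, replace=False)]
    labels = np.full(X.shape[0], -1)
    for _ in range(n_iter):
        # On unit vectors, the dot product equals cosine similarity.
        new_labels = (X @ centroids.T).argmax(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable
        labels = new_labels
        for j in range(k):
            total = X[labels == j].sum(axis=0)
            length = np.linalg.norm(total)
            if length > 0:  # leave an empty cluster's centroid unchanged
                centroids[j] = total / length
    return labels, centroids


# Toy term-count matrix: 6 documents x 8 terms (hypothetical data).
# Term 0 occurs in every document, term 7 in only one; both are filtered out.
X = np.array([
    [3, 2, 1, 0, 0, 0, 0, 1],
    [2, 1, 2, 1, 0, 0, 0, 0],
    [4, 0, 1, 2, 0, 0, 0, 0],
    [3, 0, 0, 0, 2, 1, 0, 0],
    [2, 0, 0, 0, 1, 2, 1, 0],
    [5, 0, 0, 0, 0, 1, 2, 0],
])

X_sel, keep = df_filter(X)
labels, _ = spherical_kmeans(X_sel, k=2)
print("kept terms:", np.flatnonzero(keep).tolist())
print("cluster labels:", labels.tolist())
```

In the pipeline the abstract describes, a feature-reselection step would sit between the two calls, passing a further-reduced column subset of `X_sel` to the clustering routine.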

Acknowledgments

This work is supported by the Natural Science Foundation of Guangdong Province of China (Grant No. 2020A1515010784) and the Key-Area Research and Development Program of Guangdong Province (No. 2019B020219003).

Author information

Corresponding author

Correspondence to Yishu Lei.

Copyright information

© 2022 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Li, K. et al. (2022). Chinese Text Clustering Algorithm Based on Multi-agent Optimization System. In: Li, K., Liu, Y., Wang, W. (eds) Exploration of Novel Intelligent Optimization Algorithms. ISICA 2021. Communications in Computer and Information Science, vol 1590. Springer, Singapore. https://doi.org/10.1007/978-981-19-4109-2_28

  • DOI: https://doi.org/10.1007/978-981-19-4109-2_28

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-19-4108-5

  • Online ISBN: 978-981-19-4109-2

  • eBook Packages: Computer Science, Computer Science (R0)