Abstract
Research on online public opinion has attracted growing attention. To accurately identify hot spots in online public opinion data and analyze their heat (popularity), this paper studies hot spot mining for Weibo public opinion data. To address the weakness of the traditional K-means++ clustering algorithm in optimizing the initial points, and building on the Word2Vec model, this paper proposes WPK-means++ (Word to vector Penalty factor K-means++), an improved hot spot discovery algorithm for online public opinion data. The algorithm introduces a penalty factor to compensate for the sensitivity of K-means++ to outliers when it is applied to the scattered text data of hot topics, which reduces the invalid coverage of the initial cluster centers in text clustering. The original dataset is preprocessed with Chinese word segmentation and stop word removal, and the preprocessed result set is modeled as text by a word embedding model. Finally, a Weibo public opinion dataset is used as the corpus, and comparative experiments verify the accuracy and efficiency of the final clustering results.
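The abstract does not define the penalty factor, so the following is only a minimal sketch of the general idea, not the WPK-means++ algorithm itself. It assumes each document has already been turned into a numeric vector (for example, by averaging Word2Vec word vectors), and it uses a hypothetical density-based penalty: the fraction of other points within a chosen `radius`. The names `density_penalty`, `penalized_kmeanspp_init`, and `radius` are illustrative assumptions and do not come from the paper.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def density_penalty(points, radius):
    """Hypothetical penalty factor (an assumption, not the paper's definition):
    the fraction of the other points that lie within `radius` of each point.
    Isolated points, which are likely outliers, score 0."""
    r2 = radius ** 2
    others = max(len(points) - 1, 1)
    return [
        sum(1 for j, q in enumerate(points) if j != i and sq_dist(p, q) <= r2) / others
        for i, p in enumerate(points)
    ]


def penalized_kmeanspp_init(points, k, radius, rng=None):
    """Pick k initial centers. Each candidate is sampled with probability
    proportional to penalty * D(x)^2, where D(x) is the distance to the
    nearest center chosen so far (plain K-means++ uses D(x)^2 alone)."""
    if rng is None:
        rng = random.Random(0)
    penalty = density_penalty(points, radius)
    if sum(penalty) == 0:
        raise ValueError("radius too small: every point looks like an outlier")
    indices = range(len(points))
    # First center: weighted by the penalty alone, since no distances exist yet.
    chosen = [rng.choices(indices, weights=penalty, k=1)[0]]
    # Squared distance from each point to its nearest chosen center.
    nearest = [sq_dist(p, points[chosen[0]]) for p in points]
    while len(chosen) < k:
        weights = [w * d for w, d in zip(penalty, nearest)]
        if sum(weights) == 0:
            raise ValueError("not enough distinct non-outlier points for k centers")
        idx = rng.choices(indices, weights=weights, k=1)[0]
        chosen.append(idx)
        nearest = [min(d, sq_dist(p, points[idx])) for p, d in zip(points, nearest)]
    return [points[i] for i in chosen]


if __name__ == "__main__":
    # Toy 2-D "document vectors": one isolated point plus two tight groups.
    docs = [(100.0, 100.0),
            (0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0)]
    print(penalized_kmeanspp_init(docs, k=2, radius=2.0))
```

In plain K-means++ the isolated point would be the most likely second center because its squared distance is largest. With the penalty it receives zero weight and cannot be selected, which is one way to read the abstract's "invalid coverage" of initial centers. The chosen centers could then seed an ordinary K-means run.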
Acknowledgments
This work was supported in part by the Research Project of Fundamental Scientific Research Business Expenses of Provincial Colleges and Universities in Hebei Province (2021QNJS04).
Funding
Research Project of Fundamental Scientific Research Business Expenses of Provincial Colleges and Universities in Hebei Province (2021QNJS04).
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Xie, C., Han, Y., Mu, Y., Wen, X. (2022). Research on Hot Spot Mining Technology for Network Public Opinion. In: Tan, Y., Shi, Y. (eds) Data Mining and Big Data. DMBD 2022. Communications in Computer and Information Science, vol 1745. Springer, Singapore. https://doi.org/10.1007/978-981-19-8991-9_7
DOI: https://doi.org/10.1007/978-981-19-8991-9_7
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-8990-2
Online ISBN: 978-981-19-8991-9
eBook Packages: Computer Science, Computer Science (R0)