Abstract
In today’s era of information explosion, a large volume of real-time news is published every day, and there is a time lag between the appearance of a potential hotspot in real-time news and its emergence as hot news. Reading all the real-time news and finding the hotspots manually is time-consuming for IT operators, so predicting hot news is particularly important for we-media operation. This paper therefore uses natural language processing technology to predict hot news and to assist operators in creating articles about it. For hot news prediction, we propose a data preprocessing method and train several models based on convolutional neural networks (CNNs), text convolutional neural networks (TextCNNs), long short-term memory (LSTM), and bi-directional long short-term memory (BiLSTM). We also build a new model, CNN Distinct and BiLSTM Extract (CDBE for short), to obtain better performance. We evaluate the trained models and analyze them using multiple evaluation indexes. In addition, we apply the proposed method in actual operation work, and the results show that it can greatly reduce operators’ workload, save the time spent creating articles about hot news, and greatly improve work efficiency.
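To make the evaluation step concrete, the following is a minimal sketch of how a hot news classifier could be scored on several indexes at once. The abstract does not name the indexes used, so accuracy, precision, recall, and F1 are assumed here as common choices for binary classification. The label encoding (1 for hot, 0 for not hot), the `Metrics` container, and the `evaluate` function are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def evaluate(y_true, y_pred):
    """Score binary predictions; 1 = hot news, 0 = not hot (assumed encoding)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("need at least one example")

    # Confusion-matrix counts for the positive ("hot") class.
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator is zero (no predicted or no actual positives).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return Metrics(accuracy, precision, recall, f1)


if __name__ == "__main__":
    # Made-up labels, not data from the paper.
    y_true = [1, 0, 1, 1, 0, 0, 1, 0]
    y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
    print(evaluate(y_true, y_pred))
```

Reporting precision and recall alongside accuracy matters when hot items are a small fraction of all news, because a model that labels everything as not hot can still reach high accuracy while missing every hotspot.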
Funding
This work is supported by the Natural Science Foundation Project of China (61976118) and the Natural Science Foundation Project of Jiangsu Province (BK20180142).
Ethics declarations
The authors declare that they have no conflicts of interest.
About this article
Cite this article
Yiqin Bao, Sun, Z., Zhao, Q. et al. Hot News Prediction Method Based on Natural Language Processing Technology and Its Application. Aut. Control Comp. Sci. 56, 83–94 (2022). https://doi.org/10.3103/S0146411622010023
DOI: https://doi.org/10.3103/S0146411622010023