Abstract
Online news platforms have become hugely popular for news reading. The topic categories of news are very important for these platforms to target user interests and make personalized recommendations. However, a massive number of news articles are generated every day, and it is too expensive and time-consuming to categorize all of them manually. News bodies usually convey the detailed information of news, while news titles usually contain summarized and complementary information. However, existing news topic prediction methods usually simply aggregate news titles and bodies together and ignore the differences in their characteristics. In this paper, we propose a title-aware neural news topic prediction approach to classify the topic categories of online news articles. Our approach uses a multi-view learning framework that incorporates news titles and bodies as different views of news to learn unified news representations. In the title view, we learn title representations from words via a long short-term memory (LSTM) network, and use an attention mechanism to select important words according to their contextual representations. In the body view, we use a hierarchical LSTM network to first learn sentence representations from words, and then learn body representations from sentences. In addition, we apply attention networks at both the word and sentence levels to recognize important words and sentences. Furthermore, we use the representation vector of the news title to initialize the hidden states of the LSTM networks for the news body, in order to capture the summarized news information condensed in the title. Extensive experiments on a real-world dataset validate that our approach achieves good performance in news topic prediction and consistently outperforms many baseline methods.
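The data flow described in the abstract can be illustrated with a minimal, untrained sketch in plain Python. It is not the authors' implementation. The class and function names (`TinyLSTM`, `attention_pool`, `TitleAwareClassifier`), the simplified attention scoring, the omission of bias terms, the random weights, and the choice to combine the two views by concatenation are all assumptions made for illustration; the abstract does not specify these details.

```python
import math
import random

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def matvec(W, x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

class TinyLSTM:
    """Bias-free LSTM over a list of vectors (hypothetical helper)."""
    def __init__(self, in_dim, hid_dim, rng):
        self.hid_dim = hid_dim
        # One weight matrix per gate, applied to the concatenation [x; h].
        self.W = {g: rand_matrix(hid_dim, in_dim + hid_dim, rng) for g in "ifog"}

    def run(self, inputs, h0=None):
        # h0 lets a caller inject the title vector as the initial hidden state.
        h = list(h0) if h0 is not None else [0.0] * self.hid_dim
        c = [0.0] * self.hid_dim
        states = []
        for x in inputs:
            z = list(x) + h
            i = [sigmoid(v) for v in matvec(self.W["i"], z)]
            f = [sigmoid(v) for v in matvec(self.W["f"], z)]
            o = [sigmoid(v) for v in matvec(self.W["o"], z)]
            g = [math.tanh(v) for v in matvec(self.W["g"], z)]
            c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
            h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
            states.append(h)
        return states

def attention_pool(states, query):
    """Weighted sum of hidden states; the scoring function is an assumption."""
    scores = [sum(q * math.tanh(s) for q, s in zip(query, st)) for st in states]
    weights = softmax(scores)
    pooled = [sum(w * st[j] for w, st in zip(weights, states))
              for j in range(len(query))]
    return pooled, weights

class TitleAwareClassifier:
    def __init__(self, emb_dim, hid_dim, num_topics, seed=0):
        rng = random.Random(seed)
        self.title_lstm = TinyLSTM(emb_dim, hid_dim, rng)
        self.word_lstm = TinyLSTM(emb_dim, hid_dim, rng)
        self.sent_lstm = TinyLSTM(hid_dim, hid_dim, rng)
        self.queries = [[rng.uniform(-0.1, 0.1) for _ in range(hid_dim)]
                        for _ in range(3)]
        self.out = rand_matrix(num_topics, 2 * hid_dim, rng)

    def predict(self, title, body):
        q_title, q_word, q_sent = self.queries
        # Title view: LSTM over title words, then word-level attention.
        title_vec, _ = attention_pool(self.title_lstm.run(title), q_title)
        # Body view: both LSTM levels start from the title vector.
        sent_vecs = [attention_pool(self.word_lstm.run(s, h0=title_vec), q_word)[0]
                     for s in body]
        body_vec, sent_weights = attention_pool(
            self.sent_lstm.run(sent_vecs, h0=title_vec), q_sent)
        # Assumed fusion: concatenate the two views, then a linear softmax layer.
        return softmax(matvec(self.out, title_vec + body_vec)), sent_weights

# Toy input: random 4-dimensional "word embeddings".
rng = random.Random(1)
def word():
    return [rng.uniform(-1, 1) for _ in range(4)]

title = [word() for _ in range(3)]
body = [[word() for _ in range(5)], [word() for _ in range(2)]]
model = TitleAwareClassifier(emb_dim=4, hid_dim=6, num_topics=3)
probs, sent_weights = model.predict(title, body)
```

Here `probs` is a distribution over three hypothetical topic categories and `sent_weights` holds one attention weight per body sentence. Because the weights are random, the values carry no meaning; the sketch only shows how the title vector feeds into the body encoder and how the two views are merged before classification.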
Acknowledgments
The authors would like to thank Microsoft News for providing technical support and data in the experiments, and Jiun-Hung Chen (Microsoft News) and Ying Qiao (Microsoft News) for their support and discussions. This work was supported by the National Key Research and Development Program of China under Grant number 2018YFC1604002, the National Natural Science Foundation of China under Grant numbers U1836204, U1705261, U1636113, U1536201, and U1536207, and the Tsinghua University Initiative Scientific Research Program.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Wu, C., Wu, F., Qi, T., Huang, Y., Xie, X. (2019). Title-Aware Neural News Topic Prediction. In: Sun, M., Huang, X., Ji, H., Liu, Z., Liu, Y. (eds) Chinese Computational Linguistics. CCL 2019. Lecture Notes in Computer Science, vol 11856. Springer, Cham. https://doi.org/10.1007/978-3-030-32381-3_15
DOI: https://doi.org/10.1007/978-3-030-32381-3_15
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-32380-6
Online ISBN: 978-3-030-32381-3
eBook Packages: Computer Science (R0)