Abstract
Chinese-Vietnamese bilingual news topic representation generation takes Chinese and Vietnamese news texts that describe the same topic and produces a concise Chinese sentence that correctly describes that topic. However, the semantic gap between Chinese and Vietnamese and the complicated associations among multiple documents in multiple languages make it challenging to generate concise and correct topic representations. In this paper, we propose a cross-language topic representation method based on heterogeneous graphs. The method first represents bilingual Chinese-Vietnamese news texts as a heterogeneous graph containing sentence nodes and entity nodes, and models the complex associations among multiple texts in multiple languages through graph attention networks (GAT). A topic encoder then encodes topic words into cues for topic representation generation, and decoder-side constraints are incorporated to generate the correct topic representation. The experimental results show that the proposed method improves the ROUGE value by up to 3.5 compared with the baseline method.
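The graph construction and attention step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes entities have already been aligned across the two languages to shared identifiers (the ids such as `hanoi` are hypothetical), it omits the learned weight matrix of a full GAT layer, and the attention vector `a` and all feature vectors are toy values chosen by hand.

```python
import math
from collections import defaultdict


def build_hetero_graph(sentence_entities):
    """Build a bipartite sentence-entity graph.

    sentence_entities maps a sentence id to the entity ids it mentions.
    Entity ids are assumed to be language-independent (an assumption of
    this sketch), so Chinese and Vietnamese sentences that mention the
    same entity become connected through that entity node.
    """
    sent_to_ent = {sid: sorted(set(ents)) for sid, ents in sentence_entities.items()}
    ent_to_sent = defaultdict(list)
    for sid, ents in sent_to_ent.items():
        for ent in ents:
            ent_to_sent[ent].append(sid)
    return sent_to_ent, dict(ent_to_sent)


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def softmax(scores):
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gat_update(h_i, neighbors, a):
    """Single-head graph attention update for one node.

    Scores each neighbor h_j with LeakyReLU(a . [h_i || h_j]), normalises
    the scores with softmax, and returns the weighted sum of neighbor
    vectors together with the attention weights.
    """
    # h_i + h_j is list concatenation here, i.e. [h_i || h_j].
    scores = [leaky_relu(dot(a, h_i + h_j)) for h_j in neighbors]
    weights = softmax(scores)
    updated = [
        sum(w * h_j[k] for w, h_j in zip(weights, neighbors))
        for k in range(len(h_i))
    ]
    return updated, weights


# Toy data: one Chinese sentence and two Vietnamese sentences.
sentence_entities = {
    "zh_1": ["hanoi", "summit"],
    "vi_1": ["hanoi"],
    "vi_2": ["summit", "trade"],
}
sent_to_ent, ent_to_sent = build_hetero_graph(sentence_entities)

# Hand-picked 2-d entity features and a 4-d attention vector (toy values).
entity_vecs = {"hanoi": [1.0, 0.0], "summit": [0.0, 1.0], "trade": [1.0, 1.0]}
neighbors = [entity_vecs[e] for e in sent_to_ent["zh_1"]]
updated, weights = gat_update([0.5, 0.5], neighbors, a=[0.1, 0.2, 0.3, 0.4])
```

Here the sentence node `zh_1` aggregates information from its entity neighbors. Because `vi_1` and `vi_2` share those entities, a second update in the entity-to-sentence direction would let information flow between the Chinese and Vietnamese sentences.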
Acknowledgements
The research work described in this paper has been supported by the National Natural Science Foundation of China (U21B2027, 61972186, 62266028), Yunnan provincial major science and technology special plan projects (202002AD080001, 202103AA080015, 202202AD080003), and the Yunnan High and New Technology Industry Project (201606). We thank the three anonymous reviewers for their insightful comments and suggestions.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
He, Z., Zhu, E., Yu, Z., Gao, S., Huang, Y., Xia, L. (2023). Representation of Chinese-Vietnamese Bilingual News Topics Based on Heterogeneous Graph. In: Sun, Y., et al. Computer Supported Cooperative Work and Social Computing. ChineseCSCW 2022. Communications in Computer and Information Science, vol 1681. Springer, Singapore. https://doi.org/10.1007/978-981-99-2356-4_19
DOI: https://doi.org/10.1007/978-981-99-2356-4_19
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-2355-7
Online ISBN: 978-981-99-2356-4
eBook Packages: Computer Science, Computer Science (R0)