Abstract
Listed companies with similar or related fundamentals tend to influence one another, and this influence is usually reflected in their stock prices. For example, the momentum spillover effect in behavioral finance describes how lead-lag relationships form between the stock prices of related companies. Listed companies can be related in many ways, including industry chain links, industry information, transaction information, degree of patent sharing, and equity ties. We construct an industry chain knowledge graph of listed companies that describes the upstream and downstream production and supply relationships among them. We then apply a graph representation learning method to measure the relatedness of the company entities in the knowledge graph. The method uses dimensions such as the industry and transaction information of listed companies as weights to optimize the representation learning process, and finally computes a similarity index between listed companies. To evaluate the method, we conduct a link prediction experiment and construct a quantitative stock portfolio based on the similarity index. Quantitative backtests on China’s stock market data from the last 10 years show that the proposed graph representation learning method can be used to study the momentum spillover effect and to obtain investment returns.
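The pipeline summarized above (a weighted company graph, a learned representation per company, then a similarity index) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the company names, the edge weights, and the helper functions are hypothetical, and the learned embedding is replaced by a simpler stand-in, namely co-occurrence counts from weight-biased random walks compared by cosine similarity. It is not the authors' model or data.

```python
import math
import random
from collections import defaultdict

# Hypothetical toy supply-chain graph. The weights stand in for the
# industry/transaction-based weights mentioned in the abstract.
EDGES = [
    ("LithiumMiner", "BatteryMaker", 3.0),
    ("CobaltMiner", "BatteryMaker", 2.0),
    ("BatteryMaker", "EVMaker", 4.0),
    ("BatteryMaker", "PhoneMaker", 1.0),
    ("ChipMaker", "PhoneMaker", 3.0),
    ("ChipMaker", "EVMaker", 1.0),
]


def build_adjacency(edges):
    """Build an undirected weighted adjacency map: node -> {neighbor: weight}."""
    adj = defaultdict(dict)
    for u, v, w in edges:
        adj[u][v] = w
        adj[v][u] = w
    return adj


def weighted_walk(adj, start, length, rng):
    """Random walk in which the next node is drawn in proportion to edge weight."""
    walk = [start]
    for _ in range(length - 1):
        nbrs = adj[walk[-1]]
        nodes = list(nbrs)
        walk.append(rng.choices(nodes, weights=[nbrs[n] for n in nodes])[0])
    return walk


def cooccurrence_vectors(adj, walks_per_node=200, length=6, window=2, seed=0):
    """Represent each node by how often other nodes appear near it in walks."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    index = {n: i for i, n in enumerate(nodes)}
    vecs = {n: [0.0] * len(nodes) for n in nodes}
    for start in nodes:
        for _ in range(walks_per_node):
            walk = weighted_walk(adj, start, length, rng)
            for i, u in enumerate(walk):
                for v in walk[max(0, i - window): i + window + 1]:
                    if v != u:
                        vecs[u][index[v]] += 1.0
    return vecs


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def most_similar(vecs, name, k=2):
    """Return the k companies with the highest similarity index to `name`."""
    scores = [(o, cosine(vecs[name], vecs[o])) for o in vecs if o != name]
    return sorted(scores, key=lambda t: t[1], reverse=True)[:k]


if __name__ == "__main__":
    vectors = cooccurrence_vectors(build_adjacency(EDGES))
    for company, score in most_similar(vectors, "EVMaker"):
        print(f"{company}: {score:.3f}")
```

In a portfolio setting, a ranking like the one returned by `most_similar` could be used to pick the peers whose recent returns serve as a signal for a target stock. How the paper actually maps the similarity index to positions is not stated in the abstract, so that step is left out here.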
Acknowledgements
This work was partially supported by the Technology and Innovation Major Project of the Ministry of Science and Technology of China under Grants 2020AAA0108400 and 2020AAA0108403, and by the Beijing Natural Science Foundation (Z210002). Portions of this work were presented at the 2021 IEEE 6th International Conference on Smart Cloud (SmartCloud) under the title "Similarity Analysis of Knowledge Graph-based Company Embedding for Stocks Portfolio".
Additional information
This article is part of the Topical Collection on Big Data Security Track
About this article
Cite this article
Zhang, B., Yang, C., Zhang, H. et al. Graph Representation Learning for Similarity Stocks Analysis. J Sign Process Syst 94, 1283–1292 (2022). https://doi.org/10.1007/s11265-022-01755-6
DOI: https://doi.org/10.1007/s11265-022-01755-6