Abstract
Domain-specific knowledge graphs usually require deeper and more accurate knowledge. Existing academic knowledge graphs mainly focus on authors, abstracts, keywords, and citations, which help explore the themes of papers and analyze relationships between papers. However, these elements are summaries that convey only surface-level meaning and do not capture the core of a scientific paper. Mathematical models, which existing knowledge graphs ignore, are what authors really want to express through their papers. Knowledge from mathematical models makes it possible to use knowledge graphs for mathematical derivation, not just literal reasoning. To model this knowledge, we propose a knowledge graph construction framework named M2R (from Mathematical Models to Resource Description Framework). Mathematical models are usually described by formulae. We first identify formula positions according to predefined rules and locate the contexts that explain the variables in the formulae. Next, we crop the formulae and related contexts from PDF papers as images and apply optical character recognition to identify the image contents. Then, regular expressions designed around sentence patterns are used to extract variable symbols and their explanations. Finally, the formulae are regarded as relations between variables, forming triples whose subjects and objects are variables and whose predicates are formulae. Similar triples are fused to generate the final knowledge graph. Experimental results show that the precision of formula extraction reaches up to 76.97%. In addition, a case study shows that we can effectively extract formulae and related variables and construct a knowledge graph of the mathematical models in scientific papers.
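The last two steps of the pipeline (regular-expression extraction of variable explanations, then triple formation) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract gives neither the actual sentence patterns nor the triple layout, so the pattern `EXPLANATION_PATTERN`, the function names `extract_variables` and `build_triples`, and the choice of one triple per unordered variable pair are all assumptions made for illustration. The input is assumed to be text already produced by the OCR step.

```python
import re
from itertools import combinations
from typing import Dict, List, Tuple

# Hypothetical sentence pattern: "<symbol> is/denotes/represents <explanation>".
# The real M2R patterns are not given in the abstract; this is an assumption.
EXPLANATION_PATTERN = re.compile(
    r"\b(?P<symbol>[A-Za-z](?:_[A-Za-z0-9]+)?)\s+"
    r"(?:is|denotes|represents)\s+"
    r"(?P<explanation>[^,.;]+)"
)


def extract_variables(context: str) -> Dict[str, str]:
    """Map each variable symbol found in the context to its explanation."""
    return {
        match.group("symbol"): match.group("explanation").strip()
        for match in EXPLANATION_PATTERN.finditer(context)
    }


def build_triples(formula: str, variables: Dict[str, str]) -> List[Tuple[str, str, str]]:
    """Form (subject, predicate, object) triples with the formula as predicate.

    Assumption: one triple per unordered pair of variables in the formula.
    """
    return [(subj, formula, obj) for subj, obj in combinations(variables, 2)]


# Usage with an invented formula and context (not taken from the paper).
context = "where F is the force, m is the mass, and a is the acceleration."
variables = extract_variables(context)
triples = build_triples("F = m * a", variables)
for triple in triples:
    print(triple)
```

Using the formula string itself as the predicate keeps the sketch close to the description above; a real system would likely assign each formula and variable an identifier before serializing the triples to RDF.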
Acknowledgments.
This work was supported in part by the National Natural Science Foundation of China under Grant No. 61602149, and in part by the Fundamental Research Funds for the Central Universities, China under Grant No. B210202078.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zou, C., Li, X., Wu, P., Xie, H. (2023). M2R: From Mathematical Models to Resource Description Framework. In: Li, B., Yue, L., Tao, C., Han, X., Calvanese, D., Amagasa, T. (eds) Web and Big Data. APWeb-WAIM 2022. Lecture Notes in Computer Science, vol 13422. Springer, Cham. https://doi.org/10.1007/978-3-031-25198-6_18
DOI: https://doi.org/10.1007/978-3-031-25198-6_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-25197-9
Online ISBN: 978-3-031-25198-6
eBook Packages: Computer Science, Computer Science (R0)