A Survey of CRF Algorithm Based Knowledge Extraction of Elementary Mathematics in Chinese

Liu, Shuai; He, Tenghui; Dai, Jianhua

doi:10.1007/s11036-020-01725-x

Published: 03 January 2021

Volume 26, pages 1891–1903 (2021)

Mobile Networks and Applications

Abstract

Chinese word segmentation is an important research direction within elementary mathematics knowledge extraction. Segmentation speed directly affects downstream applications, and segmentation accuracy directly affects the research steps that follow. Among machine learning methods for extracting basic mathematical knowledge points, the Conditional Random Field (CRF) model performs well at new word discovery and is increasingly used for knowledge extraction in basic mathematics. This article first introduces the traditional CRF process for named entity recognition. Then an improved algorithm, CRF++, for the conditional random field model is proposed. Because the recognition rate of named entities with traditional machine learning methods is not high, a post-processing method for entity recognition that automatically generates a dictionary is proposed. After mathematical entities are identified, a pruning strategy that combines the Viterbi algorithm with rules is proposed to achieve a higher recognition rate for elementary mathematical entities. Finally, several methods of disambiguation after entity recognition are introduced.
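
The abstract does not give the details of the pruning strategy, so the following is only a minimal sketch of the general idea: Viterbi decoding over BMES segmentation tags in which rule-forbidden tag transitions are pruned from the search. The tag set, the `FORBIDDEN` rules, the function names `viterbi` and `segment`, and the scores in `EMISSIONS` are all illustrative assumptions, not the authors' method or data.

```python
import math

# Assumed tag scheme: Begin, Middle, End of a word, or Single-character word.
TAGS = ["B", "M", "E", "S"]

# Hypothetical rule set: transitions that never occur in a valid BMES
# sequence. These are generic segmentation constraints, not the paper's rules.
FORBIDDEN = {("B", "B"), ("B", "S"), ("M", "B"), ("M", "S"),
             ("E", "M"), ("E", "E"), ("S", "M"), ("S", "E")}
VALID_START = ("B", "S")
VALID_END = ("E", "S")


def viterbi(emissions, transitions):
    """Return the highest-scoring tag path that obeys the rules.

    emissions: list of dicts, one per character, mapping tag -> log-score.
    transitions: dict mapping (prev_tag, tag) -> log-score (default 0.0).
    """
    neg_inf = -math.inf
    # First position: tags that cannot start a sequence are pruned.
    score = {t: emissions[0][t] if t in VALID_START else neg_inf for t in TAGS}
    backpointers = []
    for emit in emissions[1:]:
        new_score, pointers = {}, {}
        for tag in TAGS:
            best_prev, best = None, neg_inf
            for prev in TAGS:
                # Rule-based pruning: skip forbidden or unreachable states.
                if (prev, tag) in FORBIDDEN or score[prev] == neg_inf:
                    continue
                cand = score[prev] + transitions.get((prev, tag), 0.0)
                if cand > best:
                    best_prev, best = prev, cand
            new_score[tag] = neg_inf if best_prev is None else best + emit[tag]
            pointers[tag] = best_prev
        score = new_score
        backpointers.append(pointers)
    # Only tags that can close a word may end the sequence.
    last = max(VALID_END, key=lambda t: score[t])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return path[::-1]


def segment(text, tags):
    """Cut text into words at every E or S tag."""
    words, current = [], ""
    for ch, tag in zip(text, tags):
        current += ch
        if tag in ("E", "S"):
            words.append(current)
            current = ""
    if current:
        words.append(current)
    return words


# Made-up log-scores for the five characters of "等腰三角形".
# At the second character the best local tag is B, which is invalid after B.
EMISSIONS = [
    {"B": -0.1, "M": -3.0, "E": -3.0, "S": -2.0},
    {"B": -0.5, "M": -2.0, "E": -0.7, "S": -3.0},
    {"B": -0.2, "M": -2.5, "E": -3.0, "S": -2.0},
    {"B": -3.0, "M": -0.3, "E": -2.0, "S": -3.0},
    {"B": -3.0, "M": -2.0, "E": -0.2, "S": -2.5},
]

tags = viterbi(EMISSIONS, {})
words = segment("等腰三角形", tags)
```

In this toy input a per-character argmax would tag the second character as B, producing the invalid sequence B B; with the forbidden transitions pruned, the decoder instead returns B E B M E, which splits the string into 等腰 and 三角形. In a real system the emission and transition scores would come from a trained CRF rather than hand-written numbers.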

Acknowledgements

This work was supported by the Natural Science Foundation of Hunan Province (No. 2020JJ4434); the Key Scientific Research Projects of the Department of Education of Hunan Province (No. 19A312); the Hunan Provincial Science & Technology Project (Nos. 2018TP1018 and 2018RS3065); the National Natural Science Foundation of China (No. 61502254); and the Open Project Program of the State Key Lab of CAD&CG, Zhejiang University (No. A1926).

Author information

Authors and Affiliations

Hunan Provincial Key Laboratory of Intelligent Computing and Language Information Processing, Hunan Normal University, Changsha, 410081, China
Shuai Liu, Tenghui He & Jianhua Dai
College of Information Science and Engineering, Hunan Normal University, Changsha, 410081, China
Shuai Liu, Tenghui He & Jianhua Dai

Corresponding author

Correspondence to Shuai Liu.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

ESM 1

(DOCX 28.6 kb)

About this article

Cite this article

Liu, S., He, T. & Dai, J. A Survey of CRF Algorithm Based Knowledge Extraction of Elementary Mathematics in Chinese. Mobile Netw Appl 26, 1891–1903 (2021). https://doi.org/10.1007/s11036-020-01725-x

Accepted: 06 December 2020
Published: 03 January 2021
Issue Date: October 2021
DOI: https://doi.org/10.1007/s11036-020-01725-x
