Multi-classification of Theses to Disciplines Based on Metadata

Li, Jianling; Yu, Shiwen; Li, Shasha; Yu, Jie

doi:10.1007/978-3-030-32236-6_43

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11839)

Included in the following conference series:

CCF International Conference on Natural Language Processing and Chinese Computing

Abstract

Thesis classification is fundamental to efficient research management. Current thesis classification is limited to the major, research direction, and classification number that students assign manually, so it lacks standardization and accuracy. Furthermore, previous automatic classification studies do not take interdisciplinarity into account. This study aims to contribute to Chinese thesis classification by exploiting thesis metadata such as the title and keywords. We propose a novel hierarchical classification model based on semantic representation of the metadata and a corresponding similarity calculation. Experiments on more than 4,000 theses show that our methods are effective.
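To make the idea in the abstract concrete, the following is a minimal sketch of similarity-based hierarchical multi-label assignment. It is not the authors' model: this preview does not specify their semantic representation or similarity calculation, so the sketch assumes averaged word vectors and cosine similarity. The embedding table, the two-level taxonomy, the function names, and the threshold value are all hypothetical and chosen only for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def mean_vector(words, embeddings):
    """Average the vectors of the words that have an embedding."""
    vecs = [embeddings[w] for w in words if w in embeddings]
    if not vecs:
        raise ValueError("no known words to represent")
    return [sum(col) / len(vecs) for col in zip(*vecs)]

def classify(central_words, embeddings, taxonomy, threshold=0.7):
    """Return every (first-level, second-level, score) whose similarity to the
    thesis reaches the threshold at both levels. Several labels may be
    returned, which is how an interdisciplinary thesis is represented here."""
    thesis_vec = mean_vector(central_words, embeddings)
    labels = []
    for first, children in taxonomy.items():
        # Represent a first-level discipline by all words of its children.
        first_words = [w for words in children.values() for w in words]
        if cosine(thesis_vec, mean_vector(first_words, embeddings)) < threshold:
            continue  # prune the whole subtree
        for second, words in children.items():
            score = cosine(thesis_vec, mean_vector(words, embeddings))
            if score >= threshold:
                labels.append((first, second, round(score, 3)))
    return sorted(labels, key=lambda item: -item[2])

# Hypothetical 3-dimensional embeddings and a toy two-level taxonomy.
EMBEDDINGS = {
    "neural": [1.0, 0.0, 0.0], "network": [0.9, 0.1, 0.0],
    "protein": [0.0, 1.0, 0.0], "genome": [0.1, 0.9, 0.0],
    "poetry": [0.0, 0.0, 1.0], "novel": [0.0, 0.1, 0.9],
}
TAXONOMY = {
    "Computer Science": {"Artificial Intelligence": ["neural", "network"]},
    "Biology": {"Bioinformatics": ["protein", "genome"]},
    "Literature": {"Chinese Literature": ["poetry", "novel"]},
}

# Central words (title and keywords) of an imaginary interdisciplinary thesis.
labels = classify(["neural", "protein"], EMBEDDINGS, TAXONOMY)
for first, second, score in labels:
    print(first, ">", second, score)
```

In this toy run the thesis is assigned to both the computer science and the biology branch and to neither literature label, because a thresholded similarity permits more than one discipline per thesis instead of forcing a single best match.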

Notes

1. The version we are using is GB/T13745-2009. This taxonomic hierarchy is divided into three levels: first-level disciplines, second-level disciplines, and third-level disciplines.
2. Wanfang Data is one of the most popular knowledge service platforms in China.
3. We define the title and keywords of a thesis as its central words.

Acknowledgements

The research is supported by the National Key Research and Development Program of China (2018YFB1004502) and the National Natural Science Foundation of China (61532001, 61303190).

Author information

Authors and Affiliations

School of Computer Science, National University of Defense Technology, Changsha, 410073, China
Jianling Li, Shiwen Yu, Shasha Li & Jie Yu

Corresponding author

Correspondence to Shasha Li.

Editor information

Editors and Affiliations

Tsinghua University, Beijing, China
Jie Tang
National University of Singapore, Singapore, Singapore
Min-Yen Kan
Peking University, Beijing, China
Dongyan Zhao
Peking University, Beijing, China
Sujian Li
Zhengzhou University, Zhengzhou, China
Hongying Zan

About this paper

Cite this paper

Li, J., Yu, S., Li, S., Yu, J. (2019). Multi-classification of Theses to Disciplines Based on Metadata. In: Tang, J., Kan, MY., Zhao, D., Li, S., Zan, H. (eds) Natural Language Processing and Chinese Computing. NLPCC 2019. Lecture Notes in Computer Science, vol 11839. Springer, Cham. https://doi.org/10.1007/978-3-030-32236-6_43

DOI: https://doi.org/10.1007/978-3-030-32236-6_43
Published: 30 September 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-32235-9
Online ISBN: 978-3-030-32236-6
eBook Packages: Computer Science, Computer Science (R0)
