Multi-classification of Theses to Disciplines Based on Metadata

  • Conference paper
Natural Language Processing and Chinese Computing (NLPCC 2019)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11839)

Abstract

Thesis classification is fundamental to efficient research management. Current thesis classification is limited to the major, research direction, and classification number that students label manually, which lacks standardization and accuracy. Furthermore, previous auto-classification studies do not take interdisciplinarity into account. This study aims to contribute to Chinese thesis classification by exploiting thesis metadata such as the title and keywords. We propose a novel hierarchical classification model based on semantic representation of the metadata and a corresponding similarity calculation. Experiments on 4K+ theses show that our methods have a significant effect.

Notes

  1. The version we are using is GB/T13745-2009. This taxonomic hierarchy is divided into three levels: first-level disciplines, second-level disciplines, and third-level disciplines.

  2. Wanfang Data is one of the most popular knowledge service platforms in China.

  3. We define the title and keywords of a thesis as its central words.
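
The abstract describes a hierarchical model built on semantic representation of metadata and similarity calculation, and the notes define a thesis's central words as its title and keywords. The following is only a minimal sketch of that general idea, not the authors' model: it assumes central words are represented by averaged word vectors, disciplines are represented by averaged descriptor-word vectors, similarity is cosine, and a thesis is assigned to every discipline above a threshold so that interdisciplinary theses receive several labels. The toy vectors, the two-level taxonomy, the `classify` function, and the 0.5 threshold are all hypothetical.

```python
import math
from typing import Dict, List, Optional, Tuple

Vector = List[float]

# Hypothetical 2-d word vectors, for illustration only.
WORD_VECS: Dict[str, Vector] = {
    "neural": [1.0, 0.0],
    "network": [0.9, 0.1],
    "protein": [0.0, 1.0],
    "folding": [0.1, 0.9],
}

# Hypothetical two-level taxonomy: first-level discipline ->
# second-level discipline -> descriptor words.
TAXONOMY: Dict[str, Dict[str, List[str]]] = {
    "Computer Science": {"Artificial Intelligence": ["neural", "network"]},
    "Biology": {"Biochemistry": ["protein", "folding"]},
}


def mean_vector(words: List[str]) -> Optional[Vector]:
    """Average the vectors of the known words; None if no word is known."""
    vecs = [WORD_VECS[w] for w in words if w in WORD_VECS]
    if not vecs:
        return None
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; 0.0 when either vector has zero length."""
    norm = math.hypot(*a) * math.hypot(*b)
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


def classify(central_words: List[str],
             threshold: float = 0.5) -> List[Tuple[str, str, float]]:
    """Return (first-level, second-level, score) for every branch whose
    similarity to the thesis reaches the threshold at both levels."""
    thesis = mean_vector(central_words)
    if thesis is None:
        return []
    results = []
    for first, children in TAXONOMY.items():
        # Assumption: a first-level discipline is the mean of all
        # descriptor words found beneath it.
        first_vec = mean_vector([w for ws in children.values() for w in ws])
        if first_vec is None or cosine(thesis, first_vec) < threshold:
            continue  # prune this branch before descending
        for second, words in children.items():
            second_vec = mean_vector(words)
            if second_vec is None:
                continue
            score = cosine(thesis, second_vec)
            if score >= threshold:
                results.append((first, second, score))
    return sorted(results, key=lambda r: r[2], reverse=True)
```

With these toy values, `classify(["neural", "network"])` yields only the Computer Science branch, while `classify(["neural", "network", "protein"])` yields both branches, which is the multi-label behaviour an interdisciplinary thesis would need. Pruning at the first level before scoring the second level is what makes the sketch hierarchical.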

Acknowledgements

The research is supported by the National Key Research and Development Program of China (2018YFB1004502) and the National Natural Science Foundation of China (61532001, 61303190).

Author information

Corresponding author

Correspondence to Shasha Li.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Li, J., Yu, S., Li, S., Yu, J. (2019). Multi-classification of Theses to Disciplines Based on Metadata. In: Tang, J., Kan, MY., Zhao, D., Li, S., Zan, H. (eds) Natural Language Processing and Chinese Computing. NLPCC 2019. Lecture Notes in Computer Science, vol 11839. Springer, Cham. https://doi.org/10.1007/978-3-030-32236-6_43

  • DOI: https://doi.org/10.1007/978-3-030-32236-6_43

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-32235-9

  • Online ISBN: 978-3-030-32236-6

  • eBook Packages: Computer Science (R0)
