DOI: 10.1145/3319921.3319958
research-article

Authorship Attribution of The Golden Lotus Based on Text Classification Methods

Published: 15 March 2019

ABSTRACT

In this paper, we investigate the authorship of The Golden Lotus using traditional machine learning methods for text classification. There are four candidate authors: Shizhen Wang, Wei Xu, Kaixian Li and Zhideng Wang. Our data set consists of the poems in The Golden Lotus and poems by the four candidate authors. Based on the characteristics of ancient Chinese poetry, we use Chinese characters, rhyme, genre and overlapped words as features. We apply six supervised machine learning algorithms (Logistic Regression, Random Forests, Decision Tree, Naive Bayes, SVM and KNN) to both binary and multi-class text classification. In both experiments, the writing style of Wei Xu's poems is the most similar to that of The Golden Lotus. The results indicate that, among the four candidates, Wei Xu is the most likely author of The Golden Lotus.
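
The abstract does not give implementation details, so the following is only a minimal sketch of how poem-level features of this kind might feed a classifier, not the authors' method. It assumes one verse line per text line with punctuation already removed, approximates "overlapped words" as a character repeated twice in a row, uses characters per line as a crude stand-in for genre, and omits rhyme features because they would need a rhyme dictionary. Of the six classifiers named, only a small Naive Bayes is shown, written in plain Python. The function `extract_features`, the class `NaiveBayes`, the labels `author_A` and `author_B`, and all poem text are invented for illustration.

```python
import math
from collections import Counter


def extract_features(poem: str) -> Counter:
    """Turn a poem (one verse line per text line) into a bag of stylistic features."""
    lines = [ln.strip() for ln in poem.splitlines() if ln.strip()]
    feats = Counter()
    for line in lines:
        # Character unigram counts.
        feats.update(f"char={ch}" for ch in line)
        # Assumed proxy for "overlapped words": the same character twice in a row.
        feats.update(f"dup={a}{b}" for a, b in zip(line, line[1:]) if a == b)
    if lines:
        # Assumed proxy for genre: characters in the first verse line.
        feats[f"genre={len(lines[0])}"] += 1
    return feats


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing over feature counts."""

    def fit(self, docs, labels):
        self.priors = Counter(labels)   # label -> number of training poems
        self.counts = {}                # label -> Counter of feature counts
        self.vocab = set()
        for feats, label in zip(docs, labels):
            self.counts.setdefault(label, Counter()).update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, feats):
        n_docs = sum(self.priors.values())

        def log_score(label):
            total = sum(self.counts[label].values()) + len(self.vocab)
            score = math.log(self.priors[label] / n_docs)
            for feat, n in feats.items():
                score += n * math.log((self.counts[label][feat] + 1) / total)
            return score

        return max(self.counts, key=log_score)


# Invented toy poems and labels, for illustration only.
train = [
    ("青青山上松\n年年月下风", "author_A"),
    ("悠悠天上云\n青青水边草", "author_A"),
    ("铁马踏冰河\n长刀破敌营", "author_B"),
    ("烽火照边城\n铁衣寒夜行", "author_B"),
]
clf = NaiveBayes().fit(
    [extract_features(text) for text, _ in train],
    [label for _, label in train],
)
query = "青青河边柳\n悠悠月下人"
print(clf.predict(extract_features(query)))
```

In a multi-class setting of the kind the abstract describes, each candidate author would be one label and the poems from The Golden Lotus would be the query texts; the binary setting would instead train one author-versus-rest model per candidate.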

    Published in

      ICIAI '19: Proceedings of the 2019 3rd International Conference on Innovation in Artificial Intelligence
      March 2019
      279 pages
      ISBN:9781450361286
      DOI:10.1145/3319921

      Copyright © 2019 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Qualifiers

      • research-article
      • Research
      • Refereed limited
