
Authorship Attribution of The Golden Lotus Based on Text Classification Methods

Authors:

Zhiying Liu

ICIAI '19: Proceedings of the 2019 3rd International Conference on Innovation in Artificial Intelligence

Pages 69 - 72

https://doi.org/10.1145/3319921.3319958

Published: 15 March 2019

Abstract

In this paper, we explore the authorship attribution of The Golden Lotus using traditional machine learning methods for text classification. There are four candidate authors: Shizhen Wang, Wei Xu, Kaixian Li and Zhideng Wang. We use the poems in The Golden Lotus and the poems of the four candidate authors as the data set. Based on the characteristics of ancient Chinese poetry, we choose Chinese characters, rhyme, genre and overlapped words as features. We apply six supervised machine learning algorithms (Logistic Regression, Random Forests, Decision Tree, Naive Bayes, SVM and KNN) to both binary and multi-class text classification. According to the results of the two experiments, the writing style of Wei Xu's poems is the most similar to that of The Golden Lotus. This indicates that, among the four candidates, Wei Xu is the most likely author of The Golden Lotus.
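The abstract does not give implementation details, so the following is only a minimal sketch of the general approach it describes: extract character-level features from poems, then train a supervised classifier to pick the most likely author. It assumes a pure-Python multinomial Naive Bayes as a stand-in for one of the six classifiers, counts immediately repeated characters as a rough proxy for the overlapped-word feature, and omits rhyme and genre. The function names, the class `CharNaiveBayes`, the labels `author_A` and `author_B`, and the training strings are all hypothetical placeholders, not the paper's data or code.

```python
import math
from collections import Counter


def is_cjk(ch):
    """True for characters in the basic CJK Unified Ideographs block."""
    return "\u4e00" <= ch <= "\u9fff"


def char_features(poem):
    """Count each Chinese character, ignoring punctuation and whitespace."""
    return Counter(ch for ch in poem if is_cjk(ch))


def overlapped_words(poem):
    """Count immediately repeated characters, a rough proxy for overlapped words."""
    return sum(1 for a, b in zip(poem, poem[1:]) if a == b and is_cjk(a))


class CharNaiveBayes:
    """Multinomial Naive Bayes over character counts (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha      # Laplace smoothing constant (assumed value)
        self.counts = {}        # author -> Counter of character counts
        self.docs = Counter()   # author -> number of training poems
        self.vocab = set()      # all characters seen in training

    def fit(self, poems, authors):
        for poem, author in zip(poems, authors):
            feats = char_features(poem)
            self.counts.setdefault(author, Counter()).update(feats)
            self.docs[author] += 1
            self.vocab.update(feats)
        return self

    def log_scores(self, poem):
        feats = char_features(poem)
        total_docs = sum(self.docs.values())
        vocab_size = len(self.vocab)
        scores = {}
        for author, counts in self.counts.items():
            total_chars = sum(counts.values())
            score = math.log(self.docs[author] / total_docs)  # log prior
            for ch, k in feats.items():
                prob = (counts[ch] + self.alpha) / (total_chars + self.alpha * vocab_size)
                score += k * math.log(prob)                   # log likelihood
            scores[author] = score
        return scores

    def predict(self, poem):
        scores = self.log_scores(poem)
        return max(scores, key=scores.get)


# Invented placeholder strings, NOT poems by the candidate authors.
train_poems = ["春風春雨花開", "春花秋月風雨", "山高水長雲深", "高山流水雲煙"]
train_authors = ["author_A", "author_A", "author_B", "author_B"]

model = CharNaiveBayes().fit(train_poems, train_authors)
print(model.predict("春風花雨"))               # author_A
print(overlapped_words("青青子衿，悠悠我心"))  # 2
```

In a multi-class setting like the one the abstract mentions, the same `predict` call would simply choose among four author labels; a binary setup would instead train one model per candidate against the rest.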



Index Terms

Authorship Attribution of The Golden Lotus Based on Text Classification Methods
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Lexical semantics


Information & Contributors

Information

Published In


ICIAI '19: Proceedings of the 2019 3rd International Conference on Innovation in Artificial Intelligence

March 2019

279 pages

ISBN:9781450361286

DOI:10.1145/3319921

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

In-Cooperation

Xi'an Jiaotong-Liverpool University
University of Texas-Dallas

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

ICIAI 2019: 2019 The 3rd International Conference on Innovation in Artificial Intelligence

March 15 - 18, 2019

Suzhou, China


