Abstract
Authorship attribution (AA), the task of identifying the author of a given text, is an important and widely studied research topic with many applications. Recent work has shown that deep learning methods can achieve significant accuracy improvements on the AA task. Nevertheless, most of the proposed methods represent user posts using a single type of feature (e.g., word bi-grams) and treat the task as text classification. Furthermore, these methods offer very limited explainability of the AA results. In this paper, we address these limitations by proposing DeepStyle, a novel embedding-based framework that learns representations of users’ salient writing styles. We conduct extensive experiments on two real-world datasets from Twitter and Weibo. Our experimental results show that DeepStyle outperforms the state-of-the-art baselines on the AA task.
Notes
- 1. Code implementation: https://gitlab.com/bottle_shop/style/deepstyle.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Hu, Z., Lee, R.KW., Wang, L., Lim, Ep., Dai, B. (2020). DeepStyle: User Style Embedding for Authorship Attribution of Short Texts. In: Wang, X., Zhang, R., Lee, YK., Sun, L., Moon, YS. (eds) Web and Big Data. APWeb-WAIM 2020. Lecture Notes in Computer Science, vol 12318. Springer, Cham. https://doi.org/10.1007/978-3-030-60290-1_17
DOI: https://doi.org/10.1007/978-3-030-60290-1_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-60289-5
Online ISBN: 978-3-030-60290-1
eBook Packages: Computer Science, Computer Science (R0)