DeepStyle: User Style Embedding for Authorship Attribution of Short Texts

Hu, Zhiqiang; Lee, Roy Ka-Wei; Wang, Lei; Lim, Ee-peng; Dai, Bo

doi:10.1007/978-3-030-60290-1_17

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12318)

Included in the following conference series:

Asia-Pacific Web (APWeb) and Web-Age Information Management (WAIM) Joint International Conference on Web and Big Data

Abstract

Authorship attribution (AA), the task of identifying the author of a given text, is an important and widely studied research topic with many applications. Recent work has shown that deep learning methods can achieve significant accuracy improvements on the AA task. Nevertheless, most of the proposed methods represent user posts using a single type of feature (e.g., word bi-grams) and adopt a text classification approach to address the task. Furthermore, these methods offer very limited explainability of the AA results. In this paper, we address these limitations by proposing DeepStyle, a novel embedding-based framework that learns representations of users' salient writing styles. We conduct extensive experiments on two real-world datasets from Twitter and Weibo. Our experimental results show that DeepStyle outperforms the state-of-the-art baselines on the AA task.

Notes

1. Code implementation: https://gitlab.com/bottle_shop/style/deepstyle.
2. https://hub.hku.hk/cris/dataset/dataset107483.

Author information

Authors and Affiliations

University of Electronic Science and Technology of China, Chengdu, China
Zhiqiang Hu & Bo Dai
University of Saskatchewan, Saskatoon, Canada
Roy Ka-Wei Lee
Singapore Management University, Singapore, Singapore
Lei Wang & Ee-peng Lim

Corresponding author

Correspondence to Roy Ka-Wei Lee.

Editor information

Editors and Affiliations

Tianjin University, Tianjin, China
Xin Wang
University of Melbourne, Melbourne, VIC, Australia
Rui Zhang
Kyung Hee University, Yongin, Korea (Republic of)
Young-Koo Lee
Nanjing University of Information Science and Technology, Nanjing, China
Le Sun
Kangwon National University, Chunchon, Korea (Republic of)
Yang-Sae Moon

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 273 KB)

Cite this paper

Hu, Z., Lee, R.KW., Wang, L., Lim, Ep., Dai, B. (2020). DeepStyle: User Style Embedding for Authorship Attribution of Short Texts. In: Wang, X., Zhang, R., Lee, YK., Sun, L., Moon, YS. (eds) Web and Big Data. APWeb-WAIM 2020. Lecture Notes in Computer Science, vol 12318. Springer, Cham. https://doi.org/10.1007/978-3-030-60290-1_17

DOI: https://doi.org/10.1007/978-3-030-60290-1_17
Published: 14 October 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-60289-5
Online ISBN: 978-3-030-60290-1
eBook Packages: Computer Science, Computer Science (R0)
