A Profile-Based Method for Authorship Verification

  • Conference paper
Artificial Intelligence: Methods and Applications (SETN 2014)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8445)

Abstract

Authorship verification is one of the most challenging tasks in style-based text categorization. Given a set of documents, all by the same author, and another document of unknown authorship, the question is whether or not the latter is also by that author. Recently, in the framework of the PAN-2013 evaluation lab, a competition in authorship verification was organized, and the vast majority of submitted approaches, including the best-performing models, followed the instance-based paradigm, where each text sample by one author is treated separately. In this paper, we show that the profile-based paradigm (where all samples by one author are treated cumulatively) can be very effective, surpassing the performance of the PAN-2013 winners without using any information from external sources. The proposed approach is fully trainable, and we demonstrate an appropriate tuning of parameter settings for the PAN-2013 corpora, achieving accurate answers especially when the cost of false negatives is high.
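
To make the profile-based idea concrete, the following is a minimal sketch and not the method of the paper itself. It assumes character 3-gram profiles, the Common n-Gram (CNG) dissimilarity of Keselj et al. (2003) rescaled to [0, 1], and a fixed decision threshold. The function names, the profile size, and the threshold value are all illustrative assumptions. The key step is that all known documents are concatenated into a single author profile before any comparison takes place.

```python
from collections import Counter


def char_ngram_profile(text, n=3, size=500):
    """Return the `size` most frequent character n-grams of `text`,
    mapped to their normalized frequencies."""
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    if not grams:
        return {}
    counts = Counter(grams)
    total = len(grams)
    return {g: c / total for g, c in counts.most_common(size)}


def cng_dissimilarity(p1, p2):
    """CNG dissimilarity over the union of two profiles, divided by its
    maximum (4 per n-gram) so the result lies in [0, 1]."""
    union = set(p1) | set(p2)
    if not union:
        return 0.0
    total = 0.0
    for g in union:
        f1 = p1.get(g, 0.0)
        f2 = p2.get(g, 0.0)
        # f1 + f2 > 0 because g occurs in at least one profile.
        total += ((f1 - f2) / ((f1 + f2) / 2.0)) ** 2
    return total / (4.0 * len(union))


def verify(known_docs, unknown_doc, threshold=0.5, n=3, size=500):
    """Profile-based verification: build ONE profile from all known
    documents together, then compare it with the unknown document.
    The threshold is a hypothetical value that would need tuning."""
    author_profile = char_ngram_profile("\n".join(known_docs), n, size)
    unknown_profile = char_ngram_profile(unknown_doc, n, size)
    score = cng_dissimilarity(author_profile, unknown_profile)
    return score <= threshold, score
```

An instance-based variant would instead build one profile per known document and combine the per-document scores. In this sketch, a lower threshold means fewer positive answers, so the threshold would be chosen on training problems according to the relative cost of false positives and false negatives.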

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Potha, N., Stamatatos, E. (2014). A Profile-Based Method for Authorship Verification. In: Likas, A., Blekas, K., Kalles, D. (eds) Artificial Intelligence: Methods and Applications. SETN 2014. Lecture Notes in Computer Science, vol 8445. Springer, Cham. https://doi.org/10.1007/978-3-319-07064-3_25

  • DOI: https://doi.org/10.1007/978-3-319-07064-3_25

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-07063-6

  • Online ISBN: 978-3-319-07064-3

  • eBook Packages: Computer Science, Computer Science (R0)
