Rethinking the Evaluation Methodology of Authorship Verification Methods

  • Conference paper
Experimental IR Meets Multilinguality, Multimodality, and Interaction (CLEF 2018)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11018)

Abstract

Authorship verification (AV) is the task of judging whether two or more documents were written by the same person. Although research activity has increased in recent years, AV clearly suffers from a lack of well-defined evaluation standards. In a comprehensive literature review of more than 50 research works, including conference papers, journal articles, bachelor’s/master’s theses and doctoral dissertations, we could not identify consistent evaluation procedures that adequately reflect the reliability of AV methods. To counteract this, we propose an alternative evaluation methodology based on the construction of reliable corpora in combination with a more suitable performance measure. In an experimental setup, our approach reveals the weaknesses of a number of existing and successful AV methods, in particular when the goal is to accept as many documents of the true author as possible while at the same time rejecting as many documents of other authors as possible.

Notes

  1. Available under: http://bit.ly/CLEF_2018.

  2. Available under: https://www.cs.cmu.edu/~enron.

  3. In fact, we reimplemented two additional AV approaches [2, 4], but due to reproduction problems we had to discard them.

References

  1. Barbon Jr., S., Igawa, R.A., Bogaz Zarpelão, B.: Authorship verification applied to detection of compromised accounts on online social networks. Multimed. Tools Appl. 76(3), 3213–3233 (2017)

  2. Boukhaled, M.A., Ganascia, J.-G.: Probabilistic anomaly detection method for authorship verification. In: Besacier, L., Dediu, A.-H., Martín-Vide, C. (eds.) SLSP 2014. LNCS (LNAI), vol. 8791, pp. 211–219. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-11397-5_16

  3. Brennan, M.R., Greenstadt, R.: Practical attacks against authorship recognition techniques. In: Haigh, K.Z., Rychtyckyj, N. (eds.) IAAI. AAAI (2009)

  4. Brocardo, M.L., Traore, I., Woungang, I.: Toward a framework for continuous authentication using stylometry. In: 2014 IEEE 28th International Conference on Advanced Information Networking and Applications, pp. 106–115, May 2014

  5. Cappellato, L., Ferro, N., Jones, G.J.F., San Juan, E. (eds.): Working Notes for CLEF 2015 Conference, Toulouse, France, 8–11 September 2015, CEUR Workshop Proceedings, vol. 1391. CEUR-WS.org (2015)

  6. Castro Castro, D., Adame Arcia, Y., Pelaez Brioso, M., Muñoz Guillena, R.: Authorship verification, average similarity analysis. In: Proceedings of the International Conference Recent Advances in Natural Language Processing, pp. 84–90. INCOMA Ltd., Shoumen (2015)

  7. Forner, P., Navigli, R., Tufis, D., Ferro, N. (eds.): Working Notes for CLEF 2013 Conference, Valencia, Spain, 23–26 September 2013, CEUR Workshop Proceedings, vol. 1179. CEUR-WS.org (2014)

  8. Halvani, O., Graner, L., Vogel, I.: Authorship verification in the absence of explicit features and thresholds. In: Pasi, G., Piwowarski, B., Azzopardi, L., Hanbury, A. (eds.) ECIR 2018. LNCS, vol. 10772, pp. 454–465. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-76941-7_34

  9. Halvani, O., Steinebach, M.: An efficient intrinsic authorship verification scheme based on ensemble learning. In: Ninth International Conference on Availability, Reliability and Security, ARES 2014, Fribourg, Switzerland, 8–12 September 2014, pp. 571–578. IEEE Computer Society, Washington, DC (2014)

  10. Halvani, O., Winter, C., Graner, L.: On the usefulness of compression models for authorship verification. In: Proceedings of the 12th International Conference on Availability, Reliability and Security, ARES 2017, pp. 54:1–54:10. ACM, New York (2017)

  11. Halvani, O., Winter, C., Pflug, A.: Authorship verification for different languages, genres and topics. Digit. Investig. 16(S), S33–S43 (2016)

  12. Hürlimann, M., Weck, B., von den Berg, E., Šuster, S., Nissim, M.: GLAD: Groningen lightweight authorship detection. In: Cappellato et al. [5]

  13. Iqbal, F., Khan, L.A., Fung, B.C.M., Debbabi, M.: e-Mail authorship verification for forensic investigation. In: Proceedings of the 2010 ACM Symposium on Applied Computing, SAC 2010, pp. 1591–1598. ACM, New York (2010)

  14. Jankowska, M., Milios, E.E., Keselj, V.: Author verification using common n-gram profiles of text documents. In: Hajic, J., Tsujii, J. (eds.) 25th International Conference on Computational Linguistics, Proceedings of the Conference: Technical Papers, COLING 2014, Dublin, Ireland, 23–29 August 2014, pp. 387–397. ACL (2014)

  15. Noecker Jr., J., Ryan, M.: Distractorless authorship verification. In: Calzolari, N., et al. (eds.) Proceedings of the Eight International Conference on Language Resources and Evaluation (LREC 2012). European Language Resources Association (ELRA), Istanbul, May 2012

  16. Juola, P., Stamatatos, E.: Overview of the author identification task at PAN 2013. In: Forner et al. [7]

  17. Kocher, M., Savoy, J.: A simple and efficient algorithm for authorship verification. J. Assoc. Inf. Sci. Technol. 68(1), 259–269 (2017)

  18. Koppel, M., Schler, J.: Authorship verification as a one-class classification problem. In: Brodley, C.E. (ed.) Machine Learning, Proceedings of the Twenty-First International Conference (ICML 2004), ACM International Conference Proceeding Series, Banff, Alberta, Canada, 4–8 July 2004, vol. 69. ACM (2004)

  19. Koppel, M., Schler, J., Bonchek-Dokow, E.: Measuring differentiability: unmasking pseudonymous authors. J. Mach. Learn. Res. 8, 1261–1276 (2007)

  20. Lewis, D.D., Yang, Y., Rose, T.G., Li, F.: RCV1: a new benchmark collection for text categorization research. J. Mach. Learn. Res. 5, 361–397 (2004)

  21. Meister, J.C. (ed.): Evaluating Unmasking for Cross-Genre Authorship Verification. Hamburg, Germany (2012)

  22. Petmanson, T.: Authorship verification of opinion pieces in Estonian. Eesti Rakenduslingvistika Uhingu Aastaraamat 10, 259–267 (2014)

  23. Potha, N., Stamatatos, E.: A profile-based method for authorship verification. In: Likas, A., Blekas, K., Kalles, D. (eds.) SETN 2014. LNCS (LNAI), vol. 8445, pp. 313–326. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-07064-3_25

  24. Potha, N., Stamatatos, E.: An improved impostors method for authorship verification. In: Jones, G.J.F., et al. (eds.) CLEF 2017. LNCS, vol. 10456, pp. 138–144. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-65813-1_14

  25. Potthast, M., Kiesel, J., Reinartz, K., Bevendorff, J., Stein, B.: A stylometric inquiry into hyperpartisan and fake news. ArXiv e-prints, February 2017

  26. Rexha, A., Kröll, M., Ziak, H., Kern, R.: Extending scientific literature search by including the author’s writing style. In: Mayr, P., Frommholz, I., Cabanac, G. (eds.) Proceedings of the Fifth Workshop on Bibliometric-enhanced Information Retrieval (BIR) Co-located with the 39th European Conference on Information Retrieval (ECIR 2017), Aberdeen, UK, 9th April 2017. CEUR Workshop Proceedings, vol. 1823, pp. 93–100. CEUR-WS.org (2017)

  27. Seidman, S.: Authorship verification using the impostors method notebook for PAN at CLEF 2013. In: Forner et al. [7]

  28. Stamatatos, E., et al.: Overview of the author identification task at PAN 2015. In: Cappellato et al. [5]

  29. Stamatatos, E., et al.: Overview of the author identification task at PAN 2014. In: Cappellato, L., Ferro, N., Halvey, M., Kraaij, W. (eds.) Working Notes for CLEF 2014 Conference, Sheffield, UK, 15–18 September 2014. CEUR Workshop Proceedings, vol. 1180, pp. 877–897. CEUR-WS.org (2014)

  30. Stein, B., Lipka, N., zu Eissen, S.M.: Meta analysis within authorship verification. In: 19th International Workshop on Database and Expert Systems Applications (DEXA 2008), Turin, Italy, 1–5 September 2008, pp. 34–39. IEEE Computer Society (2008)

Acknowledgments

This work was supported by the German Federal Ministry of Education and Research (BMBF) under the project “DORIAN” (Scrutinise and thwart disinformation). We would like to thank our mock reviewer Christian Winter for his valuable comments that helped to improve the quality of this paper.

Author information

Corresponding author

Correspondence to Oren Halvani.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Halvani, O., Graner, L. (2018). Rethinking the Evaluation Methodology of Authorship Verification Methods. In: Bellot, P., et al. Experimental IR Meets Multilinguality, Multimodality, and Interaction. CLEF 2018. Lecture Notes in Computer Science, vol 11018. Springer, Cham. https://doi.org/10.1007/978-3-319-98932-7_4

  • DOI: https://doi.org/10.1007/978-3-319-98932-7_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-98931-0

  • Online ISBN: 978-3-319-98932-7

  • eBook Packages: Computer Science (R0)
