DOI: 10.1145/3180445.3180451
short-paper

On De-anonymization of Single Tweet Messages

Published: 21 March 2018

ABSTRACT

In this work, we address the question of whether the author of a single tweet can be successfully identified, including when the tweet appears in a mixed set with tweets by other authors. We present a new authorship identification scheme for short texts such as tweets, aimed at cases where only single messages are available. The scheme relies on selecting features that work in this special setting and combining them to obtain better accuracy. The technique yields significant results throughout our experiments. Our results can be used to detect the authors of illegitimate tweets, to detect fake tweets in a Twitter account, or to break the privacy of a multi-user account by revealing the authors who participate in it.
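
The idea of combining several feature families for single-message attribution can be illustrated with a minimal sketch. The abstract does not name the features or the classifier used in the paper, so everything below is an assumption chosen for illustration: character trigrams and a few surface counts as the two feature families, per-family normalisation as the combination step, and cosine similarity against per-author profiles as the decision rule. All function names are hypothetical.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Character n-gram counts (assumed feature family, not from the paper)."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def style_features(text):
    """A few surface counts (assumed feature family, not from the paper)."""
    words = text.split()
    return {
        "length": len(text),
        "words": len(words),
        "hashtags": sum(w.startswith("#") for w in words),
        "mentions": sum(w.startswith("@") for w in words),
        "exclaims": text.count("!"),
    }


def normalise(vec):
    """Scale a sparse vector to unit length so no family dominates by magnitude."""
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {k: v / norm for k, v in vec.items()} if norm else {}


def combined_features(text):
    """Concatenate the normalised families under distinct key prefixes."""
    feats = {}
    for prefix, family in (("ng:", char_ngrams(text)), ("st:", style_features(text))):
        for key, value in normalise(family).items():
            feats[prefix + key] = value
    return feats


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def author_profile(tweets):
    """Sum the combined feature vectors of an author's known tweets."""
    total = Counter()
    for tweet in tweets:
        for key, value in combined_features(tweet).items():
            total[key] += value
    return dict(total)


def attribute(tweet, profiles):
    """Return the candidate author whose profile is most similar to the tweet."""
    feats = combined_features(tweet)
    return max(profiles, key=lambda author: cosine(feats, profiles[author]))


# Toy, made-up training data for two hypothetical authors.
training = {
    "alice": ["omg love this!!! #blessed", "omg so happy today!!! #blessed"],
    "bob": [
        "The committee will publish its report on Monday.",
        "The committee has delayed the report until Friday.",
    ],
}
profiles = {author: author_profile(tweets) for author, tweets in training.items()}
print(attribute("omg love today!!! #blessed", profiles))
```

A real evaluation would need many authors, held-out tweets, and a learned classifier; the sketch only shows how separately normalised feature families can be merged into one vector before comparison.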

Published in

IWSPA '18: Proceedings of the Fourth ACM International Workshop on Security and Privacy Analytics
March 2018
72 pages
ISBN: 9781450356343
DOI: 10.1145/3180445

Copyright © 2018 ACM

Publisher: Association for Computing Machinery, New York, NY, United States

Acceptance Rates

IWSPA '18 paper acceptance rate: 4 of 11 submissions, 36%. Overall acceptance rate: 18 of 58 submissions, 31%.
