short-paper

On De-anonymization of Single Tweet Messages

Authors:
Hoi Le
Vietnam National University Hanoi, Hanoi, Vietnam

Reihaneh Safavi-Naini
University of Calgary, Calgary, AB, Canada

IWSPA '18: Proceedings of the Fourth ACM International Workshop on Security and Privacy Analytics, March 2018, Pages 8–14
https://doi.org/10.1145/3180445.3180451

Published: 21 March 2018

ABSTRACT

In this work, we ask whether the author of a single tweet can be identified, including when the tweet appears in a set mixed with tweets by other authors. We present a new authorship identification scheme for short texts such as tweets, designed for the case where only a single message is available. The scheme selects features that work in this setting and combines them to improve accuracy. The technique shows significant results throughout our experiments. Our results can be used to detect the authors of illegitimate tweets, to detect fake tweets in a Twitter account, or to break the privacy of a multi-user account by revealing the authors who participate in it.
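The abstract speaks of selecting features suited to single short messages and combining them. The sketch below illustrates that general idea only; it is not the authors' scheme, whose features and classifier are not given on this page. It assumes two feature families (character trigrams and lowercased word unigrams) and a nearest-profile rule based on cosine similarity. All function names are hypothetical.

```python
from collections import Counter
import math


def char_ngrams(text, n=3):
    """Count overlapping character n-grams of the lowercased text."""
    t = text.lower()
    return Counter(t[i:i + n] for i in range(len(t) - n + 1))


def word_unigrams(text):
    """Count whitespace-separated tokens of the lowercased text."""
    return Counter(text.lower().split())


def features(text):
    """Combine both feature families into one sparse vector.

    Keys are prefixed ("c:" / "w:") so the two families never collide.
    The choice of families is an assumption made for illustration.
    """
    vec = Counter()
    for gram, count in char_ngrams(text).items():
        vec["c:" + gram] += count
    for word, count in word_unigrams(text).items():
        vec["w:" + word] += count
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_profiles(labelled_tweets):
    """Sum the feature vectors of each author's known tweets."""
    profiles = {}
    for author, tweet in labelled_tweets:
        profiles.setdefault(author, Counter()).update(features(tweet))
    return profiles


def attribute(tweet, profiles):
    """Return the author whose profile is most similar to one tweet."""
    vec = features(tweet)
    return max(profiles, key=lambda author: cosine(vec, profiles[author]))
```

To use it, pass a list of `(author, tweet)` pairs to `build_profiles`, then call `attribute` with a single unseen tweet. A real system would need far more data per author and a properly evaluated classifier.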


Index Terms

Security and privacy

Published in
IWSPA '18: Proceedings of the Fourth ACM International Workshop on Security and Privacy Analytics
March 2018
72 pages
ISBN:9781450356343
DOI:10.1145/3180445
General Chair: Rakesh Verma (University of Houston, USA)
Program Chairs: Murat Kantarcioglu (University of Texas - Dallas, USA), Rakesh Verma (University of Houston, USA)
Copyright © 2018 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
anonymity
online social networks
privacy
stylometry
twitter
Acceptance Rates
IWSPA '18 Paper Acceptance Rate: 4 of 11 submissions, 36%
Overall Acceptance Rate: 18 of 58 submissions, 31%