6 million spam tweets: A large ground truth for timely Twitter spam detection

Abstract:

In recent years, Twitter has changed the way people communicate and get news in their daily lives. Because of its popularity, it has also become a main target for spamming activities. To stop spammers, Twitter uses Google SafeBrowsing to detect and block spam links. Although blacklists can block malicious URLs embedded in tweets, their lag time hinders their ability to protect users in real time. Researchers have therefore begun to apply different machine learning algorithms to detect Twitter spam. However, there has been no comprehensive evaluation of each algorithm's performance for real-time Twitter spam detection, owing to the lack of a large ground truth. To carry out a thorough evaluation, we collected a large dataset of over 600 million public tweets. We further labelled around 6.5 million spam tweets and extracted 12 lightweight features that can be used for online detection. In addition, we conducted a number of experiments on six machine learning algorithms under various conditions to better understand their effectiveness and weaknesses for timely Twitter spam detection. We will make our labelled dataset available to researchers who are interested in validating or extending our work.

Published in: 2015 IEEE International Conference on Communications (ICC)

Date of Conference: 08-12 June 2015

Date Added to IEEE Xplore: 10 September 2015

DOI: 10.1109/ICC.2015.7249453

Conference Location: London, UK
