Person Versus Non-person Classification of Twitter Handle

Budania, Himanshu; Singh, Pramod Kumar

doi:10.1007/978-3-319-76351-4_11


Conference paper
First Online: 16 March 2018

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 734)

Abstract

As marketing based on social media profiles grows rapidly, verifying the authenticity of social media accounts is vital for brands. This paper addresses the task of user classification on the microblogging platform Twitter. We aim to identify whether a Twitter handle belongs to a real person. This is done in two steps. First, we separate human handles from bot handles and discard the latter. Second, we classify each human handle as a real person or a non-person, e.g., an organization. For the first step, we use a Twitterati identification system [16], which computes various statistical measures from the tweets and uses them to separate human and bot handles. For the second step, we use two widely used and well-performing classifiers, linear regression (LR) and support vector machine (SVM), to classify human handles as real person or non-person. We find that SVM outperforms LR. Moreover, the performance of SVM (F1-score = 0.9310) indicates that the proposed method may be usable in real-life applications.
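The two-step procedure and the F1-score reported above can be illustrated with a minimal sketch. It is not the authors' implementation: the label names, the feature names (`tweets_per_day`, `first_person_ratio`), the thresholds, and the two `toy_*` functions are hypothetical stand-ins for the bot detector of [16] and for a trained SVM or LR model. Only the cascade structure and the F1 formula (harmonic mean of precision and recall) are taken as given.

```python
from typing import Callable, Dict, Sequence

Features = Dict[str, float]

# Label names are illustrative; the abstract does not name them.
BOT, PERSON, NON_PERSON = "bot", "person", "non-person"


def classify_handle(
    features: Features,
    is_bot: Callable[[Features], bool],
    is_person: Callable[[Features], bool],
) -> str:
    """Step 1 discards bots; step 2 splits the remaining human handles."""
    if is_bot(features):
        return BOT
    return PERSON if is_person(features) else NON_PERSON


def f1_score(y_true: Sequence[str], y_pred: Sequence[str], positive: str = PERSON) -> float:
    """Harmonic mean of precision and recall for the `positive` label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical stand-ins for the two models (invented features and thresholds).
def toy_is_bot(f: Features) -> bool:
    return f["tweets_per_day"] > 500


def toy_is_person(f: Features) -> bool:
    return f["first_person_ratio"] > 0.1


handles = [
    {"tweets_per_day": 12, "first_person_ratio": 0.30},
    {"tweets_per_day": 900, "first_person_ratio": 0.00},
    {"tweets_per_day": 40, "first_person_ratio": 0.02},
]
predictions = [classify_handle(h, toy_is_bot, toy_is_person) for h in handles]
```

In practice the two callables would wrap fitted models, and `f1_score` would be evaluated on held-out labelled handles rather than on toy data.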

References

1. Bontcheva, K., Derczynski, L., Funk, A., Greenwood, M., Maynard, D., Aswani, N.: TwitIE: an open-source information extraction pipeline for microblog text. In: Proceedings of the International Conference on Recent Advances in Natural Language Processing. Association for Computational Linguistics (2013)
2. Bruce, R.F., Wiebe, J.M.: Recognizing subjectivity: a case study in manual tagging. Nat. Lang. Eng. 5(2), 187–205 (1999)
3. Chu, Z., Gianvecchio, S., Wang, H., Jajodia, S.: Detecting automation of twitter accounts: Are you a human, bot, or cyborg? IEEE Trans. Dependable Secur. Comput. 9(6), 811–824 (2012)
4. Eberhardt, J.J.: Bayesian spam detection. Sch. Horiz. Univ. Minn. Morris Undergraduate J. 2(1) (2015)
5. Gunn, S.R.: Support vector machines for classification and regression. Isis technical report, School of Electronics and Computer Science, University of Southampton (1998)
6. Han, J., Kamber, M., Pei, J.: Data Mining: Concepts and Techniques, 3rd edn. Morgan Kaufmann (2011)
7. Kamal, A.: Subjectivity classification using machine learning techniques for mining feature-opinion pairs from web opinion sources (2013). http://arxiv.org/abs/1312.6962
8. Klout: Be known for what you love (2014). https://klout.com/home/
9. Lee, A.J., Seber, G.A.F.: Linear Regression Analysis, 2nd edn. Wiley, Hoboken (2003)
10. McCord, M., Chuah, M.: Spam detection on twitter using traditional classifiers. In: Proceedings of the International Conference on Autonomic and Trusted Computing (ATC), pp. 175–186. Springer, Berlin, Heidelberg (2011)
11. Quinlan, R.: C4.5, March 2014. https://ww.mgt.ncu.edu.tw/wabble/School/C45.ppt
12. Riloff, E.M., Phillips, W.: Introduction to the sundance and autoslog systems. Technical Report UUCS-04-015, 1-47.7, School of Computing: The University of Utah (2004)
13. Tao, K., Abel, F., Hauff, C., Houben, G.J., Gadiraju, U.: Groundhog day: near-duplicate detection on twitter. In: Proceedings of the 22nd International Conference on World Wide Web (WWW), pp. 1273–1284. ACM, New York (2013)
14. Weka: Data mining with open source machine learning (2014). https://www.cs.waikato.ac.nz/ml/weka
15. Wiebe, J., Riloff, E.: Creating subjective and objective sentence classifiers from unannotated texts. In: Proceedings of the 6th International Conference on Computational Linguistics and Intelligent Text Processing (CICLing), pp. 486–497 (2005)
16. Winnie Main, N.S.: Twitterati identification system. In: Proceedings of the International Conference on Advanced Computing Technologies and Applications (ICACTA), vol. 45, pp. 32–41 (2015)
17. Yang, C., Harkreader, R., Gu, G.: Empirical evaluation and new design for fighting evolving twitter spammers. Trans. Info. For. Sec. 8(8), 1280–1293 (2013)

Author information

Authors and Affiliations

Computational Intelligence and Data Mining Research Laboratory, ABV – Indian Institute of Information Technology and Management Gwalior, Gwalior, MP, India
Himanshu Budania & Pramod Kumar Singh

Corresponding author

Correspondence to Himanshu Budania.

Editor information

Editors and Affiliations

Machine Intelligence Research Labs, Auburn, WA, USA
Ajith Abraham
Department of Computer Science, South Asian University, Chanakyapuri, Delhi, India
Pranab Kr. Muhuri
Faculty of Information and Communication Technology, Universiti Teknikal Malaysia Melaka, Durian Tunggal, Melaka, Malaysia
Azah Kamilah Muda
Machine Intelligence Research Labs, Auburn, WA, USA
Niketa Gandhi

Cite this paper

Budania, H., Singh, P.K. (2018). Person Versus Non-person Classification of Twitter Handle. In: Abraham, A., Muhuri, P., Muda, A., Gandhi, N. (eds) Hybrid Intelligent Systems. HIS 2017. Advances in Intelligent Systems and Computing, vol 734. Springer, Cham. https://doi.org/10.1007/978-3-319-76351-4_11

DOI: https://doi.org/10.1007/978-3-319-76351-4_11
Published: 16 March 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-76350-7
Online ISBN: 978-3-319-76351-4
eBook Packages: Engineering, Engineering (R0)
