Abstract
During the past decade, microblog services have been used extensively by millions of business and private users as one of the most powerful information broadcasting tools. Twitter, for example, has attracted many social science researchers because of its high popularity, its constrained format of thought expression, and its ability to reflect actual trends. However, unstructured data from microblogs often suffer from a lack of representativeness because of the large amount of noise they contain. Such noise is often introduced by the activity of organizational and fake user accounts, which may not be useful in many application domains. To tackle this information filtering problem, in this paper we classify Twitter accounts into three categories: “Personal”, “Organization”, and “Personage”. Specifically, we use various text-based data representation approaches to extract features for our proposed microblog account type prediction framework, “POP-MAP”. To study the problem at a cross-language level, we harvested and learned from a multilingual Twitter dataset, which allows us to achieve better classification performance than various state-of-the-art baselines.
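As a rough illustration of what text-based account type prediction can look like, the following minimal sketch trains a bag-of-words multinomial Naive Bayes classifier over the three categories named in the abstract. It is not the POP-MAP framework: the abstract does not specify its features or learner, so the classifier, the `NaiveBayesAccountClassifier` class, the `tokenize` helper, and the toy training texts are all assumptions made purely for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split into word tokens (\\w is Unicode-aware in Python 3)."""
    return re.findall(r"\w+", text.lower())


class NaiveBayesAccountClassifier:
    """Hypothetical bag-of-words multinomial Naive Bayes with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                       # smoothing strength
        self.word_counts = defaultdict(Counter)  # label -> token frequencies
        self.class_counts = Counter()            # label -> number of accounts
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.class_counts[label] += 1
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label in self.class_counts:
            counts = self.word_counts[label]
            total = sum(counts.values())
            # log prior + sum of smoothed log likelihoods of each token
            score = math.log(self.class_counts[label] / total_docs)
            for tok in tokens:
                score += math.log(
                    (counts[tok] + self.alpha) / (total + self.alpha * vocab_size)
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy data: one string stands in for the concatenated tweets of an account.
train_texts = [
    "had coffee with my friends today feeling tired",
    "my cat woke me up again lol",
    "we are hiring join our team apply on our website",
    "our store opens new office visit our website for offers",
    "official account of the singer new album tour dates announced",
    "official page tour tickets album out now thank you fans",
]
train_labels = [
    "Personal", "Personal",
    "Organization", "Organization",
    "Personage", "Personage",
]

model = NaiveBayesAccountClassifier().fit(train_texts, train_labels)
print(model.predict("we are hiring visit our website"))
```

A real system would need far more data per class and language-specific preprocessing; the sketch only shows how account text can be mapped to one of the three labels.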
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Samborskii, I., Filchenkov, A., Korneev, G., Farseev, A. (2019). Person, Organization, or Personage: Towards User Account Type Prediction in Microblogs. In: Chugunov, A., Misnikov, Y., Roshchin, E., Trutnev, D. (eds) Electronic Governance and Open Society: Challenges in Eurasia. EGOSE 2018. Communications in Computer and Information Science, vol 947. Springer, Cham. https://doi.org/10.1007/978-3-030-13283-5_9
DOI: https://doi.org/10.1007/978-3-030-13283-5_9
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-13282-8
Online ISBN: 978-3-030-13283-5
eBook Packages: Computer Science, Computer Science (R0)