Abstract
Can we detect anomalies and abuse among users of commenting platforms? Commenting has become a significant online activity, and specialized platforms provide commenting capability to many popular websites, such as Huffington Post. These platforms host a new type of online social interaction, but they have received very little attention. We conduct an extensive study of 19M comments from Disqus, one of the largest commenting platforms. Our work consists of two thrusts: (a) we identify features and patterns of commenting behavior, and (b) we detect peculiar and parasitic users. First, we study and evaluate features of user behavior that capture different aspects: user-user interaction (“social”), user-article interaction (“engagement”), and temporal properties. We also develop a method, which we call DownTimeFinder, to determine users’ downtime (think night-time) in their daily behavior, which helps identify three major groups of users based on their utilization (3, 9, and 15 h of up-time). Second, we identify surprising and abnormal behaviors using our features. Interestingly, we find: (a) two tightly collaborative groups of at least 29 users each that seem to be promoting the same ideas, (b) 38 users whose behavior points to spamming and trolling activities, and (c) 19 different instances where Disqus is used as a chat room. The goal of our work is to highlight commenting platforms as an ignored, but information-rich, online activity.
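The abstract does not describe how DownTimeFinder works, so the following is only a minimal illustrative sketch of the general idea of locating a daily downtime window, not the authors' algorithm. It assumes each comment is reduced to its hour of day (0 to 23), treats an hour as quiet when it holds less than a hypothetical `threshold` share of the user's comments, and returns the longest run of quiet hours, allowing the run to wrap past midnight. The function name `longest_quiet_window` and the threshold value are assumptions made for illustration.

```python
from collections import Counter
from typing import Iterable, Tuple


def longest_quiet_window(hours: Iterable[int], threshold: float = 0.02) -> Tuple[int, int]:
    """Return (start_hour, length) of the longest circular run of quiet hours.

    Illustrative heuristic only (not the paper's DownTimeFinder): an hour is
    "quiet" when its share of the user's comments is below `threshold`.
    """
    counts = Counter(h % 24 for h in hours)
    total = sum(counts.values())
    if total == 0:
        return (0, 24)  # no activity at all: the whole day is downtime

    quiet = [counts.get(h, 0) / total < threshold for h in range(24)]
    if all(quiet):
        return (0, 24)

    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    # Scan two laps of the clock so a run that wraps past midnight is counted whole.
    for i in range(48):
        h = i % 24
        if quiet[h]:
            if run_len == 0:
                run_start = h
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0
    return (best_start, best_len)


# Example: a user who comments only between 08:00 and 22:59.
start, length = longest_quiet_window(list(range(8, 23)) * 3)
uptime_hours = 24 - length
```

Under this heuristic, the example user has a downtime window starting at hour 23 and lasting 9 hours, which corresponds to 15 h of up-time.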
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Li, TC., Mueen, A., Faloutsos, M., Hang, H. (2016). Comment-Profiler: Detecting Trends and Parasitic Behaviors in Online Comments. In: Spiro, E., Ahn, YY. (eds) Social Informatics. SocInfo 2016. Lecture Notes in Computer Science, vol 10046. Springer, Cham. https://doi.org/10.1007/978-3-319-47880-7_5
DOI: https://doi.org/10.1007/978-3-319-47880-7_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-47879-1
Online ISBN: 978-3-319-47880-7
eBook Packages: Computer Science, Computer Science (R0)