Abstract
We propose an approach for real-time sentiment-based anomaly detection (RSAD) in Twitter data streams. Sentiment classification is used to split the data into independent streams (positive, neutral, and negative), which are then analyzed for anomalous spikes in the number of tweets. Four approaches for evaluating the data streams are studied, along with the parameters that adjust their sensitivity. Results from an evaluation show the effectiveness of a probabilistic exponentially weighted moving average (PEWMA) coupled with a sliding window that uses median absolute deviation (MAD).
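The abstract does not spell out how PEWMA and the MAD-based sliding window are combined, so the following is only a minimal sketch of one plausible arrangement, not the authors' implementation. It assumes the PEWMA update from Carter and Streilein's formulation (a smoothing weight reduced in proportion to the probability of the current observation), and it assumes that a spike is flagged when the forecast error exceeds a multiple of the scaled MAD of recent errors. The class name `PewmaMadDetector` and all parameter values (`alpha`, `beta`, `training`, `window`, `k`) are hypothetical.

```python
import math
from collections import deque
from statistics import median


class PewmaMadDetector:
    """Hypothetical spike detector: PEWMA forecast plus a MAD test on recent errors."""

    def __init__(self, alpha=0.97, beta=0.5, training=10, window=20, k=3.0):
        self.alpha = alpha        # base weight on history (assumed value)
        self.beta = beta          # how much a point's probability modulates the weight (assumed)
        self.training = training  # initial points averaged with equal weight
        self.k = k                # threshold in scaled-MAD units (assumed)
        self.errors = deque(maxlen=window)  # sliding window of forecast errors
        self.t = 0
        self.s1 = 0.0             # weighted mean of x
        self.s2 = 0.0             # weighted mean of x squared

    def update(self, x):
        """Consume one per-interval tweet count; return True for an upward spike."""
        self.t += 1
        if self.t == 1:
            self.s1, self.s2 = float(x), float(x) * x
            return False

        mean = self.s1
        std = math.sqrt(max(self.s2 - self.s1 ** 2, 0.0))
        error = x - mean

        # MAD test, applied only once the sliding window is full.
        is_spike = False
        if len(self.errors) == self.errors.maxlen:
            med = median(self.errors)
            mad = 1.4826 * median([abs(e - med) for e in self.errors])
            is_spike = mad > 0 and (error - med) > self.k * mad
        self.errors.append(error)

        # PEWMA update: equal weighting during training, probability-adjusted after.
        if self.t <= self.training:
            a = 1.0 - 1.0 / self.t
        else:
            if std > 0:
                z = error / std
                p = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
            else:
                p = 0.0  # no spread estimate yet; fall back to the base weight
            a = self.alpha * (1.0 - self.beta * p)
        self.s1 = a * self.s1 + (1.0 - a) * x
        self.s2 = a * self.s2 + (1.0 - a) * x * x
        return is_spike


# Synthetic counts for one sentiment stream: a steady pattern, then a burst.
detector = PewmaMadDetector()
counts = [100 + (i % 5) for i in range(60)] + [300]
flags = [detector.update(c) for c in counts]
print(flags[-1])
```

Under the stream split described in the abstract, one such detector would be kept per sentiment class (positive, neutral, negative) and fed that class's tweet count for each time interval. Note that this sketch cannot flag anything while the windowed MAD is zero, for example on a perfectly constant stream; a real system would need a fallback for that case.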
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Patel, K., Hoeber, O., Hamilton, H.J. (2015). Real-Time Sentiment-Based Anomaly Detection in Twitter Data Streams. In: Barbosa, D., Milios, E. (eds) Advances in Artificial Intelligence. Canadian AI 2015. Lecture Notes in Computer Science, vol 9091. Springer, Cham. https://doi.org/10.1007/978-3-319-18356-5_17
DOI: https://doi.org/10.1007/978-3-319-18356-5_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18355-8
Online ISBN: 978-3-319-18356-5
eBook Packages: Computer Science, Computer Science (R0)