An innovative user-attentive framework for supporting real-time detection and mining of streaming microblog posts

  • Methodologies and Application
Soft Computing

Abstract

In this paper, we present a modular system that catches the attention of a new user and detects, in real time, events and the emotions related to them in a stream of microblog posts. The system performs social sensing by exploiting the information that emerges on the Internet through user-generated content, and it is equipped with a conversational engine that manages the interaction with the human user. The whole approach can be applied by either a human user or a robot; the robotic setting remains a future application to be developed further within the proposed system.

Acknowledgements

The authors would like to thank Diego Terrana and Diego Frias for their contributions to the earlier work that led to this paper.

Author information

Corresponding author

Correspondence to A. Cuzzocrea.

Ethics declarations

Conflict of interest

All authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Communicated by V. Loia.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Cuzzocrea, A., Pilato, G. An innovative user-attentive framework for supporting real-time detection and mining of streaming microblog posts. Soft Comput 24, 9663–9682 (2020). https://doi.org/10.1007/s00500-019-04478-2

  • DOI: https://doi.org/10.1007/s00500-019-04478-2
