Abstract
In this paper, data concerning MILANO EXPO2015 are collected from the official Twitter page of the event, before and after its opening. To extract a semi-supervised ontology and to evaluate the global sentiment around the event, a variety of language processing techniques are applied to the collected tweets: Latent Semantic Analysis and sentiment polarity tracking, together with gap analysis, allow the semantic evaluation of users’ opinions. Moreover, the generalized cross entropy approach is applied for the first time to web data, adding prior information on the effect of semantic classes on the global sentiment, which improves accuracy and adds detail to the analysis.

Notes
‘Fuzzy’ is here intended as embedding concepts from more than one cluster at once.
Remember that our considerations are restricted to the case where no fuzzy k-means clustering is applied. When fuzzy clustering is applied, further considerations are needed: the most straightforward option would be to keep the same general approach, but here it should be: \(C' \notin \mathcal{R}^{n_o \times n_c}\).
This matrix is actually a vector, because there is only one document: the query itself.
See https://dev.twitter.com/streaming/public for more information.
Available at https://store.continuum.io/cshop/anaconda/.
Ethics declarations
Conflict of interest
The authors declare that there is no conflict of interest.
Human and animal rights
This article does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Communicated by Massimo Squillante.
About this article
Cite this article
Corallo, A., Fortunato, L., Massafra, A. et al. Sentiment analysis of expectation and perception of MILANO EXPO2015 in twitter data: a generalized cross entropy approach. Soft Comput 24, 13597–13607 (2020). https://doi.org/10.1007/s00500-019-04368-7