Sentiment Analysis Through Finite State Automata

Pelosi, Serena; Maisto, Alessandro; Melillo, Lorenza; Elia, Annibale

doi:10.1007/978-3-031-24340-0_14

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13452)

Included in the following conference series:

International Conference on Computational Linguistics and Intelligent Text Processing

Abstract

The present research aims to demonstrate how powerful Finite State Automata (FSA) can be in a domain in which the vagueness of human opinions and the subjectivity of user-generated content make the automatic “understanding” of texts extremely hard. Assuming that the semantic orientation of sentences is based on the manipulation of sentiment words, we built from scratch, for the Italian language, a network of local grammars for the annotation of sentiment expressions and electronic dictionaries for the classification of more than 15,000 opinionated words. In the paper we explain in detail how we used FSA for both the automatic population of sentiment lexicons and the sentiment classification of real sentences.
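
As a rough illustration of the idea summarized above, a dictionary of opinionated words combined with a small finite-state automaton can be sketched in Python as follows. Everything in this sketch is an assumption made for illustration: the lexicon entries and their scores, the state names, and the treatment of negation as a plain sign switch are hypothetical and are not taken from the paper's dictionaries or local grammars.

```python
# Toy electronic dictionary: entries and scores are illustrative only,
# placed on a -3..+3 evaluation scale.
POLARITY = {"bello": 2, "eccellente": 3, "brutto": -2, "orrendo": -3}
INTENSIFIERS = {"molto"}   # assumed to strengthen the next polar word
NEGATIONS = {"non"}        # assumed to switch the sign of the next polar word

# Transition table of the automaton: (state, token class) -> next state.
# Any pair not listed here sends the automaton back to START.
TRANSITIONS = {
    ("START", "NEG"): "NEGATED",
    ("START", "INT"): "INTENSIFIED",
}


def token_class(token):
    """Map a token to the input symbol read by the automaton."""
    if token in NEGATIONS:
        return "NEG"
    if token in INTENSIFIERS:
        return "INT"
    if token in POLARITY:
        return "POL"
    return "OTHER"


def annotate(sentence):
    """Return (word, score) pairs for the polar words found in a sentence."""
    state = "START"
    matches = []
    for raw in sentence.lower().split():
        token = raw.strip(".,;:!?")
        cls = token_class(token)
        if cls == "POL":
            score = POLARITY[token]
            if state == "INTENSIFIED":
                score += 1 if score > 0 else -1   # one step stronger
            elif state == "NEGATED":
                score = -score                    # polarity switch
            matches.append((token, max(-3, min(3, score))))
        state = TRANSITIONS.get((state, cls), "START")
    return matches


print(annotate("Il film è molto bello"))
print(annotate("Un film non bello"))
```

In this sketch a modifier only affects a polar word that immediately follows it, so a sentence such as "non è bello" is not treated as negated; a realistic grammar would need more states and paths to cover such cases.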

Notes

1.
Annibale Elia, Lorenza Melillo and Alessandro Maisto worked on the Conclusion of the paper, while Serena Pelosi worked on the Introduction and Paragraphs 1, 2, 3 and 4.
2.
We chose a rule-based method, among others, in order to verify the hypothesis that words can be classified together in accordance with both semantic and syntactic criteria.
3.
Word similarity is a frequently used method for dictionary propagation in thesaurus-based approaches. Examples are the Maryland dictionary, created from a Roget-like thesaurus and a handful of affixes [48], and other lexicons based on WordNet, such as SentiWordNet, built on the basis of a quantitative analysis of the glosses associated with synsets [17, 18], or other lexicons based on computing a distance measure on WordNet [17, 34].
4.
Seed words are words which are strongly associated with a positive/negative meaning, such as eccellente (“excellent”) or orrendo (“horrible”), by which it is possible to build a bigger lexicon, detecting other words that frequently occur alongside them.
5.
http://valeriobasile.github.io/twita/downloads.html.
6.
https://www.celi.it/.
7.
http://ai-applied.nl/sentiment-analysis-api.
8.
http://hdl.handle.net/20.500.11752/ILC-73.
9.
Although WordNet does not include semantic orientation information for its lemmas, semantic relations such as synonymy or antonymy are commonly used to automatically propagate polarity, starting from a manually annotated set of seed words [2, 13, 18, 28, 31, 34, 39, 45]. This approach presents some drawbacks, such as the lack of scalability, the unavailability of sufficient resources for many languages and the difficulty of handling newly coined words, which are not yet contained in the thesauri.
10.
Morphemes allow not only the propagation of a given word polarity (e.g. en-, -ous, -fy), but also its switching (e.g. dis-, -less), its intensification (e.g. super-, over-) and its weakening (e.g. semi-) [54].
11.
While compiling the dictionary, the judgment on each word's “prior polarity” is given without considering any textual context. The entries of the sentiment dictionary receive the same annotation and are then grouped together if they possess the same semantic orientation. Prior polarity [56] refers to an individual word's Semantic Orientation (SO) and differs from the SO in that it is always independent of the context.
12.
Local grammars are algorithms that, through grammatical, morphological and lexical instructions, are used to formalize linguistic phenomena and to parse texts. They are called “local” because, despite any generalization, they can be used only in the description and analysis of limited linguistic phenomena.
13.
The main difference between the words listed in the two scales is the possibility to use them as indicators for the subjectivity detection: basically, the words belonging to the evaluation scale are “anchors” that begin the identification of polarized phrases or sentences, while the ones belonging to the strength scale are just used as intensity modifiers (see Paragraph 5.3).
14.
Available for consultation at http://dsc.unisa.it/composti/tavole/combo/tavole.asp.
15.
The morphological method could also be applied to Italian verbs, but we chose to avoid this solution because of the complexity of their argument structures. We decided, instead, to manually evaluate all the verbs described in the Italian Lexicon-grammar binary tables, so that we could preserve the different lexical, syntactic and transformational rules connected to each of them [16].
16.
The meaning of the deadjectival adverbs in -mente is not always predictable from the base adjectives from which they are derived. The syntactic structures in which they occur also influence their interpretation. Depending on their position in sentences, the deadjectival adverbs can be described as adjective modifiers (e.g. altamente “highly”), predicate modifiers (e.g. perfettamente “perfectly”) or sentence modifiers (e.g. ultimamente “lately”).
17.
Metanodes are labeled through the six corresponding values of the evaluation scale, which goes from –3 to +3.
18.
The dataset contains Italian opinionated texts in the form of user reviews and comments from e-commerce and opinion websites; it lists 600 text units (50 positive and 50 negative for each product class) and refers to six different domains, for all of which different websites (such as www.ciao.it; www.amazon.it; www.mymovies.it; www.tripadvisor.it) have been exploited [44].
19.
Other idioms included in our resources are of the kind N0 essere (Agg + Ppass) Prep C1 (e.g. Max è matto da legare, “Max is so crazy he should be locked up”); N0 essere Agg e Agg (e.g. Max è bello e fritto, “Max is cooked”); C0 essere Agg (come C1 + E) (e.g. Mary ha la coscienza sporca \(\leftrightarrow \) La coscienza è sporca, “Mary has a guilty conscience” \(\leftrightarrow \) “The conscience is guilty”), N0 essere C1 Agg (e.g. Mary è una gatta morta, “Mary is a cock tease”).
20.
Words that, at first glance, seem to be intensifiers but, on deeper analysis, reveal a more complex behavior are abbastanza “enough”, troppo “too much” and poco “not much”.
In this research we also noticed that the co-occurrence of troppo, poco and abbastanza with polar lexical items can produce, in their semantic orientation, effects comparable to those of other contextual valence shifters. The ad hoc rules dedicated to these words (see Table ??) are not actually new, but refer to other contextual valence shifting rules that have been discussed in this Paragraph.
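
Note 10 lists four effects that morphemes can have on the polarity of a base word: propagation, switching, intensification and weakening. A minimal Python sketch of such affix rules, using the English affixes quoted in that note, could look like the following. The base lexicon, its scores, the one-step size of intensification and weakening, and the function names are hypothetical choices made for illustration, not the authors' actual rules.

```python
# Hypothetical base lexicon with prior polarities on a -3..+3 scale.
BASE = {"honest": 2, "cool": 2, "sweet": 2, "harm": -2, "joy": 2}

# Affix -> operation, following the four effects listed in note 10.
PREFIXES = {"dis": "switch", "super": "intensify",
            "over": "intensify", "semi": "weaken"}
SUFFIXES = {"less": "switch", "ous": "propagate"}


def apply_rule(operation, score):
    """Apply one morphological operation to a base polarity score."""
    step = 1 if score > 0 else -1
    if operation == "switch":
        return -score
    if operation == "intensify":
        return max(-3, min(3, score + step))
    if operation == "weaken":
        return score - step
    return score  # "propagate": the derived word keeps the base polarity


def derive(word):
    """Return a polarity for a word, or None if no rule applies."""
    if word in BASE:
        return BASE[word]
    for prefix, operation in PREFIXES.items():
        stem = word[len(prefix):]
        if word.startswith(prefix) and stem in BASE:
            return apply_rule(operation, BASE[stem])
    for suffix, operation in SUFFIXES.items():
        stem = word[:-len(suffix)]
        if word.endswith(suffix) and stem in BASE:
            return apply_rule(operation, BASE[stem])
    return None


for w in ["dishonest", "supercool", "semisweet", "harmless", "joyous"]:
    print(w, derive(w))
```

The sketch only handles plain concatenation of an affix and a stem; real derivational morphology also involves spelling changes and part-of-speech constraints, which would require richer rules.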

References

Andreevskaia, A., Bergler, S.: When specialists and generalists work together: overcoming domain dependence in sentiment tagging. In: ACL, pp. 290–298 (2008)
Argamon, S., Bloom, K., Esuli, A., Sebastiani, F.: Automatically determining attitude type and force for sentiment analysis, pp. 218–231 (2009)
Balibar-Mrabti, A.: Une étude de la combinatoire des noms de sentiment dans une grammaire locale. Langue française, pp. 88–97 (1995)
Baroni, M., Vegnaduzzo, S.: Identifying subjective adjectives through web-based mutual information, vol. 4, pp. 17–24 (2004)
Basile, V., Nissim, M.: Sentiment analysis on Italian tweets. In: Proceedings of the 4th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, pp. 100–107 (2013)
Benamara, F., Cesarano, C., Picariello, A., Recupero, D.R., Subrahmanian, V.S.: Sentiment analysis: adjectives and adverbs are better than adjectives alone. In: ICWSM (2007)
Benamara, F., Chardon, B., Mathieu, Y., Popescu, V., Asher, N.: How do negation and modality impact on opinions? pp. 10–18 (2012)
Bolioli, A., Salamino, F., Porzionato, V.: Social media monitoring in real life with blogmeter platform. In: ESSEM@AI*IA 1096, 156–163 (2013)
Dang, Y., Zhang, Y., Chen, H.: A lexicon-enhanced method for sentiment classification: An experiment on online product reviews. In: Intelligent Systems, IEEE, vol. 25, pp. 46–53. IEEE (2010)
Dasgupta, S., Ng, V.: Mine the easy, classify the hard: a semi-supervised approach to automatic sentiment classification. In: Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, vol. 2, pp. 701–709. Association for Computational Linguistics (2009)
De Mauro, T.: Dizionario italiano. Paravia, Torino (2000)
Di Gennaro, P., Rossi, A., Tamburini, F.: The ficlit+ cs@ unibo system at the evalita 2014 sentiment polarity classification task. In: Proceedings of the Fourth International Workshop EVALITA 2014 (2014)
Dragut, E.C., Yu, C., Sistla, P., Meng, W.: Construction of a sentimental word dictionary, pp. 1761–1764 (2010)
Elia, A.: Le verbe italien. Les complétives dans les phrases à un complément (1984)
Elia, A.: Chiaro e tondo: Lessico-Grammatica degli avverbi composti in italiano. Segno Associati (1990)
Elia, A., Martinelli, M., D’Agostino, E.: Lessico e Strutture sintattiche. Introduzione alla sintassi del verbo italiano. Liguori, Napoli (1981)
Esuli, A., Sebastiani, F.: Determining the semantic orientation of terms through gloss classification. In: Proceedings of the 14th ACM International Conference on Information and Knowledge Management, pp. 617–624. ACM (2005)
Esuli, A., Sebastiani, F.: Determining term subjectivity and term orientation for opinion mining vol. 6, p. 2006 (2006)
Esuli, A., Sebastiani, F.: SentiWordNet: a publicly available lexical resource for opinion mining. In: Proceedings of LREC, vol. 6, pp. 417–422 (2006)
Fellbaum, C.: WordNet. Wiley Online Library (1998)
Gaeta, L.: Nomi d’azione. La formazione delle parole in italiano. Tübingen: Max Niemeyer Verlag, pp. 314–351 (2004)
Gamon, M., Aue, A.: Automatic identification of sentiment vocabulary: exploiting low association with known sentiment terms, pp. 57–64 (2005)
Ganapathibhotla, M., Liu, B.: Mining opinions in comparative sentences. In: Proceedings of the 22nd International Conference on Computational Linguistics, vol. 1. pp. 241–248. Association for Computational Linguistics (2008)
Goldberg, A.B., Zhu, X.: Seeing stars when there aren’t many stars: graph-based semi-supervised learning for sentiment categorization. In: Proceedings of the First Workshop on Graph Based Methods for Natural Language Processing, pp. 45–52. Association for Computational Linguistics (2006)
Gross, M.: Les bases empiriques de la notion de prédicat sémantique. Langages, pp. 7–52 (1981)
Gross, M.: Les phrases figées en français. In: L’information grammaticale, vol. 59, pp. 36–41. Peeters (1993)
Gross, M.: Une grammaire locale de l’expression des sentiments. Langue française, pp. 70–87 (1995)
Hassan, A., Radev, D.: Identifying text polarity using random walks, pp. 395–403 (2010)
Hatzivassiloglou, V., McKeown, K.R.: Predicting the semantic orientation of adjectives. In: Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics, pp. 174–181. Association for Computational Linguistics (1997)
Hernandez-Farias, I., Buscaldi, D., Priego-Sánchez, B.: Iradabe: adapting English lexicons to the Italian sentiment polarity classification task. In: First Italian Conference on Computational Linguistics (CLiC-it 2014) and the fourth International Workshop EVALITA2014, pp. 75–81 (2014)
Hu, M., Liu, B.: Mining and summarizing customer reviews, pp. 168–177 (2004)
Iacobini, C.: Prefissazione. La formazione delle parole in italiano. Tübingen: Max Niemeyer Verlag, pp. 97–161 (2004)
Kaji, N., Kitsuregawa, M.: Building lexicon for sentiment analysis from massive collection of html documents. In: EMNLP-CoNLL, pp. 1075–1083 (2007)
Kamps, J., Marx, M., Mokken, R.J., De Rijke, M.: Using wordnet to measure semantic orientations of adjectives (2004)
Kanayama, H., Nasukawa, T.: Fully automatic lexicon expansion for domain-oriented sentiment analysis. In: Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing, pp. 355–363. Association for Computational Linguistics (2006)
Kanayama, H., Nasukawa, T.: Fully automatic lexicon expansion for domain-oriented sentiment analysis, p. 355 (2006)
Kang, H., Yoo, S.J., Han, D.: Senti-lexicon and improved naïve bayes algorithms for sentiment analysis of restaurant reviews. In: Expert Systems with Applications, vol. 39, pp. 6000–6010. Elsevier (2012)
Kennedy, A., Inkpen, D.: Sentiment classification of movie reviews using contextual valence shifters. Comput. Intell. 22(2), 110–125 (2006)
Kim, S.M., Hovy, E.: Determining the sentiment of opinions, p. 1367 (2004)
Ku, L.W., Huang, T.H., Chen, H.H.: Using morphological and syntactic structures for Chinese opinion analysis, pp. 1260–1269 (2009)
Landauer, T.K., Dumais, S.T.: A solution to plato’s problem: the latent semantic analysis theory of acquisition, induction, and representation of knowledge. In: Psychological Review, vol. 104, p. 211. American Psychological Association (1997)
Li, F., Huang, M., Zhu, X.: Sentiment analysis with global topics and local dependency. In: AAAI (2010)
Maisto, A., Pelosi, S.: Feature-based customer review summarization. In: Meersman, R., et al. (eds.) OTM 2014. LNCS, vol. 8842, pp. 299–308. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-662-45550-0_30
Maisto, A., Pelosi, S.: A lexicon-based approach to sentiment analysis. the Italian module for Nooj. In: Proceedings of the International Nooj 2014 Conference, University of Sassari, Italy. Cambridge Scholar Publishing (2014)
Maks, I., Vossen, P.: Different approaches to automatic polarity annotation at synset level, pp. 62–69 (2011)
Mathieu, Y.Y.: Les prédicats de sentiment. Langages, pp. 41–52 (1999)
Miller, G.A.: Wordnet: a lexical database for English. Commun. ACM 38(11), 39–41 (1995)
Mohammad, S., Dunne, C., Dorr, B.: Generating high-coverage semantic orientation lexicons from overtly marked words and a thesaurus. In: Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, vol. 2, pp. 599–608. Association for Computational Linguistics (2009)
Moilanen, K., Pulman, S.: Sentiment composition, pp. 378–382 (2007)
Moilanen, K., Pulman, S.: The good, the bad, and the unknown: morphosyllabic sentiment tagging of unseen words, pp. 109–112 (2008)
Mulder, M., Nijholt, A., Den Uyl, M., Terpstra, P.: A lexical grammatical implementation of affect, pp. 171–177 (2004)
Mullen, T., Collier, N.: Sentiment analysis using support vector machines with diverse information sources. In: EMNLP, vol. 4, pp. 412–418 (2004)
Navigli, R., Ponzetto, S.P.: BabelNet: the automatic construction, evaluation and application of a wide-coverage multilingual semantic network, vol. 193, pp. 217–250 (2012)
Neviarouskaya, A.: Compositional approach for automatic recognition of fine-grained affect, judgment, and appreciation in text (2010)
Neviarouskaya, A., Prendinger, H., Ishizuka, M.: Compositionality principle in recognition of fine-grained emotions from text. In: ICWSM (2009)
Osgood, C.E.: The nature and measurement of meaning. Psychol. Bull. 49(3), 197 (1952)
Pang, B., Lee, L., Vaithyanathan, S.: Thumbs up?: sentiment classification using machine learning techniques. In: Proceedings of the ACL-02 conference on Empirical methods in natural language processing, vol. 10, pp. 79–86. Association for Computational Linguistics (2002)
Pianta, E., Bentivogli, L., Girardi, C.: MultiWordNet: developing an aligned multilingual database. In: Proceedings of the first international conference on global WordNet, vol. 152, pp. 55–63 (2002)
Polanyi, L., Zaenen, A.: Contextual valence shifters, pp. 1–10 (2006)
Prabowo, R., Thelwall, M.: Sentiment analysis: a combined approach. J. Inf. 3, 143–157 (2009)
Qiu, G., Liu, B., Bu, J., Chen, C.: Expanding domain sentiment lexicon through double propagation. vol. 9, pp. 1199–1204 (2009)
Rainer, F.: Derivazione nominale deaggettivale. La formazione delle parole in italiano, pp. 293–314 (2004)
Rao, D., Ravichandran, D.: Semi-supervised polarity lexicon induction. In: Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics, pp. 675–682. Association for Computational Linguistics (2009)
Read, J., Carroll, J.: Weakly supervised techniques for domain-independent sentiment classification. In: Proceedings of the 1st International CIKM Workshop on Topic-Sentiment Analysis for Mass Opinion, pp. 45–52. ACM (2009)
Riloff, E., Wiebe, J., Wilson, T.: Learning subjective nouns using extraction pattern bootstrapping. In: Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003, vol. 4. pp. 25–32. Association for Computational Linguistics (2003)
Russo, I., Frontini, F., Quochi, V.: OpeNER sentiment lexicon italian - LMF (2016). http://hdl.handle.net/20.500.11752/ILC-73, digital Repository for the CLARIN Research Infrastructure provided by ILC-CNR
Taboada, M., Anthony, C., Voll, K.: Methods for creating semantic orientation dictionaries. In: Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC), Genova, Italy, pp. 427–432 (2006)
Tan, S., Cheng, X., Wang, Y., Xu, H.: Adapting naive bayes to domain adaptation for sentiment analysis. In: Boughanem, M., Berrut, C., Mothe, J., Soule-Dupuy, C. (eds.) ECIR 2009. LNCS, vol. 5478, pp. 337–349. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-00958-7_31
Turney, P.D.: Thumbs up or thumbs down?: semantic orientation applied to unsupervised classification of reviews, pp. 417–424 (2002)
Turney, P.D., Littman, M.L.: Measuring praise and criticism: inference of semantic orientation from association. ACM Trans. Inf. Syst. (TOIS) 21, 315–346 (2003)
Velikovich, L., Blair-Goldensohn, S., Hannan, K., McDonald, R.: The viability of web-derived polarity lexicons. In: Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pp. 777–785. Association for Computational Linguistics (2010)
Vermeij, M.: The orientation of user opinions through adverbs, verbs and nouns. In: 3rd Twente Student Conference on IT, Enschede June (2005)
Vietri, S.: The Italian module for Nooj. In: Proceedings of the First Italian Conference on Computational Linguistics, CLiC-it 2014. Pisa University Press (2014)
Vietri, S.: On some comparative frozen sentences in Italian. Lingvisticæ Investigationes 14(1), 149–174 (1990)
Vietri, S.: On a class of Italian frozen sentences. Lingvisticæ Investigationes 34(2), 228–267 (2011)
Wan, X.: Co-training for cross-lingual sentiment classification. In: Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, vol. 1, pp. 235–243. Association for Computational Linguistics (2009)
Wang, X., Zhao, Y., Fu, G.: A morpheme-based method to Chinese sentence-level sentiment classification. Int. J. Asian Lang. Proc. 21(3), 95–106 (2011)
Wawer, A.: Extracting emotive patterns for languages with rich morphology. Int. J. Comput. Linguist. Appl. 3(1), 11–24 (2012)
Wiebe, J.: Learning subjective adjectives from corpora. In: AAAI/IAAI, pp. 735–740 (2000)
Ye, Q., Zhang, Z., Law, R.: Sentiment classification of online reviews to travel destinations by supervised machine learning approaches. In: Expert Systems with Applications, vol. 36, pp. 6527–6535. Elsevier (2009)
Yi, J., Nasukawa, T., Bunescu, R., Niblack, W.: Sentiment analyzer: extracting sentiments about a given topic using natural language processing techniques, pp. 427–434 (2003)

Author information

Authors and Affiliations

Department of Political and Communication Science, University of Salerno, Salerno, Italy
Serena Pelosi, Alessandro Maisto, Lorenza Melillo & Annibale Elia

Corresponding author

Correspondence to Serena Pelosi.

Editor information

Editors and Affiliations

Instituto Politécnico Nacional, Mexico City, Mexico
Alexander Gelbukh

About this paper

Cite this paper

Pelosi, S., Maisto, A., Melillo, L., Elia, A. (2023). Sentiment Analysis Through Finite State Automata. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2019. Lecture Notes in Computer Science, vol 13452. Springer, Cham. https://doi.org/10.1007/978-3-031-24340-0_14

DOI: https://doi.org/10.1007/978-3-031-24340-0_14
Published: 26 February 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-24339-4
Online ISBN: 978-3-031-24340-0
eBook Packages: Computer Science, Computer Science (R0)