Abstract
The present research aims to demonstrate how powerful Finite State Automata (FSA) can be, into a domain in which the vagueness of the human opinions and the subjectivity of the user generated contents make the automatic “understanding” of texts extremely hard. Assuming that the semantic orientation of sentences is based on the manipulation of sentiment words, we built from scratch, for the Italian language, a network of local grammars for the annotation of sentiment expressions and electronic dictionaries for the classification of more than 15,000 opinionated words. In the paper we explain in detail how we made use of FSA for both the automatic population of sentiment lexicons and the sentiment classification of real sentences.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Notes
- 1.
Annibale Elia, Lorenza Melillo and Alessandro Maisto worked on the Conclusion of the paper, while Serena Pelosi on Introduction and Paragraphs 1, 2, 3 and 4.
- 2.
We chose a rule-based method, among others, in order to verify the hypothesis that words can be classified together in accordance to both semantic and syntactic criteria.
- 3.
Word Similarity is a very frequently used method in the dictionary propagation over the thesaurus-based approaches. Examples are the Maryland dictionary, created thanks to a Roget-like thesaurus and a handful of affixe [48], and other lexicons based on WordNet, like SentiWordNet, built on the base of quantitative analysis of glosses associated to synsets [17, 18] or other lexicons based on the computing of the distance measure on WordNet [17, 34].
- 4.
Seed words are words which are strongly associated with a positive/negative meaning, such as eccellente (“excellent”) or orrendo (“horrible”), by which it is possible to build a bigger lexicon, detecting other words that frequently occur alongside them.
- 5.
- 6.
- 7.
- 8.
- 9.
Although WordNet does not include semantic orientation information for its lemmas; semantic relations, such as synonymy or antonymy, are commonly used in order to automatically propagate the polarity, starting from a manually annotated set of seed word. [2, 13, 18, 18, 28, 31, 31, 34, 39, 45, 45]. This approach presents some drawbacks, such as the lack of scalability, the unavailability of enough resources for many languages and the difficulty to handle newly coined words, which are not already contained in the thesauri.
- 10.
Morphemes allow not only the propagation of a given word polarity (e.g. en-, -ous, -fy), but also its switching (e.g. dis-, -less), its intensification (e.g. super-, over-) and its weakening (e.g. semi-) [54].
- 11.
While compiling the dictionary, the judgment on the words “prior polarity” is given without considering any textual context. The entries of the sentiment dictionary receive the same annotation and, then, are grouped together if they posses the same semantic orientation. The Prior Polarity [56] refers to the individual words Semantic Orientation (SO) and differs from the SO because it is always independent from the context.
- 12.
Local grammars are algorithms that, through grammatical, morphological and lexical instructions, are used to formalize linguistic phenomena and to -parse texts. They are defined “local” because, despite any generalization, they can be used only in the description and analysis of limited linguistic phenomena.
- 13.
The main difference between the words listed in the two scales is the possibility to use them as indicators for the subjectivity detection: basically, the words belonging to the evaluation scale are “anchors” that begin the identification of polarized phrases or sentences, while the ones belonging to the strength scale are just used as intensity modifiers (see Paragraph 5.3).
- 14.
available for consultation at http://dsc.unisa.it/composti/tavole/combo/tavole.asp.
- 15.
The morphological method could be also applied to Italian verbs, but we chose to avoid this solution because of the complexity of their argument structures. We decided, instead, to manually evaluate all the verbs described in the Italian Lexicon-grammar binary tables, so we could preserve the different lexical, syntactic and transformational rules connected to each one of them [16].
- 16.
The meaning of the deadjectival adverbs in -mente is not always predictable starting from the base adjectives from which they are derived. Also the syntactic structures in which they occur influences their interpretation. Depending on their position in sentences, the deadjectival adverbs can be described as adjective modifiers (e.g. altamente “highly”), predicate modifiers (e.g. perfettamente “perfectly”) or sentence modifiers (e.g. ultimamente “lately”).
- 17.
Metanodes are labeled through the six corresponding values of the evaluation scale, which goes from –3 to +3.
- 18.
The dataset contains Italian opinionated texts in the form of users reviews and comments from e-commerce and opinion websites; it lists 600 texts units (50 positive and 50 negative for each product class) and refers to six different domains, for all of which different websites (such as www.ciao.it; www.amazon.it; www.mymovies.it; www.tripadvisor.it) have been exploited [44].
- 19.
Other idioms included in our resources are of the kind N0 essere (Agg + Ppass) Prep C1 (e.g. Max è matto da legare, “Max is so crazy he should be locked up”); N0 essere Agg e Agg (e.g. Max è bello e fritto, “Max is cooked”); C0 essere Agg (come C1 + E) (e.g. Mary ha la coscienza sporca \(\leftrightarrow \) La coscienza è sporca, “Mary has a guilty conscience” \(\leftrightarrow \) “The conscience is guilty”), N0 essere C1 Agg (e.g. Mary è una gatta morta, “Mary is a cock tease”).
- 20.
Words that, at first glance, seem to be intensifiers but at a deeper analysis reveal a more complex behavior are abbastanza “enough” troppo “too much” and poco “not much”.
In this research we noticed as well that the co-occurrence of troppo, poco and abbastanza with polar lexical items can provoke, in their semantic orientation, effects that can be associated to other contextual valence shifters. The ad hoc rules dedicated to these words (see Table ??) are not actually new, but refer to other contextual valence shifting rules that have been discussed in this Paragraph.
References
Andreevskaia, A., Bergler, S.: When specialists and generalists work together: overcoming domain dependence in sentiment tagging. In: ACL, pp. 290–298 (2008)
Argamon, S., Bloom, K., Esuli, A., Sebastiani, F.: Automatically determining attitude type and force for sentiment analysis, pp. 218–231 (2009)
Balibar-Mrabti, A.: Une étude de la combinatoire des noms de sentiment dans une grammaire locale. Langue française, pp. 88–97 (1995)
Baroni, M., Vegnaduzzo, S.: Identifying subjective adjectives through web-based mutual information, vol. 4, pp. 17–24 (2004)
Basile, V., Nissim, M.: Sentiment analysis on Italian tweets. In: Proceedings of the 4th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, pp. 100–107 (2013)
Benamara, F., Cesarano, C., Picariello, A., Recupero, D.R., Subrahmanian, V.S.: Sentiment analysis: adjectives and adverbs are better than adjectives alone. In: ICWSM (2007)
Benamara, F., Chardon, B., Mathieu, Y., Popescu, V., Asher, N.: How do negation and modality impact on opinions? pp. 10–18 (2012)
Bolioli, A., Salamino, F., Porzionato, V.: Social media monitoring in real life with blogmeter platform. In: ESSEM@ AI* IA 1096, 156–163 (2013)
Dang, Y., Zhang, Y., Chen, H.: A lexicon-enhanced method for sentiment classification: An experiment on online product reviews. In: Intelligent Systems, IEEE, vol. 25, pp. 46–53. IEEE (2010)
Dasgupta, S., Ng, V.: Mine the easy, classify the hard: a semi-supervised approach to automatic sentiment classification. In: Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2-vol. 2, pp. 701–709. Association for Computational Linguistics (2009)
De Mauro, T.: Dizionario italiano. Paravia, Torino (2000)
Di Gennaro, P., Rossi, A., Tamburini, F.: The ficlit+ cs@ unibo system at the evalita 2014 sentiment polarity classification task. In: Proceedings of the Fourth International Workshop EVALITA 2014 (2014)
Dragut, E.C., Yu, C., Sistla, P., Meng, W.: Construction of a sentimental word dictionary, pp. 1761–1764 (2010)
Elia, A.: Le verbe italien. Les complétives dans les phrases àa un complément (1984)
Elia, A.: Chiaro e tondo: Lessico-Grammatica degli avverbi composti in italiano. Segno Associati (1990)
Elia, A., Martinelli, M., D’Agostino, E.: Lessico e Strutture sintattiche. Liguori, Introduzione alla sintassi del verbo italiano. Napoli (1981)
Esuli, A., Sebastiani, F.: Determining the semantic orientation of terms through gloss classification. In: Proceedings of the 14th ACM International Conference on Information and Knowledge Management, pp. 617–624. ACM (2005)
Esuli, A., Sebastiani, F.: Determining term subjectivity and term orientation for opinion mining vol. 6, p. 2006 (2006)
Esuli, A., Sebastiani, F.: SentiWordNet: a publicly available lexical resource for opinion mining. In: Proceedings of LREC, vol. 6, pp. 417–422 (2006)
Fellbaum, C.: WordNet. Wiley Online Library (1998)
Gaeta, L.: Nomi d’azione. La formazione d elle parole in italiano. Tübingen: Max Niemeyer Verlag, pp. 314–351 (2004)
Gamon, M., Aue, A.: Automatic identification of sentiment vocabulary: exploiting low association with known sentiment terms, pp. 57–64 (2005)
Ganapathibhotla, M., Liu, B.: Mining opinions in comparative sentences. In: Proceedings of the 22nd International Conference on Computational Linguistics, vol. 1. pp. 241–248. Association for Computational Linguistics (2008)
Goldberg, A.B., Zhu, X.: Seeing stars when there aren’t many stars: graph-based semi-supervised learning for sentiment categorization. In: Proceedings of the First Workshop on Graph Based Methods for Natural Language Processing, pp. 45–52. Association for Computational Linguistics (2006)
Gross, M.: Les bases empiriques de la notion de prédicat sémantique. Langages, pp. 7–52 (1981)
Gross, M.: Les phrases figées en français. In: L’information grammaticale, vol. 59, pp. 36–41. Peeters (1993)
Gross, M.: Une grammaire locale de l’expression des sentiments. Langue française, pp. 70–87 (1995)
Hassan, A., Radev, D.: Identifying text polarity using random walks, pp. 395–403 (2010)
Hatzivassiloglou, V., McKeown, K.R.: Predicting the semantic orientation of adjectives. In: Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics, pp. 174–181. Association for Computational Linguistics (1997)
Hernandez-Farias, I., Buscaldi, D., Priego-Sánchez, B.: Iradabe: adapting English lexicons to the Italian sentiment polarity classification task. In: First Italian Conference on Computational Linguistics (CLiC-it 2014) and the fourth International Workshop EVALITA2014, pp. 75–81 (2014)
Hu, M., Liu, B.: Mining and summarizing customer reviews, pp. 168–177 (2004)
Iacobini, C.: Prefissazione. La formazione delle parole in italiano. Tübingen: Max Niemeyer Verlag, pp. 97–161 (2004)
Kaji, N., Kitsuregawa, M.: Building lexicon for sentiment analysis from massive collection of html documents. In: EMNLP-CoNLL, pp. 1075–1083 (2007)
Kamps, J., Marx, M., Mokken, R.J., De Rijke, M.: Using wordnet to measure semantic orientations of adjectives (2004)
Kanayama, H., Nasukawa, T.: Fully automatic lexicon expansion for domain-oriented sentiment analysis. In: Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing, pp. 355–363. Association for Computational Linguistics (2006)
Kanayama, H., Nasukawa, T.: Fully automatic lexicon expansion for domain-oriented sentiment analysis, p. 355 (2006)
Kang, H., Yoo, S.J., Han, D.: Senti-lexicon and improved naïve bayes algorithms for sentiment analysis of restaurant reviews. In: Expert Systems with Applications, vol. 39, pp. 6000–6010. Elsevier (2012)
Kennedy, A., Inkpen, D.: Sentiment classification of movie reviews using contextual valence shifters. Comput. Intell. 22(2), 110–125 (2006)
Kim, S.M., Hovy, E.: Determining the sentiment of opinions, p. 1367 (2004)
Ku, L.W., Huang, T.H., Chen, H.H.: Using morphological and syntactic structures for Chinese opinion analysis, pp. 1260–1269 (2009)
Landauer, T.K., Dumais, S.T.: A solution to plato’s problem: the latent semantic analysis theory of acquisition, induction, and representation of knowledge. In: Psychological Review, vol. 104, p. 211. American Psychological Association (1997)
Li, F., Huang, M., Zhu, X.: Sentiment analysis with global topics and local dependency. In: AAAI (2010)
Maisto, A., Pelosi, S.: Feature-based customer review summarization. In: Meersman, R., et al. (eds.) OTM 2014. LNCS, vol. 8842, pp. 299–308. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-662-45550-0_30
Maisto, A., Pelosi, S.: A lexicon-based approach to sentiment analysis. the Italian module for Nooj. In: Proceedings of the International Nooj 2014 Conference, University of Sassari, Italy. Cambridge Scholar Publishing (2014)
Maks, I., Vossen, P.: Different approaches to automatic polarity annotation at synset level, pp. 62–69 (2011)
Mathieu, Y.Y.: Les prédicats de sentiment. Langages, pp. 41–52 (1999)
Miller, G.A.: Wordnet: a lexical database for English. Commun. ACM 38(11), 39–41 (1995)
Mohammad, S., Dunne, C., Dorr, B.: Generating high-coverage semantic orientation lexicons from overtly marked words and a thesaurus. In: Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 2-vol. 2, pp. 599–608. Association for Computational Linguistics (2009)
Moilanen, K., Pulman, S.: Sentiment composition, pp. 378–382 (2007)
Moilanen, K., Pulman, S.: The good, the bad, and the unknown: morphosyllabic sentiment tagging of unseen words, pp. 109–112 (2008)
Mulder, M., Nijholt, A., Den Uyl, M., Terpstra, P.: A lexical grammatical implementation of affect, pp. 171–177 (2004)
Mullen, T., Collier, N.: Sentiment analysis using support vector machines with diverse information sources. In: EMNLP, vol. 4, pp. 412–418 (2004)
Navigli, R., Ponzetto, S.P.: BabelNet: the automatic construction, evaluation and application of a wide-coverage multilingual semantic network, vol. 193, pp. 217–250 (2012)
Neviarouskaya, A.: Compositional approach for automatic recognition of fine-grained affect, judgment, and appreciation in text (2010)
Neviarouskaya, A., Prendinger, H., Ishizuka, M.: Compositionality principle in recognition of fine-grained emotions from text. In: ICWSM (2009)
Osgood, C.E.: The nature and measurement of meaning. Psychol. Bull. 49(3), 197 (1952)
Pang, B., Lee, L., Vaithyanathan, S.: Thumbs up?: sentiment classification using machine learning techniques. In: Proceedings of the ACL-02 conference on Empirical methods in natural language processing, vol. 10, pp. 79–86. Association for Computational Linguistics (2002)
Pianta, E., Bentivogli, L., Girardi, C.: MultiWordNet: developing an aligned multilingual database. In: Proceedings of the first international conference on global WordNet, vol. 152, pp. 55–63 (2002)
Polanyi, L., Zaenen, A.: Contextual valence shifters, pp. 1–10 (2006)
Prabowo, R., Thelwall, M.: Sentiment analysis: a combined approach. J. Inf. 3, 143–157 (2009)
Qiu, G., Liu, B., Bu, J., Chen, C.: Expanding domain sentiment lexicon through double propagation. vol. 9, pp. 1199–1204 (2009)
Rainer, F.: Derivazione nominale deaggettivale. La formazione delle parole in italiano, pp. 293–314 (2004)
Rao, D., Ravichandran, D.: Semi-supervised polarity lexicon induction. In: Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics, pp. 675–682. Association for Computational Linguistics (2009)
Read, J., Carroll, J.: Weakly supervised techniques for domain-independent sentiment classification. In: Proceedings of the 1st International CIKM Workshop on Topic-Sentiment Analysis for Mass Opinion, pp. 45–52. ACM (2009)
Riloff, E., Wiebe, J., Wilson, T.: Learning subjective nouns using extraction pattern bootstrapping. In: Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003, vol. 4. pp. 25–32. Association for Computational Linguistics (2003)
Russo, I., Frontini, F., Quochi, V.: OpeNER sentiment lexicon italian - LMF (2016). http://hdl.handle.net/20.500.11752/ILC-73, digital Repository for the CLARIN Research Infrastructure provided by ILC-CNR
Taboada, M., Anthony, C., Voll, K.: Methods for creating semantic orientation dictionaries. In: Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC), Genova, Italy, pp. 427–432 (2006)
Tan, S., Cheng, X., Wang, Y., Xu, H.: Adapting naive bayes to domain adaptation for sentiment analysis. In: Boughanem, M., Berrut, C., Mothe, J., Soule-Dupuy, C. (eds.) ECIR 2009. LNCS, vol. 5478, pp. 337–349. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-00958-7_31
Turney, P.D.: Thumbs up or thumbs down?: semantic orientation applied to unsupervised classification of reviews, pp. 417–424 (2002)
Turney, P.D., Littman, M.L.: Measuring praise and criticism: inference of semantic orientation from association. ACM Trans. Inf. Syst. (TOIS) 21, 315–346 (2003)
Velikovich, L., Blair-Goldensohn, S., Hannan, K., McDonald, R.: The viability of web-derived polarity lexicons. In: Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pp. 777–785. Association for Computational Linguistics (2010)
Vermeij, M.: The orientation of user opinions through adverbs, verbs and nouns. In: 3rd Twente Student Conference on IT, Enschede June (2005)
Vietri, S.: The Italian module for Nooj. In: In Proceedings of the First Italian Conference on Computational Linguistics, CLiC-it 2014. Pisa University Press (2014)
Vietri, S.: On some comparative frozen sentences in Italian. Lingvisticæ Investigationes 14(1), 149–174 (1990)
Vietri, S.: On a class of Italian frozen sentences. Lingvisticæ Investigationes 34(2), 228–267 (2011)
Wan, X.: Co-training for cross-lingual sentiment classification. In: Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, vol. 1, pp. 235–243. Association for Computational Linguistics (2009)
Wang, X., Zhao, Y., Fu, G.: A morpheme-based method to Chinese sentence-level sentiment classification. Int. J. Asian Lang. Proc. 21(3), 95–106 (2011)
Wawer, A.: Extracting emotive patterns for languages with rich morphology. Int. J. Comput. Linguist. Appl. 3(1), 11–24 (2012)
Wiebe, J.: Learning subjective adjectives from corpora. In: AAAI/IAAI, pp. 735–740 (2000)
Ye, Q., Zhang, Z., Law, R.: Sentiment classification of online reviews to travel destinations by supervised machine learning approaches. In: Expert Systems with Applications, vol. 36, pp. 6527–6535. Elsevier (2009)
Yi, J., Nasukawa, T., Bunescu, R., Niblack, W.: Sentiment analyzer: extracting sentiments about a given topic using natural language processing techniques, pp. 427–434 (2003)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2023 Springer Nature Switzerland AG
About this paper
Cite this paper
Pelosi, S., Maisto, A., Melillo, L., Elia, A. (2023). Sentiment Analysis Through Finite State Automata. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2019. Lecture Notes in Computer Science, vol 13452. Springer, Cham. https://doi.org/10.1007/978-3-031-24340-0_14
Download citation
DOI: https://doi.org/10.1007/978-3-031-24340-0_14
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-24339-4
Online ISBN: 978-3-031-24340-0
eBook Packages: Computer ScienceComputer Science (R0)