Sentiment Analysis Through Finite State Automata

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2019)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13452)

Abstract

The present research aims to demonstrate how powerful Finite State Automata (FSA) can be in a domain in which the vagueness of human opinions and the subjectivity of user-generated content make the automatic “understanding” of texts extremely hard. Assuming that the semantic orientation of sentences is based on the manipulation of sentiment words, we built from scratch, for the Italian language, a network of local grammars for the annotation of sentiment expressions and electronic dictionaries for the classification of more than 15,000 opinionated words. In the paper we explain in detail how we used FSA for both the automatic population of sentiment lexicons and the sentiment classification of real sentences.
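
As a minimal sketch of the idea summarized above, the following Python snippet runs a small finite-state machine over a tokenized sentence and assigns a score to each sentiment word it meets. It is not the authors' network of local grammars: the lexicon entries, their scores, the state names and the two composition rules (a negator flips the sign, an intensifier raises the magnitude by one on a scale from -3 to +3) are assumptions made purely for illustration.

```python
# Toy lexicon on a -3..+3 scale; entries and scores are illustrative assumptions.
LEXICON = {"eccellente": 3, "bello": 2, "brutto": -2, "orrendo": -3}
NEGATORS = {"non"}          # assumed negation marker
INTENSIFIERS = {"molto"}    # assumed intensifier

# (state, token class) -> next state; any pair not listed falls back to "S".
# States: "S" = start, "N" = negation seen, "I" = intensifier seen.
TRANSITIONS = {
    ("S", "NEG"): "N",
    ("S", "INT"): "I",
    ("N", "NEG"): "N",
    ("N", "INT"): "N",  # simplification: an intensifier under negation is ignored
    ("I", "INT"): "I",
    ("I", "NEG"): "N",
}


def token_class(token):
    """Map a token to one of the four input symbols of the automaton."""
    if token in NEGATORS:
        return "NEG"
    if token in INTENSIFIERS:
        return "INT"
    if token in LEXICON:
        return "POL"
    return "OTHER"


def emit(state, word):
    """Compute the contextual score of a polar word given the current state."""
    score = LEXICON[word]
    if state == "I":
        score += 1 if score > 0 else -1
        score = max(-3, min(3, score))  # stay inside the -3..+3 scale
    elif state == "N":
        score = -score
    return score


def annotate(tokens):
    """Scan the tokens left to right and return (word, score) annotations."""
    state = "S"
    matches = []
    for raw in tokens:
        token = raw.lower()
        cls = token_class(token)
        if cls == "POL":
            matches.append((token, emit(state, token)))
            state = "S"
        else:
            state = TRANSITIONS.get((state, cls), "S")
    return matches
```

Under these toy rules, `annotate("un film molto bello".split())` yields `[("bello", 3)]`, while `annotate("un film non bello".split())` yields `[("bello", -2)]`.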

Notes

  1. Annibale Elia, Lorenza Melillo and Alessandro Maisto worked on the Conclusion of the paper, while Serena Pelosi worked on the Introduction and Paragraphs 1, 2, 3 and 4.

  2. We chose a rule-based method, among others, in order to verify the hypothesis that words can be classified together according to both semantic and syntactic criteria.

  3. Word similarity is very frequently used for dictionary propagation in thesaurus-based approaches. Examples are the Maryland dictionary, created from a Roget-like thesaurus and a handful of affixes [48], and lexicons based on WordNet, such as SentiWordNet, built on a quantitative analysis of the glosses associated with synsets [17, 18], or other lexicons based on computing a distance measure over WordNet [17, 34].

  4. Seed words are words that are strongly associated with a positive or negative meaning, such as eccellente (“excellent”) or orrendo (“horrible”); a larger lexicon can be built from them by detecting other words that frequently occur alongside them.

  5. http://valeriobasile.github.io/twita/downloads.html.

  6. https://www.celi.it/.

  7. http://ai-applied.nl/sentiment-analysis-api.

  8. http://hdl.handle.net/20.500.11752/ILC-73.

  9. Although WordNet does not include semantic orientation information for its lemmas, semantic relations such as synonymy or antonymy are commonly used to propagate polarity automatically, starting from a manually annotated set of seed words [2, 13, 18, 28, 31, 34, 39, 45]. This approach has some drawbacks, such as the lack of scalability, the unavailability of sufficient resources for many languages and the difficulty of handling newly coined words, which are not yet contained in the thesauri.

  10. Morphemes allow not only the propagation of a given word polarity (e.g. en-, -ous, -fy), but also its switching (e.g. dis-, -less), its intensification (e.g. super-, over-) and its weakening (e.g. semi-) [54].

  11. While compiling the dictionary, the judgment on each word's “prior polarity” is given without considering any textual context. The entries of the sentiment dictionary receive the same annotation and are then grouped together if they possess the same semantic orientation. Prior Polarity [56] refers to an individual word's Semantic Orientation (SO) and differs from the SO in that it is always independent of the context.

  12. Local grammars are algorithms that, through grammatical, morphological and lexical instructions, are used to formalize linguistic phenomena and to parse texts. They are called “local” because, despite any generalization, they can be used only in the description and analysis of limited linguistic phenomena.

  13. The main difference between the words listed in the two scales is whether they can be used as indicators for subjectivity detection: basically, the words belonging to the evaluation scale are “anchors” that begin the identification of polarized phrases or sentences, while the ones belonging to the strength scale are used only as intensity modifiers (see Paragraph 5.3).

  14. Available for consultation at http://dsc.unisa.it/composti/tavole/combo/tavole.asp.

  15. The morphological method could also be applied to Italian verbs, but we chose to avoid this solution because of the complexity of their argument structures. We decided, instead, to manually evaluate all the verbs described in the Italian Lexicon-grammar binary tables, so that we could preserve the different lexical, syntactic and transformational rules connected to each of them [16].

  16. The meaning of the deadjectival adverbs in -mente is not always predictable from the base adjectives from which they are derived. The syntactic structures in which they occur also influence their interpretation. Depending on their position in sentences, the deadjectival adverbs can be described as adjective modifiers (e.g. altamente “highly”), predicate modifiers (e.g. perfettamente “perfectly”) or sentence modifiers (e.g. ultimamente “lately”).

  17. Metanodes are labeled through the six corresponding values of the evaluation scale, which goes from –3 to +3.

  18. The dataset contains Italian opinionated texts in the form of user reviews and comments from e-commerce and opinion websites; it lists 600 text units (50 positive and 50 negative for each product class) and refers to six different domains, for all of which different websites (such as www.ciao.it; www.amazon.it; www.mymovies.it; www.tripadvisor.it) have been exploited [44].

  19. Other idioms included in our resources are of the kind N0 essere (Agg + Ppass) Prep C1 (e.g. Max è matto da legare, “Max is so crazy he should be locked up”); N0 essere Agg e Agg (e.g. Max è bello e fritto, “Max is cooked”); C0 essere Agg (come C1 + E) (e.g. Mary ha la coscienza sporca \(\leftrightarrow \) La coscienza è sporca, “Mary has a guilty conscience” \(\leftrightarrow \) “The conscience is guilty”); N0 essere C1 Agg (e.g. Mary è una gatta morta, “Mary is a cock tease”).

  20. Words that, at first glance, seem to be intensifiers but on closer analysis reveal a more complex behavior are abbastanza “enough”, troppo “too much” and poco “not much”.

    In this research we also noticed that the co-occurrence of troppo, poco and abbastanza with polar lexical items can produce, in their semantic orientation, effects that can be associated with other contextual valence shifters. The ad hoc rules dedicated to these words (see Table ??) are not actually new, but refer to other contextual valence shifting rules that have been discussed in this Paragraph.
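
The morpheme operations listed in note 10 (propagation, switching, intensification and weakening) can be sketched as follows. This is a hedged illustration only: the affixes are the English examples quoted in that note, while the base lexicon, its scores, the function names and the choice to move one step along a -3 to +3 scale are assumptions, not the resources or rules used in the paper.

```python
# Hypothetical base lexicon with prior polarities on a -3..+3 scale.
BASE = {"honest": 2, "joy": 2, "fear": -2, "good": 2}


def keep(score):
    """Propagation: the derived word inherits the polarity of its base."""
    return score


def switch(score):
    """Switching: the affix reverses the polarity of the base."""
    return -score


def intensify(score):
    """Intensification: one step further from zero, capped at +/-3."""
    return max(-3, min(3, score + (1 if score > 0 else -1)))


def weaken(score):
    """Weakening: one step closer to zero, never crossing below magnitude 1."""
    if abs(score) <= 1:
        return score
    return score - 1 if score > 0 else score + 1


# Affix -> operation, following the examples quoted in note 10.
PREFIXES = {"dis": switch, "super": intensify, "over": intensify,
            "semi": weaken, "en": keep}
SUFFIXES = {"less": switch, "ous": keep, "fy": keep}


def derive(word):
    """Return a polarity for `word` from BASE plus one affix, or None."""
    if word in BASE:
        return BASE[word]
    for prefix, operation in PREFIXES.items():
        stem = word[len(prefix):]
        if word.startswith(prefix) and stem in BASE:
            return operation(BASE[stem])
    for suffix, operation in SUFFIXES.items():
        stem = word[:-len(suffix)]
        if word.endswith(suffix) and stem in BASE:
            return operation(BASE[stem])
    return None  # no known base/affix combination
```

With this toy lexicon, `derive("dishonest")` and `derive("joyless")` both return `-2`, `derive("fearless")` returns `2`, and a word with no matching base such as `"table"` returns `None`. A real system would also need to handle spelling changes at morpheme boundaries, which this sketch ignores.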

References

  1. Andreevskaia, A., Bergler, S.: When specialists and generalists work together: overcoming domain dependence in sentiment tagging. In: ACL, pp. 290–298 (2008)

  2. Argamon, S., Bloom, K., Esuli, A., Sebastiani, F.: Automatically determining attitude type and force for sentiment analysis, pp. 218–231 (2009)

  3. Balibar-Mrabti, A.: Une étude de la combinatoire des noms de sentiment dans une grammaire locale. Langue française, pp. 88–97 (1995)

  4. Baroni, M., Vegnaduzzo, S.: Identifying subjective adjectives through web-based mutual information, vol. 4, pp. 17–24 (2004)

  5. Basile, V., Nissim, M.: Sentiment analysis on Italian tweets. In: Proceedings of the 4th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, pp. 100–107 (2013)

  6. Benamara, F., Cesarano, C., Picariello, A., Recupero, D.R., Subrahmanian, V.S.: Sentiment analysis: adjectives and adverbs are better than adjectives alone. In: ICWSM (2007)

  7. Benamara, F., Chardon, B., Mathieu, Y., Popescu, V., Asher, N.: How do negation and modality impact on opinions? pp. 10–18 (2012)

  8. Bolioli, A., Salamino, F., Porzionato, V.: Social media monitoring in real life with blogmeter platform. In: ESSEM@ AI* IA 1096, 156–163 (2013)

  9. Dang, Y., Zhang, Y., Chen, H.: A lexicon-enhanced method for sentiment classification: An experiment on online product reviews. In: Intelligent Systems, IEEE, vol. 25, pp. 46–53. IEEE (2010)

  10. Dasgupta, S., Ng, V.: Mine the easy, classify the hard: a semi-supervised approach to automatic sentiment classification. In: Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, vol. 2, pp. 701–709. Association for Computational Linguistics (2009)

  11. De Mauro, T.: Dizionario italiano. Paravia, Torino (2000)

  12. Di Gennaro, P., Rossi, A., Tamburini, F.: The ficlit+ cs@ unibo system at the evalita 2014 sentiment polarity classification task. In: Proceedings of the Fourth International Workshop EVALITA 2014 (2014)

  13. Dragut, E.C., Yu, C., Sistla, P., Meng, W.: Construction of a sentimental word dictionary, pp. 1761–1764 (2010)

  14. Elia, A.: Le verbe italien. Les complétives dans les phrases à un complément (1984)

  15. Elia, A.: Chiaro e tondo: Lessico-Grammatica degli avverbi composti in italiano. Segno Associati (1990)

  16. Elia, A., Martinelli, M., D’Agostino, E.: Lessico e Strutture sintattiche. Introduzione alla sintassi del verbo italiano. Liguori, Napoli (1981)

  17. Esuli, A., Sebastiani, F.: Determining the semantic orientation of terms through gloss classification. In: Proceedings of the 14th ACM International Conference on Information and Knowledge Management, pp. 617–624. ACM (2005)

  18. Esuli, A., Sebastiani, F.: Determining term subjectivity and term orientation for opinion mining vol. 6, p. 2006 (2006)

  19. Esuli, A., Sebastiani, F.: SentiWordNet: a publicly available lexical resource for opinion mining. In: Proceedings of LREC, vol. 6, pp. 417–422 (2006)

  20. Fellbaum, C.: WordNet. Wiley Online Library (1998)

  21. Gaeta, L.: Nomi d’azione. La formazione delle parole in italiano. Tübingen: Max Niemeyer Verlag, pp. 314–351 (2004)

  22. Gamon, M., Aue, A.: Automatic identification of sentiment vocabulary: exploiting low association with known sentiment terms, pp. 57–64 (2005)

  23. Ganapathibhotla, M., Liu, B.: Mining opinions in comparative sentences. In: Proceedings of the 22nd International Conference on Computational Linguistics, vol. 1. pp. 241–248. Association for Computational Linguistics (2008)

  24. Goldberg, A.B., Zhu, X.: Seeing stars when there aren’t many stars: graph-based semi-supervised learning for sentiment categorization. In: Proceedings of the First Workshop on Graph Based Methods for Natural Language Processing, pp. 45–52. Association for Computational Linguistics (2006)

  25. Gross, M.: Les bases empiriques de la notion de prédicat sémantique. Langages, pp. 7–52 (1981)

  26. Gross, M.: Les phrases figées en français. In: L’information grammaticale, vol. 59, pp. 36–41. Peeters (1993)

  27. Gross, M.: Une grammaire locale de l’expression des sentiments. Langue française, pp. 70–87 (1995)

  28. Hassan, A., Radev, D.: Identifying text polarity using random walks, pp. 395–403 (2010)

  29. Hatzivassiloglou, V., McKeown, K.R.: Predicting the semantic orientation of adjectives. In: Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics, pp. 174–181. Association for Computational Linguistics (1997)

  30. Hernandez-Farias, I., Buscaldi, D., Priego-Sánchez, B.: Iradabe: adapting English lexicons to the Italian sentiment polarity classification task. In: First Italian Conference on Computational Linguistics (CLiC-it 2014) and the fourth International Workshop EVALITA2014, pp. 75–81 (2014)

  31. Hu, M., Liu, B.: Mining and summarizing customer reviews, pp. 168–177 (2004)

  32. Iacobini, C.: Prefissazione. La formazione delle parole in italiano. Tübingen: Max Niemeyer Verlag, pp. 97–161 (2004)

  33. Kaji, N., Kitsuregawa, M.: Building lexicon for sentiment analysis from massive collection of html documents. In: EMNLP-CoNLL, pp. 1075–1083 (2007)

  34. Kamps, J., Marx, M., Mokken, R.J., De Rijke, M.: Using wordnet to measure semantic orientations of adjectives (2004)

  35. Kanayama, H., Nasukawa, T.: Fully automatic lexicon expansion for domain-oriented sentiment analysis. In: Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing, pp. 355–363. Association for Computational Linguistics (2006)

  36. Kanayama, H., Nasukawa, T.: Fully automatic lexicon expansion for domain-oriented sentiment analysis, p. 355 (2006)

  37. Kang, H., Yoo, S.J., Han, D.: Senti-lexicon and improved naïve bayes algorithms for sentiment analysis of restaurant reviews. In: Expert Systems with Applications, vol. 39, pp. 6000–6010. Elsevier (2012)

  38. Kennedy, A., Inkpen, D.: Sentiment classification of movie reviews using contextual valence shifters. Comput. Intell. 22(2), 110–125 (2006)

  39. Kim, S.M., Hovy, E.: Determining the sentiment of opinions, p. 1367 (2004)

  40. Ku, L.W., Huang, T.H., Chen, H.H.: Using morphological and syntactic structures for Chinese opinion analysis, pp. 1260–1269 (2009)

  41. Landauer, T.K., Dumais, S.T.: A solution to plato’s problem: the latent semantic analysis theory of acquisition, induction, and representation of knowledge. In: Psychological Review, vol. 104, p. 211. American Psychological Association (1997)

  42. Li, F., Huang, M., Zhu, X.: Sentiment analysis with global topics and local dependency. In: AAAI (2010)

  43. Maisto, A., Pelosi, S.: Feature-based customer review summarization. In: Meersman, R., et al. (eds.) OTM 2014. LNCS, vol. 8842, pp. 299–308. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-662-45550-0_30

  44. Maisto, A., Pelosi, S.: A lexicon-based approach to sentiment analysis. the Italian module for Nooj. In: Proceedings of the International Nooj 2014 Conference, University of Sassari, Italy. Cambridge Scholar Publishing (2014)

  45. Maks, I., Vossen, P.: Different approaches to automatic polarity annotation at synset level, pp. 62–69 (2011)

  46. Mathieu, Y.Y.: Les prédicats de sentiment. Langages, pp. 41–52 (1999)

  47. Miller, G.A.: Wordnet: a lexical database for English. Commun. ACM 38(11), 39–41 (1995)

  48. Mohammad, S., Dunne, C., Dorr, B.: Generating high-coverage semantic orientation lexicons from overtly marked words and a thesaurus. In: Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, vol. 2, pp. 599–608. Association for Computational Linguistics (2009)

  49. Moilanen, K., Pulman, S.: Sentiment composition, pp. 378–382 (2007)

  50. Moilanen, K., Pulman, S.: The good, the bad, and the unknown: morphosyllabic sentiment tagging of unseen words, pp. 109–112 (2008)

  51. Mulder, M., Nijholt, A., Den Uyl, M., Terpstra, P.: A lexical grammatical implementation of affect, pp. 171–177 (2004)

  52. Mullen, T., Collier, N.: Sentiment analysis using support vector machines with diverse information sources. In: EMNLP, vol. 4, pp. 412–418 (2004)

  53. Navigli, R., Ponzetto, S.P.: BabelNet: the automatic construction, evaluation and application of a wide-coverage multilingual semantic network, vol. 193, pp. 217–250 (2012)

  54. Neviarouskaya, A.: Compositional approach for automatic recognition of fine-grained affect, judgment, and appreciation in text (2010)

  55. Neviarouskaya, A., Prendinger, H., Ishizuka, M.: Compositionality principle in recognition of fine-grained emotions from text. In: ICWSM (2009)

  56. Osgood, C.E.: The nature and measurement of meaning. Psychol. Bull. 49(3), 197 (1952)

  57. Pang, B., Lee, L., Vaithyanathan, S.: Thumbs up?: sentiment classification using machine learning techniques. In: Proceedings of the ACL-02 conference on Empirical methods in natural language processing, vol. 10, pp. 79–86. Association for Computational Linguistics (2002)

  58. Pianta, E., Bentivogli, L., Girardi, C.: MultiWordNet: developing an aligned multilingual database. In: Proceedings of the first international conference on global WordNet, vol. 152, pp. 55–63 (2002)

  59. Polanyi, L., Zaenen, A.: Contextual valence shifters, pp. 1–10 (2006)

  60. Prabowo, R., Thelwall, M.: Sentiment analysis: a combined approach. J. Inf. 3, 143–157 (2009)

  61. Qiu, G., Liu, B., Bu, J., Chen, C.: Expanding domain sentiment lexicon through double propagation. vol. 9, pp. 1199–1204 (2009)

  62. Rainer, F.: Derivazione nominale deaggettivale. La formazione delle parole in italiano, pp. 293–314 (2004)

  63. Rao, D., Ravichandran, D.: Semi-supervised polarity lexicon induction. In: Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics, pp. 675–682. Association for Computational Linguistics (2009)

  64. Read, J., Carroll, J.: Weakly supervised techniques for domain-independent sentiment classification. In: Proceedings of the 1st International CIKM Workshop on Topic-Sentiment Analysis for Mass Opinion, pp. 45–52. ACM (2009)

  65. Riloff, E., Wiebe, J., Wilson, T.: Learning subjective nouns using extraction pattern bootstrapping. In: Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003, vol. 4. pp. 25–32. Association for Computational Linguistics (2003)

  66. Russo, I., Frontini, F., Quochi, V.: OpeNER sentiment lexicon italian - LMF (2016). http://hdl.handle.net/20.500.11752/ILC-73, digital Repository for the CLARIN Research Infrastructure provided by ILC-CNR

  67. Taboada, M., Anthony, C., Voll, K.: Methods for creating semantic orientation dictionaries. In: Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC), Genova, Italy, pp. 427–432 (2006)

  68. Tan, S., Cheng, X., Wang, Y., Xu, H.: Adapting naive bayes to domain adaptation for sentiment analysis. In: Boughanem, M., Berrut, C., Mothe, J., Soule-Dupuy, C. (eds.) ECIR 2009. LNCS, vol. 5478, pp. 337–349. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-00958-7_31

  69. Turney, P.D.: Thumbs up or thumbs down?: semantic orientation applied to unsupervised classification of reviews, pp. 417–424 (2002)

  70. Turney, P.D., Littman, M.L.: Measuring praise and criticism: inference of semantic orientation from association. ACM Trans. Inf. Syst. (TOIS) 21, 315–346 (2003)

  71. Velikovich, L., Blair-Goldensohn, S., Hannan, K., McDonald, R.: The viability of web-derived polarity lexicons. In: Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pp. 777–785. Association for Computational Linguistics (2010)

  72. Vermeij, M.: The orientation of user opinions through adverbs, verbs and nouns. In: 3rd Twente Student Conference on IT, Enschede June (2005)

  73. Vietri, S.: The Italian module for Nooj. In: Proceedings of the First Italian Conference on Computational Linguistics, CLiC-it 2014. Pisa University Press (2014)

  74. Vietri, S.: On some comparative frozen sentences in Italian. Lingvisticæ Investigationes 14(1), 149–174 (1990)

  75. Vietri, S.: On a class of Italian frozen sentences. Lingvisticæ Investigationes 34(2), 228–267 (2011)

  76. Wan, X.: Co-training for cross-lingual sentiment classification. In: Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, vol. 1, pp. 235–243. Association for Computational Linguistics (2009)

  77. Wang, X., Zhao, Y., Fu, G.: A morpheme-based method to Chinese sentence-level sentiment classification. Int. J. Asian Lang. Proc. 21(3), 95–106 (2011)

  78. Wawer, A.: Extracting emotive patterns for languages with rich morphology. Int. J. Comput. Linguist. Appl. 3(1), 11–24 (2012)

  79. Wiebe, J.: Learning subjective adjectives from corpora. In: AAAI/IAAI, pp. 735–740 (2000)

  80. Ye, Q., Zhang, Z., Law, R.: Sentiment classification of online reviews to travel destinations by supervised machine learning approaches. In: Expert Systems with Applications, vol. 36, pp. 6527–6535. Elsevier (2009)

  81. Yi, J., Nasukawa, T., Bunescu, R., Niblack, W.: Sentiment analyzer: extracting sentiments about a given topic using natural language processing techniques, pp. 427–434 (2003)

Author information

Corresponding author

Correspondence to Serena Pelosi.

Copyright information

© 2023 Springer Nature Switzerland AG

About this paper

Cite this paper

Pelosi, S., Maisto, A., Melillo, L., Elia, A. (2023). Sentiment Analysis Through Finite State Automata. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2019. Lecture Notes in Computer Science, vol 13452. Springer, Cham. https://doi.org/10.1007/978-3-031-24340-0_14

  • DOI: https://doi.org/10.1007/978-3-031-24340-0_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-24339-4

  • Online ISBN: 978-3-031-24340-0

  • eBook Packages: Computer Science, Computer Science (R0)
