Dream sentiment analysis using second order soft co-occurrences (SOSCO) and time course representations

Journal of Intelligent Information Systems

Abstract

We describe a project undertaken by an interdisciplinary team of researchers in sleep psychology and in Natural Language Processing/Machine Learning. The goal is sentiment analysis on a corpus of short textual descriptions of dreams. Dreams are categorized on a four-level scale of positive and negative sentiment; we chose a four-level annotation to reflect sentiment strength while keeping the scheme simple. The approach is based on a novel representation that takes into account the leading themes of the dream and the sequential unfolding of associated sentiments during the dream. The dream representation combines three parts, two of which are produced automatically from the description of the dream. The first part is a co-occurrence vector representation of dreams, used to detect sentiment levels in the dream texts. Unlike the standard bag-of-words model, these vectors capture non-local relationships between the meanings of words in a corpus. The second part is a dynamic representation that captures the changes in sentiment as the dream progresses. The third part is the dreamer’s self-reported assessment of the dream according to eight given attributes (self-assessment differs in many respects from the dream’s sentiment classification). All three representations are subject to aggressive feature selection. Using an ensemble of classifiers on the combined three-part representation, the agreement between the machine rating and the human judge scores on the four-level scale was 64 %, which is in the range of human experts’ consensus in that domain. The accuracy of the system was 14 % higher than previous results on the same task.
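
The idea of a second-order representation, in which a text is described by the average of the co-occurrence vectors of its words, can be illustrated with a minimal Python sketch. This is not the authors' SOSCO implementation: the toy corpus, the window size of 2, and the function names `cooccurrence_vectors` and `second_order_vector` are assumptions made for illustration only, and no soft weighting or sense disambiguation is attempted.

```python
from collections import defaultdict


def cooccurrence_vectors(sentences, window=2):
    """For each word, count the words seen within `window` tokens of it."""
    vectors = defaultdict(lambda: defaultdict(float))
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1.0
    return vectors


def second_order_vector(tokens, vectors, vocabulary):
    """Average the co-occurrence vectors of the known words in `tokens`."""
    known = [t for t in tokens if t in vectors]
    if not known:
        return [0.0] * len(vocabulary)
    return [
        sum(vectors[t].get(v, 0.0) for t in known) / len(known)
        for v in vocabulary
    ]


# Hypothetical toy corpus standing in for a collection of dream reports.
corpus = [
    "i was chased by a dog".split(),
    "the dog was friendly".split(),
    "i was happy at the party".split(),
]
vectors = cooccurrence_vectors(corpus)
vocabulary = sorted({w for sentence in corpus for w in sentence})

dream = "a friendly dog".split()
dream_vector = second_order_vector(dream, vectors, vocabulary)
for word, value in zip(vocabulary, dream_vector):
    print(f"{word:10s} {value:.3f}")
```

In this toy run the text "a friendly dog" receives non-zero weight on words such as `was` and `by` that never occur in the text itself, which is the kind of indirect relationship a plain bag-of-words vector cannot express.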

Notes

  1. General Inquirer Project. The General Inquirer: Introduction to a computer-based system of content analysis. General Inquirer Project, Edinburgh, 1974.

  2. In a first-order text representation, the text is represented only by the set of words that either occurred directly in the text or frequently co-occurred with them in the corpus; in a second-order text representation, the text is represented indirectly, via the average of the co-occurrence vectors of the words it contains.

  3. The word is supposed to be disambiguated over its several meanings or senses.

  4. Unigrams are single terms that occur more than once, bigrams are ordered pairs of words, and co-occurrences are simply unordered bigrams.

  5. The study was carried out in the same dream laboratory, and with the same task definitions, as this research.

  6. The hyphen (-) has been kept so that compound words such as Multi-dimensions and Anti-oxidant remain individual terms.

  7. Recall that “Words” are denoted by an upper-case ‘W’; the weights in this section are therefore denoted by a lower-case ‘w’. The values \(\{w_i\}_{i\in\{1,\ldots,6\}} = (100,\;35,\;15,\;10,\;3,\;1)\) were set empirically.

  8. Classifiers which can handle numerous features.

  9. The Link Grammar Parser is a syntactic parser of English, based on link grammar, an original theory of English syntax which has been designed and developed at Carnegie Mellon University, School of Computer Science. For more details please refer to: http://www.link.cs.cmu.edu/link/.

  10. Sentiment modifiers are taken into account.

  11. Because of the computational complexity and, consequently, the running time.

  12. The base classifiers used within the above classifiers were multinomial logistic regression and J48 decision trees.

  13. The literature reports 57 % to 80 % agreement in human judgment in this area and range.

  14. Tagged by dreamers themselves.
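
The distinction drawn in the notes between unigrams, bigrams, and co-occurrences, together with the decision to keep hyphens inside compound words, can be sketched as follows. This is an illustrative reading of those notes rather than the authors' code: the regular expression, the restriction to adjacent word pairs, the per-text frequency threshold, and the names `tokenize` and `extract_features` are all assumptions.

```python
import re
from collections import Counter

# Assumed tokenizer: lower-case alphabetic words, with internal hyphens kept
# so that a compound such as "anti-oxidant" stays a single term.
TOKEN = re.compile(r"[a-z]+(?:-[a-z]+)*")


def tokenize(text):
    return TOKEN.findall(text.lower())


def extract_features(text):
    tokens = tokenize(text)
    pairs = list(zip(tokens, tokens[1:]))

    # Unigrams: single terms that occur more than once (here, within the text).
    counts = Counter(tokens)
    unigrams = {term for term, n in counts.items() if n > 1}

    # Bigrams: ordered pairs of adjacent words.
    bigrams = Counter(pairs)

    # Co-occurrences: the same pairs with word order ignored.
    cooccurrences = Counter(tuple(sorted(pair)) for pair in pairs)

    return unigrams, bigrams, cooccurrences


text = "Anti-oxidant cats chase dogs and dogs chase cats"
unigrams, bigrams, cooccurrences = extract_features(text)
print(sorted(unigrams))
print(bigrams[("cats", "chase")], bigrams[("chase", "cats")])
print(cooccurrences[("cats", "chase")])
```

In this example the ordered pairs `("cats", "chase")` and `("chase", "cats")` are counted separately as bigrams, while the unordered co-occurrence count merges them into one feature.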

Corresponding author

Correspondence to Amir H. Razavi.

About this article

Cite this article

Razavi, A.H., Matwin, S., De Koninck, J. et al. Dream sentiment analysis using second order soft co-occurrences (SOSCO) and time course representations. J Intell Inf Syst 42, 393–413 (2014). https://doi.org/10.1007/s10844-013-0273-4

  • DOI: https://doi.org/10.1007/s10844-013-0273-4
