Abstract
The approach described in this paper explores the use of a semantic structured representation of sentences extracted from texts for multi-domain sentiment analysis. The presented algorithm is built upon a domain-based supervised approach that uses index-like structures for representing the information extracted from text. The algorithm extracts dependency parse relationships from the sentences contained in a training set. Such relationships are then aggregated into a semantic structure together with both polarity and domain information, which allows a more fine-grained representation of the learned sentiment information. When the polarity of a new text has to be computed, the text is converted into the same semantic representation, which is used (i) for detecting the domain to which the text belongs and, once the domain is assigned, (ii) for extracting the polarity from the index-like structure. First experiments performed by training the system on the Blitzer dataset demonstrated the feasibility of the proposed approach.
1 Introduction
Sentiment analysis is a natural language processing task whose aim is to classify documents according to the opinion (polarity) they express on a given subject [1]. Generally speaking, sentiment analysis aims at determining the attitude of a speaker or a writer with respect to a topic or the overall tonality of a document. This task has attracted considerable interest due to its wide range of applications. In recent years, the exponential growth of the Web as a medium for exchanging public opinions about events, facts, products, etc., has led to an extensive usage of sentiment analysis approaches, especially for marketing purposes.
By formalizing the sentiment analysis problem, a “sentiment” or “opinion” has been defined by [2] as a quintuple \((o_j, f_{jk}, so_{ijkl}, h_i, t_l)\),
where \(o_j\) is a target object, \(f_{jk}\) is a feature of the object \(o_j\), and \(so_{ijkl}\) is the sentiment value of the opinion of the opinion holder \(h_i\) on feature \(f_{jk}\) of object \(o_j\) at time \(t_l\). The value of \(so_{ijkl}\) can be positive (denoting a state of happiness, bliss, or satisfaction), negative (denoting a state of sorrow, dejection, or disappointment), neutral (no particular sentiment can be denoted), or a more granular rating. The term \(h_i\) encodes the opinion holder, and \(t_l\) is the time when the opinion is expressed.
Such an analysis may be document-based, where a positive, negative, or neutral sentiment is assigned to the entire document content, or sentence-based, where individual sentences are analyzed separately and classified according to the different polarity values. In the latter case, it is often desirable to find, with high precision, the entity attributes towards which the detected sentiment is directed. Depending on the scenario in which the opinion is needed, a document-based analysis may be preferable to a sentence-based one, or vice versa. In this work, we want to extract the general opinion of an entire document; therefore, our approach relies on a document-based analysis.
A further aspect that is important to take into account is that, in the classic sentiment analysis problem, the polarity of each document term is considered independently of the domain to which the document belongs. We illustrate the intuition behind domain-specific term polarity by considering the following example:
1. The sideboard is small and it is not able to contain a lot of stuff.
2. The small dimensions of this decoder allow to move it easily.
In these two sentences the adjective “small” is used in two different domains. In the first sentence, we considered the Furnishings domain, within which the polarity of the adjective “small” is clearly “negative”, because it highlights an issue of the described item. On the other hand, in the second sentence, where we considered the Electronics domain, the polarity of the same adjective may be considered “positive”. First attempts at exploring how term polarity is conditioned by the domain are presented in [3].
Unlike the approaches already discussed in the literature (presented in Sect. 2), we address the multi-domain sentiment analysis problem from a different perspective. Firstly, we extract semantic and linguistic relationships from document terms, and then we aggregate them in a structured representation where domain information, and the related polarities, are preserved. Such a structured representation is stored in an index-like repository (from now on simply referred to as the “index”). When the polarity of a new document has to be computed, its structured representation is built and, combined with domain information, is used for querying the index in order to estimate the polarity of the whole document.
The rest of the work is structured as follows. Section 2 presents a survey of works about sentiment analysis. Section 3 describes the proposed approach by explaining how texts are converted into a semantic structured representation, stored during the training phase, and exploited during the test phase. Section 4 reports the comparison between the presented approach and three baselines. Section 5 presents the results of the ESWC-2018 Semantic Sentiment Analysis challenge. Finally, Sect. 6 concludes the paper.
2 Related Work
The topic of sentiment analysis has been studied extensively in the literature [2], where several techniques have been proposed and validated.
Machine learning techniques are the most common approaches used for addressing this problem, given that any existing supervised method can be applied to sentiment classification. For instance, in [4], the authors compared the performance of Naive Bayes, Maximum Entropy, and Support Vector Machines in sentiment analysis on different features: considering only unigrams, bigrams, a combination of both, incorporating part-of-speech and position information, or taking only adjectives. Moreover, besides the use of standard machine learning methods, researchers have also proposed several custom techniques specifically for sentiment classification, such as the use of adapted score functions based on the evaluation of positive or negative words in product reviews [5], or the definition of weighting schemata for enhancing classification accuracy [6].
An obstacle to research in this direction is the need for labeled training data, whose preparation is a time-consuming activity. Therefore, in order to reduce the labeling effort, opinion words have been used in training procedures. In [7, 8], the authors used opinion words to label portions of informative examples for training the classifiers. Opinion words have also been exploited for improving the accuracy of sentiment classification, as presented in [9], where a framework incorporating lexical knowledge in supervised learning to enhance accuracy was proposed. Opinion words have also been used in unsupervised learning approaches like the one presented in [10].
Another research direction concerns the exploitation of discourse-analysis techniques: [11] discusses some discourse-based supervised and unsupervised approaches for opinion analysis, while in [12] the authors present an approach to identify discourse relations.
The approaches presented above are applied at the document level [13,14,15,16,17,18], i.e., the polarity value is assigned to the entire document content. However, in some cases, improving the accuracy of the sentiment classification requires a more fine-grained analysis of a document; hence, the sentiment classification of single sentences has to be performed. In the literature, we may find approaches ranging from the use of fuzzy logic [19,20,21,22,23] to the use of aggregation techniques [24] for computing the score aggregation of opinion words. In the case of sentence-level sentiment classification, two different sub-tasks have to be addressed: (i) determining whether the sentence is subjective or objective, and (ii) in case the sentence is subjective, determining whether the opinion expressed in it is positive, negative, or neutral. The task of classifying a sentence as subjective or objective, called “subjectivity classification”, has been widely discussed in the literature [25,26,27,28], and systems capable of identifying the opinion holder, target, and polarity have been presented [29]. Once subjective sentences are identified, the same methods as for sentiment classification may be applied. For example, in [30] the authors consider gradable adjectives for sentiment spotting, while in [31,32,33] the authors built models to identify some specific types of opinions.
In recent years, with the growth of product reviews, sentiment analysis techniques found an ideal testbed in marketing activities [34,35,36]. However, detecting the different opinions concerning the same product expressed in the same review became a challenging problem. Such a task has been faced by introducing “aspect” extraction approaches, able to extract from each sentence the aspect to which the opinion refers. In the literature, many approaches have been proposed: conditional random fields (CRF) [37], hidden Markov models (HMM) [38], sequential rule mining [39], dependency tree kernels [40], clustering [41], neural networks [42, 43], and genetic algorithms [44]. In [45, 46], two methods were proposed to extract both opinion words and aspects simultaneously by exploiting some syntactic relations between opinion words and aspects.
Particular attention should also be given to the application of sentiment analysis in social networks [47,48,49]. More and more often, people use social networks for expressing their moods concerning their latest purchase or, in general, new products. The social network environment opened up new challenges due to the different ways people express their opinions, as described by [50, 51], who mention “noisy data” as one of the biggest hurdles in analyzing social network texts.
One of the first studies on sentiment analysis on micro-blogging websites has been discussed in [52], where the authors present a distant supervision-based approach for sentiment classification.
At the same time, the social dimension of the Web opens up the opportunity to combine computer science and social sciences to better recognize, interpret, and process opinions and sentiments expressed over it. Such a multi-disciplinary approach has been called sentic computing [53]. Application domains where sentic computing has already shown its potential are the cognitive-inspired classification of images [54], of texts in natural language, and of handwritten text [55].
Finally, an interesting recent research direction is domain adaptation, as it has been shown that sentiment classification is highly sensitive to the domain from which the training data is extracted. A classifier trained using opinionated documents from one domain often performs poorly when it is applied or tested on opinionated documents from another domain, as we demonstrated through the example presented in Sect. 1. The reason is that words and even language constructs used in different domains for expressing opinions can be quite different. To make matters worse, the same word may have positive connotations in one domain and negative ones in another; therefore, domain adaptation is needed. In the literature, different approaches related to multi-domain sentiment analysis have been proposed. Briefly, two main categories may be identified: (i) the transfer of learned classifiers across different domains [3, 56, 57], and (ii) the use of propagation of labels through graph structures [19, 58,59,60].
All the approaches presented above are based on the use of statistical techniques for building sentiment models; the exploitation of semantic information is not taken into account. In this work, we propose a first version of a semantic-based approach that preserves the semantic relationships between the terms of each sentence in order to exploit them both for building the model and for estimating document polarity. The proposed approach, which falls into the multi-domain sentiment analysis category, does not use pre-determined polarity information associated with terms; instead, it learns such information directly from the domain-specific documents used for training the models.
3 The Approach
As introduced in Sect. 1, the proposed system is based on the implementation of an index-like approach relying on structured representations of documents. Such a representation is used both for preserving the domain information associated with each document and for estimating the polarity of unclassified ones. Document polarity is estimated through the computation of a Score Status Value (SSV) [61] representing the aggregation of the polarities estimated for each feature extracted from the document. In this section, the steps carried out for implementing our approach are presented.
3.1 Feature Extraction
The first task consists of detecting the features that are exploited for building the sentiment model. The proposed approach has been designed upon two main desiderata:
1. The need to preserve and exploit semantic relationships between document terms requires a structured representation of information able to address this issue. In particular, we want to store the linguistic information of each term together with its semantic relationships with the other ones;
2. The described approach addresses the problem of sentiment analysis in a multi-domain environment; therefore, each extracted feature has to enclose domain-specific information in order to exploit it during the estimation of document polarity.
Addressing the two pillars described above requires parsing raw texts in order to extract significant linguistic and semantic information. The proposed solution for extracting the set of features is based on the use of a natural language processing library, namely the Stanford CoreNLP toolkit [62].
For each document of the training set, we applied the Stanford parser to extract the term dependencies. Such dependencies are taken into account for preserving the semantics between terms in the structured representation of the document content.
As an example, let’s consider the following sentence:
“I came here to reflect my happiness by fishing.”
By applying the Stanford parser, we obtain the following list of dependencies between terms:
Each dependency is composed of three elements: the name of the “relation” (R), the “governor” (G), which is the first term of the dependency, and the “dependent” (D), which is the second one. First of all, we remove from the dependency list those containing a stop word (Footnote 1) as governor or dependent element. Exceptions are made when one of the two terms contained in a dependency is an adjective. From the dependency list presented above, the pruned list is the following:
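The pruning rule described above can be sketched in Python as follows; the stop-word list excerpt and the dependency triples are illustrative assumptions, not the actual Lucene/Solr list or the parser's exact output for this sentence:

```python
# Illustrative sketch of the dependency-pruning step. Helper names and the
# stop-word excerpt are hypothetical; the paper uses the Stanford parser
# and the Lucene/Solr stop-word list.
STOP_WORDS = {"i", "to", "my", "by", "here", "the", "and", "it", "is"}  # excerpt

def is_adjective(pos_tag):
    """Penn Treebank adjective tags: JJ, JJR, JJS."""
    return pos_tag.startswith("JJ")

def prune(dependencies, pos_tags):
    """Drop dependencies whose governor or dependent is a stop word,
    unless one of the two terms is an adjective."""
    kept = []
    for relation, governor, dependent in dependencies:
        has_stop = governor in STOP_WORDS or dependent in STOP_WORDS
        has_adj = (is_adjective(pos_tags.get(governor, ""))
                   or is_adjective(pos_tags.get(dependent, "")))
        if not has_stop or has_adj:
            kept.append((relation, governor, dependent))
    return kept

deps = [("nsubj", "came", "i"), ("dobj", "reflect", "happiness"),
        ("advmod", "came", "here")]
tags = {"came": "VBD", "reflect": "VB", "happiness": "NN",
        "i": "PRP", "here": "RB"}
print(prune(deps, tags))  # only ('dobj', 'reflect', 'happiness') survives
```

Note that the adjective exception is checked on the part-of-speech tags, so a dependency such as “amod” between a stop word and an adjective would survive the pruning.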
Then, for each dependency contained in the pruned list, we compile a set of “field - value” pairs. Each pair is a “feature” associated with the dependency extracted from the document. Table 1 shows, using as an example the dependency “dobj(reflect-5, happiness-7)”, the list of extracted features.
Three considerations explain the rationale behind the use of the presented set of six features.
- The choice of considering the governor and the dependent in both orders is to account for the possibility that the parser may produce different output based on how the text is written within the sentence. Such an order is also affected by the parser used. In our approach we decided to adopt the Stanford parser but, obviously, any parser producing a list of dependencies like the one presented above can be used.
- For the same reason, we decided to extract features stripped of the relation element, because different parsers may use different kinds of dependencies. The purpose of these features (the third and fourth ones) is to track the co-occurrence of terms independently of the relationship between them.
- Finally, the “G” and “D” features are used for backup purposes. Indeed, if only a small number of samples is available for training a particular model, the use of single terms allows applying a bag-of-words approach as a backup for computing document polarity. For these two features, only nouns, verbs, adverbs, or adjectives are considered.
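The three considerations above can be condensed into a small feature generator. The field names "RDG", "GD", and "DG" are assumptions inferred from the rationale (only "RGD", "G", and "D" appear explicitly in the text), so this is a sketch of the feature scheme rather than the authors' exact implementation:

```python
def extract_features(relation, governor, dependent, pos_tags):
    """Build the six field-value features for one pruned dependency:
    both term orders with the relation, both orders without it, and the
    two single-term backup features (content words only)."""
    features = {
        "RGD": f"{relation}-{governor}-{dependent}",  # full dependency
        "RDG": f"{relation}-{dependent}-{governor}",  # reversed term order
        "GD": f"{governor}-{dependent}",              # co-occurrence, no relation
        "DG": f"{dependent}-{governor}",
    }
    # Backup bag-of-words features: only nouns, verbs, adverbs, adjectives
    # (Penn Treebank tags starting with N, V, R, or J).
    for field, term in (("G", governor), ("D", dependent)):
        if pos_tags.get(term, "")[:1] in {"N", "V", "R", "J"}:
            features[field] = term
    return features

print(extract_features("dobj", "reflect", "happiness",
                       {"reflect": "VB", "happiness": "NN"}))
```

Running this on the example dependency yields, among the others, the feature “RGD - dobj-reflect-happiness” used later in Sect. 3.2.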
The set of features extracted from each dependency is given as input to the component that combines such features with both the polarity and the domain information in order to construct the final representation of each document.
3.2 Structured Representation Construction
Once all features have been extracted, they are passed to the component in charge of structuring and storing them in the model repository that, for simplicity, we call the “index”. As mentioned earlier, the domain and polarity information are associated with each feature in order to build its equivalent structured representation. The polarity associated with each feature contained in the model is the average of the polarities of the documents in which the feature occurs. This precaution is necessary for distinguishing the polarities that each feature may assume in different domains. Indeed, classic approaches based on the use of polarized vocabularies do not consider the possibility that a particular feature may assume different polarities depending on the context in which it occurs; an example has been presented in Sect. 1.
In light of this, the construction of the structured representation of each feature has to consider two aspects: (i) each feature may appear in different domains, and (ii) for each feature, an estimation of the polarity within each domain has to be computed.
Therefore, each feature is translated into the corresponding structured representation shown below. Considering as an example the feature “RGD - dobj-reflect-happiness”, we have the following structure:
The estimation of \(polarity_i\) values associated with each domain is done by analyzing only the explicit information extracted from the training set. Values are computed as:
where F is the feature taken into account, index i refers to domain \(D_i\) which the feature belongs to, n is the number of domains available in the training set, \(k_C^i\) is the arithmetic sum of the polarities observed for the feature F in the training set restricted to domain \(D_i\), and \(T_C^i\) is the number of instances of the training set, restricted to domain \(D_i\), in which feature F occurs.
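In other words, the per-domain polarity is the average polarity observed for the feature within that domain. Written with the symbols just defined (a reconstruction based on those definitions), the computation is:

$$\mathit{polarity}_i(F) = \frac{k_C^i}{T_C^i}, \qquad i = 1, \dots, n.$$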
Once all structured representations are built, they are stored in the repository. Such a repository represents a multi-domain model for sentiment analysis purposes.
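The index construction described in this section can be sketched as follows. The data layout (a map from feature value to per-domain polarity sum and occurrence count) and all names are assumptions derived from the description, not the authors' actual repository format:

```python
from collections import defaultdict

class SentimentIndex:
    """Minimal sketch of the multi-domain model repository ("index"):
    for every feature we accumulate, per domain, the sum of observed
    document polarities (k) and the number of occurrences (T)."""

    def __init__(self):
        # feature value -> domain -> [polarity_sum, occurrence_count]
        self._entries = defaultdict(lambda: defaultdict(lambda: [0.0, 0]))

    def add(self, feature, domain, document_polarity):
        """Record one occurrence of `feature` in a training document of
        the given domain; document_polarity is e.g. +1 or -1."""
        entry = self._entries[feature][domain]
        entry[0] += document_polarity
        entry[1] += 1

    def polarity(self, feature, domain):
        """Average polarity of `feature` within `domain` (k / T), or
        None if the feature was never observed in that domain."""
        entry = self._entries[feature].get(domain)
        if entry is None or entry[1] == 0:
            return None
        return entry[0] / entry[1]

index = SentimentIndex()
index.add("RGD-dobj-reflect-happiness", "books", 1)
index.add("RGD-dobj-reflect-happiness", "books", 1)
index.add("RGD-dobj-reflect-happiness", "books", -1)
print(index.polarity("RGD-dobj-reflect-happiness", "books"))  # (1+1-1)/3
```

Keeping the sum and count separate, rather than a single averaged value, lets the model be updated incrementally as new training documents arrive.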
3.3 Polarity Computation
When an unclassified document needs to be evaluated, a procedure similar to the one adopted for building the model is used for computing its polarity.
A document is given as input to the Stanford parser, and the extracted list of dependencies is pruned of those containing stop words. Then, for each valid dependency, we build the related structured representation and use it for estimating the polarity by analyzing the information contained in the model. The final document polarity is the average of the polarities estimated for each extracted dependency.
Let’s consider the following sentence:
“I feel good and I feel healthy.”
After the execution of the Stanford parser and the pruning of the exceeding dependencies using the same strategy described earlier, we obtain the following set of dependencies:
From these two dependencies, we generate the following two structures:
For each structure I presented above, for which the domain D is given, we compute the SSV representing the polarity of the structure I in the domain to which it belongs. The equation below shows how the SSV is computed,
where DP is the function extracting the polarity of the feature I for the domain D, and AVG denotes the averaging of all detected polarities.
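The SSV computation can be sketched as a lookup-and-average over the learned model. The feature strings and polarity values below are illustrative (the actual dependency names the parser emits for the example sentence are not shown here), and the neutral fallback for documents with no matching features is an assumption:

```python
def ssv(structures, domain, model):
    """Score Status Value of a document: the average of the learned
    per-domain polarities (the DP lookups) of its structures.
    `model` maps (feature, domain) -> learned polarity in [-1, 1]."""
    polarities = [model[(s, domain)] for s in structures
                  if (s, domain) in model]  # DP: skip unseen features
    if not polarities:
        return 0.0  # no evidence in this domain: neutral fallback (assumption)
    return sum(polarities) / len(polarities)  # AVG

# Hypothetical model entries for the "I feel good and I feel healthy" example.
model = {("RGD-acomp-feel-good", "health"): 1.0,
         ("RGD-acomp-feel-healthy", "health"): 0.5}
score = ssv(["RGD-acomp-feel-good", "RGD-acomp-feel-healthy"], "health", model)
print(score)  # 0.75 -> positive overall
```

Because unseen (feature, domain) pairs are simply skipped, a sparse model degrades gracefully: the SSV is computed only over the evidence actually available for the detected domain.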
4 Experimental Evaluation
In this section, we present the results obtained from our experimental campaign, where we compared our representation in different settings.
Dataset Construction and Baselines. The training and testing of the system have been done on two different datasets. For creating the training model, we built structured document representations by using the reviews contained in the Blitzer dataset; in particular, we used the balanced version of the dataset in order to have the same number of positive and negative samples. Concerning the test operation, we created a test set of 32,000 reviews compiled by using the same strategy used for building the Blitzer dataset (Footnote 2). The test set is also balanced with respect to the number of positive and negative opinions. The same philosophy has been used for the domains: for each of the 16 domains used in the test set, we had 1,000 positive and 1,000 negative reviews.
Our approach (Structured Domain Dependent, SDD) has been compared with three baselines:
- Most Frequent Polarity: the accuracy obtained by the system if it guesses the same polarity for all samples contained in the test set.
- Structured Domain Independent: the accuracy obtained by using the proposed structured representation without considering domain information.
- Bag-Of-Words Domain Dependent: the accuracy obtained by using the classic statistical bag-of-words approach while also considering domain information.
Results and Discussion. Table 2 shows the results obtained by the three baselines and by the proposed approach. The first column contains the name of the approach, while the second one reports the accuracy obtained on the test set.
The results show that the proposed approach leads to better results than all the baselines. Besides this, there is also a significant difference between the accuracies obtained by using domain-dependent features (the BDD and SDD approaches) and the one obtained without considering domain information.
Focusing on the two approaches exploiting domain information, Table 3 reports the detailed accuracy obtained on each domain. The first column contains the name of the domain, the second column the number of features for each domain, and the last two columns the accuracies obtained by the BDD and SDD approaches, respectively.
By observing the results reported in Table 3, no particular correlation between the number of features and the accuracy of the approach can be noticed. Unexpectedly, the worst result is obtained for the domain having the highest number of features, while one of the best results, obtained on the “tools_hardware” domain, is achieved with a very low number of features compared to the others. One possible reason may be the significant presence, in the set of documents used for building the model, of features having uncertain polarity. Indeed, if many features are used in both positive and negative contexts, it is difficult for the system to exploit such information during the test phase for estimating document polarity. Further investigation in this direction may clarify this aspect.
Finally, we may notice that for two domains, “gourmet_food” and “baby”, the bag-of-words approach outperforms the semantic one.
5 ESWC-2018 SSA Challenge Tasks #1 and #2
Tables 4 and 5 show the full results of Tasks #1 and #2 participants.
Approach Limits. As mentioned at the end of Sect. 2, the approach presented in this paper is a first attempt at exploring the use of structured representations of documents for addressing the sentiment analysis problem. For this reason, we performed a critical analysis of our work in order to highlight its limits and to outline a roadmap for future implementations. In particular, we identified three directions for extending the proposed approach:
- Improve dependency pruning: in the feature extraction process, we pruned part of the dependencies extracted by the Stanford parser. In light of the results reported in Table 3, we inferred that having a huge number of features is not a prerequisite for obtaining higher results. Therefore, a more restrictive policy should be implemented for pruning dependencies, trying to retain the most significant features while discarding the ones causing information overlap between domains.
- Language coverage: a typical problem affecting the construction of language models is their language coverage. Indeed, without a large corpus for training the system, information about a significant number of terms might be excluded. This issue is strictly connected with the next one and may share a possible solution with it.
- Improve the semantic aspect: one of the possibilities for addressing the problem of language coverage is the adoption of external semantic resources, for instance WordNet, for extending the meaning of each feature. This way, we would be able to reduce the total number of features, thanks to the use of a concept-based representation of each feature instead of a term-based one, and, at the same time, to increase the language coverage. Working in this direction will mean that the current structured representation will have to be revised accordingly.
6 Conclusion
In this paper, we described a system exploiting a structured representation of documents for the problem of multi-domain sentiment analysis. Even if the representation used for structuring documents and the metric adopted for estimating document polarity are quite simple, the system obtained reasonable performance in the provided evaluation. Future work will address the possibility of exploiting more sophisticated metrics that consider the membership of a document in a certain domain not in a binary but in a fuzzy fashion, measuring some sort of semantic relatedness of the sentence under test with each domain and using such measures as weights in the polarity detection phase. Moreover, we intend to explore the integration of knowledge bases in order to move toward a more cognitive technique able to improve the language coverage of the approach.
Notes
- 1.
The list of stop words used in this work is the one provided by Apache with the Lucene and Solr packages.
- 2.
The test set is available at https://goo.gl/siOJbZ.
References
Pang, B., Lee, L., Vaithyanathan, S.: Thumbs up? sentiment classification using machine learning techniques. In: Proceedings of EMNLP, Philadelphia, Association for Computational Linguistics, pp. 79–86 (July 2002)
Liu, B., Zhang, L.: A survey of opinion mining and sentiment analysis. In: Aggarwal, C.C., Zhai, C.X. (eds.) Mining Text Data, pp. 415–463. Springer, Berlin (2012)
Blitzer, J., Dredze, M., Pereira, F.: Biographies, bollywood, boom-boxes and blenders: domain adaptation for sentiment classification. In: ACL, pp. 187–205 (2007)
Pang, B., Lee, L.: A sentimental education: sentiment analysis using subjectivity summarization based on minimum cuts. In: ACL, pp. 271–278 (2004)
Dave, K., Lawrence, S., Pennock, D.M.: Mining the peanut gallery: opinion extraction and semantic classification of product reviews. In: WWW, pp. 519–528 (2003)
Paltoglou, G., Thelwall, M.: A study of information retrieval weighting schemes for sentiment analysis. In: ACL, pp. 1386–1395 (2010)
Tan, S., Wang, Y., Cheng, X.: Combining learn-based and lexicon-based techniques for sentiment detection without using labeled examples. In: SIGIR, pp. 743–744 (2008)
Qiu, L., Zhang, W., Hu, C., Zhao, K.: Selc: a self-supervised model for sentiment classification. In: CIKM, pp. 929–936 (2009)
Melville, P., Gryc, W., Lawrence, R.D.: Sentiment analysis of blogs by combining lexical knowledge with text classification. In: KDD, pp. 1275–1284 (2009)
Taboada, M., Brooke, J., Tofiloski, M., Voll, K.D., Stede, M.: Lexicon-based methods for sentiment analysis. Comput. Linguist. 37(2), 267–307 (2011)
Somasundaran, S.: Discourse-level relations for Opinion Analysis. Ph.D. thesis, University of Pittsburgh (2010)
Wang, H., Zhou, G.: Topic-driven multi-document summarization. In: IALP, pp. 195–198 (2010)
Dragoni, M.: Shellfbk: an information retrieval-based system for multi-domain sentiment analysis. In: Proceedings of the 9th International Workshop on Semantic Evaluation. SemEval ’2015, Denver, Colorado, pp. 502–509. Association for Computational Linguistics (June 2015)
Petrucci, G., Dragoni, M.: An information retrieval-based system for multi-domain sentiment analysis. In: Gandon, F., Cabrio, E., Stankovic, M., Zimmermann, A. (eds.) SemWebEval 2015. CCIS, vol. 548, pp. 234–243. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-25518-7_20
Rexha, A., Kröll, M., Dragoni, M., Kern, R.: Exploiting propositions for opinion mining. In: Sack, H., Dietze, S., Tordai, A., Lange, C. (eds.) SemWebEval 2016. CCIS, vol. 641, pp. 121–125. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46565-4_9
Federici, M., Dragoni, M.: A knowledge-based approach for aspect-based opinion mining. In: Sack, H., Dietze, S., Tordai, A., Lange, C. (eds.) SemWebEval 2016. CCIS, vol. 641, pp. 141–152. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46565-4_11
Rexha, A., Kröll, M., Dragoni, M., Kern, R.: Opinion mining with a clause-based approach. In: Dragoni, M., Solanki, M., Blomqvist, E. (eds.) SemWebEval 2017. CCIS, vol. 769, pp. 166–175. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-69146-6_15
Federici, M., Dragoni, M.: Aspect-based opinion mining using knowledge bases. In: Dragoni, M., Solanki, M., Blomqvist, E. (eds.) SemWebEval 2017. CCIS, vol. 769, pp. 133–147. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-69146-6_13
Dragoni, M., Tettamanzi, A.G., da Costa Pereira, C.: Propagating and aggregating fuzzy polarities for concept-level sentiment analysis. Cogn. Comput. 7(2), 186–197 (2015)
Dragoni, M., Tettamanzi, A.G.B., da Costa Pereira, C.: A fuzzy system for concept-level sentiment analysis. In: Presutti, V., Stankovic, M., Cambria, E., Cantador, I., Di Iorio, A., Di Noia, T., Lange, C., Reforgiato Recupero, D., Tordai, A. (eds.) SemWebEval 2014. CCIS, vol. 475, pp. 21–27. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-12024-9_2
Petrucci, G., Dragoni, M.: The IRMUDOSA system at ESWC-2016 challenge on semantic sentiment analysis. In: Sack, H., Dietze, S., Tordai, A., Lange, C. (eds.) SemWebEval 2016. CCIS, vol. 641, pp. 126–140. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46565-4_10
Dragoni, M., Petrucci, G.: A fuzzy-based strategy for multi-domain sentiment analysis. Int. J. Approx. Reason. 93, 59–73 (2018)
Petrucci, G., Dragoni, M.: The IRMUDOSA system at ESWC-2017 challenge on semantic sentiment analysis. In: Dragoni, M., Solanki, M., Blomqvist, E. (eds.) SemWebEval 2017. CCIS, vol. 769, pp. 148–165. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-69146-6_14
da Costa Pereira, C., Dragoni, M., Pasi, G.: A prioritized “and” aggregation operator for multidimensional relevance assessment. In: Serra, R., Cucchiara, R. (eds.) AI*IA 2009. LNCS (LNAI), vol. 5883, pp. 72–81. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-10291-2_8
Federici, M., Dragoni, M.: Towards unsupervised approaches for aspects extraction. In: Dragoni, M., Recupero, D.R., Denecke, K., Deng, Y., Declerck, T. (eds.) Joint Proceedings of the 2nd Workshop on Emotions, Modality, Sentiment Analysis and the Semantic Web and the 1st International Workshop on Extraction and Processing of Rich Semantics from Medical Texts co-located with ESWC 2016, Heraklion, Greece, May 29, 2016. Volume 1613 of CEUR Workshop Proceedings (2016). www.CEUR-WS.org
Federici, M., Dragoni, M.: A branching strategy for unsupervised aspect-based sentiment analysis. In: Dragoni, M., Recupero, D.R. (eds.) Proceedings of the 3rd International Workshop at ESWC on Emotions, Modality, Sentiment Analysis and the Semantic Web co-located with 14th ESWC 2017, Portroz, Slovenia, May 28, 2017. Volume 1874 of CEUR Workshop Proceedings (2017). www.CEUR-WS.org
Riloff, E., Patwardhan, S., Wiebe, J.: Feature subsumption for opinion analysis. In: EMNLP, pp. 440–448 (2006)
Wilson, T., Wiebe, J., Hwa, R.: Recognizing strong and weak opinion clauses. Comput. Intell. 22(2), 73–99 (2006)
Palmero Aprosio, A., Corcoglioniti, F., Dragoni, M., Rospocher, M.: Supervised opinion frames detection with RAID. In: Gandon, F., Cabrio, E., Stankovic, M., Zimmermann, A. (eds.) SemWebEval 2015. CCIS, vol. 548, pp. 251–263. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-25518-7_22
Hatzivassiloglou, V., Wiebe, J.: Effects of adjective orientation and gradability on sentence subjectivity. In: COLING, pp. 299–305 (2000)
Kim, S.M., Hovy, E.H.: Crystal: analyzing predictive opinions on the web. In: EMNLP-CoNLL, pp. 1056–1064 (2007)
Rexha, A., Kröll, M., Dragoni, M., Kern, R.: Polarity classification for target phrases in tweets: a Word2Vec approach. In: Sack, H., Rizzo, G., Steinmetz, N., Mladenić, D., Auer, S., Lange, C. (eds.) ESWC 2016. LNCS, vol. 9989, pp. 217–223. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-47602-5_40
Rexha, A., Kröll, M., Kern, R., Dragoni, M.: An embedding approach for microblog polarity classification. In: Dragoni, M., Recupero, D.R. (eds.) Proceedings of the 3rd International Workshop on Emotions, Modality, Sentiment Analysis and the Semantic Web co-located with 14th ESWC 2017, Portorož, Slovenia, May 28, 2017. CEUR Workshop Proceedings, vol. 1874 (2017). www.CEUR-WS.org
Recupero, D.R., Dragoni, M., Presutti, V.: ESWC 15 challenge on concept-level sentiment analysis. In: Gandon, F., Cabrio, E., Stankovic, M., Zimmermann, A. (eds.) SemWebEval 2015. CCIS, vol. 548, pp. 211–222. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-25518-7_18
Dragoni, M., Reforgiato Recupero, D.: Challenge on fine-grained sentiment analysis within ESWC2016. In: Sack, H., Dietze, S., Tordai, A., Lange, C. (eds.) SemWebEval 2016. CCIS, vol. 641, pp. 79–94. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46565-4_6
Dragoni, M., Solanki, M., Blomqvist, E. (eds.): SemWebEval 2017. CCIS, vol. 769. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-69146-6
Jakob, N., Gurevych, I.: Extracting opinion targets in a single and cross-domain setting with conditional random fields. In: EMNLP, pp. 1035–1045 (2010)
Jin, W., Ho, H.H., Srihari, R.K.: Opinionminer: a novel machine learning system for web opinion mining and extraction. In: KDD, pp. 1195–1204 (2009)
Liu, B., Hu, M., Cheng, J.: Opinion observer: analyzing and comparing opinions on the web. In: WWW, pp. 342–351 (2005)
Wu, Y., Zhang, Q., Huang, X., Wu, L.: Phrase dependency parsing for opinion mining. In: EMNLP, pp. 1533–1541 (2009)
Su, Q., Xu, X., Guo, H., Guo, Z., Wu, X., Zhang, X., Swen, B., Su, Z.: Hidden sentiment association in Chinese web opinion mining. In: WWW, pp. 959–968 (2008)
Dragoni, M.: NEUROSENT-PDI at SemEval-2018 Task 1: leveraging a multi-domain sentiment model for inferring polarity in micro-blog text. In: Apidianaki, M., Mohammad, S.M., May, J., Shutova, E., Bethard, S., Carpuat, M. (eds.) Proceedings of The 12th International Workshop on Semantic Evaluation, SemEval@NAACL-HLT, New Orleans, Louisiana, June 5–6, 2018, pp. 102–108. Association for Computational Linguistics (2018)
Dragoni, M.: NEUROSENT-PDI at SemEval-2018 Task 3: understanding irony in social networks through a multi-domain sentiment model. In: Apidianaki, M., Mohammad, S.M., May, J., Shutova, E., Bethard, S., Carpuat, M. (eds.) Proceedings of The 12th International Workshop on Semantic Evaluation, SemEval@NAACL-HLT, New Orleans, Louisiana, June 5–6, 2018, pp. 512–519. Association for Computational Linguistics (2018)
Dragoni, M., Azzini, A., Tettamanzi, A.G.B.: A novel similarity-based crossover for artificial neural network evolution. In: Schaefer, R., Cotta, C., Kołodziej, J., Rudolph, G. (eds.) PPSN 2010. LNCS, vol. 6238, pp. 344–353. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-15844-5_35
Qiu, G., Liu, B., Bu, J., Chen, C.: Opinion word expansion and target extraction through double propagation. Comput. Linguist. 37(1), 9–27 (2011)
Dragoni, M., da Costa Pereira, C., Tettamanzi, A.G.B., Villata, S.: Combining argumentation and aspect-based opinion mining: the smack system. AI Commun. 31(1), 75–95 (2018)
Dragoni, M.: A three-phase approach for exploiting opinion mining in computational advertising. IEEE Intell. Syst. 32(3), 21–27 (2017)
Dragoni, M., Petrucci, G.: A neural word embeddings approach for multi-domain sentiment analysis. IEEE Trans. Affect. Comput. 8(4), 457–470 (2017)
Dragoni, M.: Computational advertising in social networks: an opinion mining-based approach. In: Haddad, H.M., Wainwright, R.L., Chbeir, R. (eds.) Proceedings of the 33rd Annual ACM Symposium on Applied Computing, SAC 2018, Pau, France, April 09–13, 2018, pp. 1798–1804. ACM (2018)
Barbosa, L., Feng, J.: Robust sentiment detection on twitter from biased and noisy data. In: COLING (Posters), pp. 36–44 (2010)
Bermingham, A., Smeaton, A.F.: Classifying sentiment in microblogs: is brevity an advantage? In: CIKM, pp. 1833–1836 (2010)
Go, A., Bhayani, R., Huang, L.: Twitter sentiment classification using distant supervision. CS224N Project Report, Stanford University (2009)
Cambria, E., Hussain, A.: Sentic computing: a common-sense-based framework for concept-level sentiment analysis (2015)
Cambria, E., Hussain, A.: Sentic album: content-, concept-, and context-based online personal photo management system. Cogn. Comput. 4(4), 477–496 (2012)
Wang, Q.F., Cambria, E., Liu, C.L., Hussain, A.: Common sense knowledge for handwritten Chinese recognition. Cogn. Comput. 5(2), 234–242 (2013)
Pan, S.J., Ni, X., Sun, J.T., Yang, Q., Chen, Z.: Cross-domain sentiment classification via spectral feature alignment. In: WWW, pp. 751–760 (2010)
Yoshida, Y., Hirao, T., Iwata, T., Nagata, M., Matsumoto, Y.: Transfer learning for multiple-domain sentiment analysis: identifying domain dependent/independent word polarity. In: AAAI, pp. 1286–1291 (2011)
Ponomareva, N., Thelwall, M.: Semi-supervised vs. cross-domain graphs for sentiment analysis. In: RANLP, pp. 571–578 (2013)
Huang, S., Niu, Z., Shi, C.: Automatic construction of domain-specific sentiment lexicon based on constrained label propagation. Knowl.-Based Syst. 56, 191–200 (2014)
Dragoni, M., da Costa Pereira, C., Tettamanzi, A.G.B., Villata, S.: Smack: an argumentation framework for opinion mining. In: Kambhampati, S. (ed.) Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, IJCAI 2016, New York, NY, USA, 9–15 July 2016, pp. 4242–4243. IJCAI/AAAI Press (2016)
da Costa Pereira, C., Dragoni, M., Pasi, G.: Multidimensional relevance: prioritized aggregation in a personalized information retrieval setting. Inf. Process. Manag. 48(2), 340–357 (2012)
Manning, C.D., Surdeanu, M., Bauer, J., Finkel, J., Bethard, S.J., McClosky, D.: The Stanford CoreNLP natural language processing toolkit. In: Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, Baltimore, Maryland, pp. 55–60. Association for Computational Linguistics (June 2014)
© 2018 Springer Nature Switzerland AG
Dragoni, M. (2018). The FeatureSent System at ESWC-2018 Challenge on Semantic Sentiment Analysis. In: Buscaldi, D., Gangemi, A., Reforgiato Recupero, D. (eds) Semantic Web Challenges. SemWebEval 2018. Communications in Computer and Information Science, vol 927. Springer, Cham. https://doi.org/10.1007/978-3-030-00072-1_17
Print ISBN: 978-3-030-00071-4
Online ISBN: 978-3-030-00072-1