Automatic sentiment-oriented summarization of multi-documents using soft computing

  • Methodologies and Application
Soft Computing

Abstract

This paper presents ASMUS, a method for automatic sentiment-oriented summarization of multi-documents using soft computing. It integrates two main phases: sentiment analysis and sentiment summarization. The sentiment analysis phase combines multiple strategies to address three drawbacks: (1) the limited word coverage of any individual lexicon; (2) contextual polarity; and (3) sentence types. The sentiment summarization phase is a graph-based ranking model that integrates sentiment information with statistical and linguistic methods to improve sentence ranking. We found that current methods suffer from two problems: (1) when comparing two sentences that share a similar bag-of-words, they do not consider the semantic and syntactic information needed to capture meaning; and (2) vocabulary mismatch (lexical gaps). ASMUS also accounts for content coverage and redundancy. We conduct experiments on the Document Understanding Conference (DUC) datasets, and the results indicate that ASMUS performs well in sentiment-oriented summarization.
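
To make the idea of a sentiment-aware, graph-based ranking model with redundancy control more concrete, the following is a minimal sketch and not the ASMUS implementation. Several things here are assumptions made for illustration: the toy lexicon, the damping factor, the overlap threshold, the function names, and the use of plain bag-of-words Jaccard overlap as the edge weight (ASMUS itself is described as using semantic and syntactic information, which this sketch does not attempt). Scores are computed with a PageRank-style iteration whose teleport vector favours sentences carrying more sentiment, and a greedy pass then skips sentences that overlap too much with ones already selected.

```python
import re

# Hypothetical toy lexicon; a real system would combine several lexicons.
TOY_LEXICON = {"good": 1.0, "great": 1.0, "bad": -1.0, "poor": -1.0}


def tokens(sentence):
    """Lower-cased set of word tokens (a plain bag-of-words view)."""
    return set(re.findall(r"[a-z']+", sentence.lower()))


def jaccard(a, b):
    """Jaccard overlap between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def sentiment_strength(words):
    """Sum of absolute polarity values found in the toy lexicon."""
    return sum(abs(TOY_LEXICON.get(w, 0.0)) for w in words)


def rank_sentences(sentences, damping=0.85, iterations=50):
    """PageRank-style scores with a sentiment-biased teleport vector."""
    bags = [tokens(s) for s in sentences]
    n = len(bags)
    # Edge weights: overlap between distinct sentences, zero on the diagonal.
    sim = [[jaccard(bags[i], bags[j]) if i != j else 0.0 for j in range(n)]
           for i in range(n)]
    out = [sum(row) for row in sim]
    # Teleport vector: one unit per sentence plus its sentiment strength.
    prior = [1.0 + sentiment_strength(b) for b in bags]
    total = sum(prior)
    prior = [p / total for p in prior]
    scores = prior[:]
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Isolated sentences (out[j] == 0) pass no score to others.
            inflow = sum(scores[j] * sim[j][i] / out[j]
                         for j in range(n) if out[j] > 0)
            new_scores.append((1 - damping) * prior[i] + damping * inflow)
        scores = new_scores
    return scores


def select_summary(sentences, k=2, max_overlap=0.5):
    """Pick up to k top-ranked sentences, skipping near-duplicates."""
    scores = rank_sentences(sentences)
    order = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    chosen = []
    for i in order:
        bag = tokens(sentences[i])
        if all(jaccard(bag, tokens(sentences[j])) <= max_overlap for j in chosen):
            chosen.append(i)
        if len(chosen) == k:
            break
    # Return the selected sentences in their original document order.
    return [sentences[i] for i in sorted(chosen)]
```

Calling `select_summary` on a list of sentences returns at most `k` of them in original order. Swapping `jaccard` for a similarity measure that uses word order or a semantic resource would change only the edge weights and the redundancy check.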

Notes

  1. http://dev.mysql.com/doc/refman/5.5/en/fulltext-stopwords.html.

  2. http://jmlr.csail.mit.edu/papers/volume5/lewis04a/a11-smart-stop-list/english.stop.

  3. http://www.cs.cmu.edu/~mccallum/bow/rainbow/.

  4. We use the negation words collected in Kolchyna et al. (2015) as a basic set, such as “no”, “not”, “don’t”, “hardly”, “none”, “never”, “are not”, “was not”, “did not”, “seldom”, “nothing”, “isn’t”, ….

  5. Such as “not only”, “not wholly”, “not all”, “not just”, “not quite”, “not least”, “no question”, ….

  6. Such as “but”, “with the exception of”, “except that”, “except for”, “however”, “yet”, “unfortunately”, ….

  7. We use the words collected in Abdi et al. (2016) as a basic set, such as “therefore”, “thus”, “consequently”, “hence”, “as a result”, “to conclude”, “in conclusion”, “in short”, ….

  8. https://en.wikipedia.org/wiki/Normalization_(statistics).

  9. http://stackoverflow.com/questions/10364575/normalization-in-variable-range-x-y-in-matlab.

  10. http://duc.nist.gov.
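
As an illustration of how the word lists in notes 4 and 5 and the range normalization referenced in notes 8 and 9 might be used, here is a minimal sketch. It is not taken from the paper. The two toy lexicons, the three-word negation window, the choice to flip polarity after a negation word, the reading of the note 5 phrases as expressions that do not negate, and all function names are assumptions made for this example.

```python
import re

# Samples copied from notes 4 and 5; the full sets are in the cited papers.
NEGATIONS = {"no", "not", "don't", "hardly", "none", "never",
             "seldom", "nothing", "isn't"}
# Assumption: phrases from note 5 are treated as not negating what follows.
NON_NEGATING = {"not only", "not wholly", "not all", "not just",
                "not quite", "not least", "no question"}

# Hypothetical toy lexicons standing in for real sentiment lexicons.
LEXICON_A = {"good": 0.8, "bad": -0.7}
LEXICON_B = {"good": 0.6, "excellent": 1.0}


def word_polarity(word):
    """Average the scores of every lexicon that covers the word."""
    hits = [lex[word] for lex in (LEXICON_A, LEXICON_B) if word in lex]
    return sum(hits) / len(hits) if hits else 0.0


def sentence_polarity(sentence, window=3):
    """Sum word polarities, flipping those that follow a negation word."""
    text = sentence.lower().replace("’", "'")
    words = re.findall(r"[a-z']+", text)
    flip_until = -1  # index of the last word affected by a negation
    score = 0.0
    for i, word in enumerate(words):
        bigram = " ".join(words[i:i + 2])
        if word in NEGATIONS and bigram not in NON_NEGATING:
            flip_until = i + window
            continue
        polarity = word_polarity(word)
        score += -polarity if i <= flip_until else polarity
    return score


def rescale(values, low=0.0, high=1.0):
    """Min-max normalization of values into the range [low, high]."""
    smallest, largest = min(values), max(values)
    if largest == smallest:
        # All values are equal, so map everything to the lower bound.
        return [low for _ in values]
    span = largest - smallest
    return [low + (v - smallest) * (high - low) / span for v in values]
```

Averaging over the lexicons that contain a word is one simple way to widen coverage beyond a single lexicon; `rescale` can then bring sentence scores from different features onto a common range before they are combined.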

References

  • Abdi A, Idris N (2014) Automated summarization assessment system: quality assessment without a reference summary. In: The international conference on advances in applied science and environmental engineering—ASEE 2014. IRED Press

  • Abdi SA, Idris N (2014b) An analysis on student-written summaries: automatic assessment of summary writing. Int J Enhanc Res Sci Technol Eng 3:466–472

  • Abdi A, Idris N, Alguliev RM, Aliguliyev RM (2015a) Automatic summarization assessment through a combination of semantic and syntactic information for intelligent educational systems. Inf Process Manag 51:340–358

  • Abdi A, Idris N, Alguliyev RM, Aliguliyev RM (2015b) Query-based multi-documents summarization using linguistic knowledge and content word expansion. Soft Comput. https://doi.org/10.1007/s00500-015-1881-4

  • Abdi A, Idris N, Alguliyev RM, Aliguliyev RM (2016) An automated summarization assessment algorithm for identifying summarizing strategies. PLoS ONE 11:e0145809

  • Abdi A, Shamsuddin SM, Aliguliyev RM (2018a) QMOS: query-based multi-documents opinion-oriented summarization. Inf Process Manag 54:318–338

  • Abdi A, Shamsuddin SM, Hasan S, Piran J (2018b) Machine learning-based multi-documents sentiment-oriented summarization using linguistic treatment. Expert Syst Appl 109:66–85

  • Alfaro C, Cano-Montero J, Gómez J, Moguerza JM, Ortega F (2016) A multi-stage method for content classification and opinion mining on weblog comments. Ann Oper Res 236:197–213

  • Alguliyev RM, Aliguliyev RM, Isazade NR (2015) An unsupervised approach to generating generic summaries of documents. Appl Soft Comput 34:236–250

  • Baccianella S, Esuli A, Sebastiani F (2010) SentiWordNet 3.0: an enhanced lexical resource for sentiment analysis and opinion mining. In: LREC, pp 2200–2204

  • Bahrainian S-A, Dengel A (2013) Sentiment analysis and summarization of twitter data. In: IEEE 16th international conference on computational science and engineering (CSE). IEEE, pp 227–234

  • Balahur A, Kabadjov M, Steinberger J, Steinberger R, Montoyo A (2012) Challenges and solutions in the opinion summarization of user-generated content. J Intell Inf Syst 39:375–398

  • Cambria E, Poria S, Bajpai R, Schuller BW (2016) SenticNet 4: a semantic resource for sentiment analysis based on conceptual primitives. In: COLING, pp 2666–2677

  • Chen T, Xu R, He Y, Wang X (2017) Improving sentiment analysis via sentence type classification using BiLSTM-CRF and CNN. Expert Syst Appl 72:221–230

  • Cohen J (1968) Weighted kappa: nominal scale agreement provision for scaled disagreement or partial credit. Psychol Bull 70:213

  • Deshwal A, Sharma SK (2016) Twitter sentiment analysis using various classification algorithms. In: 5th International conference on reliability, infocom technologies and optimization (trends and future directions) (ICRITO). IEEE, pp 251–257

  • Di Capua M, Petrosino A (2016) A deep learning approach to deal with data uncertainty in sentiment analysis. In: International workshop on fuzzy logic and applications. Springer, pp 172–184

  • Edmundson HP (1969) New methods in automatic extracting. J ACM (JACM) 16:264–285

  • Ferreira R, de Souza Cabral L, Freitas F, Lins RD, de França Silva G, Simske SJ, Favaro L (2014) A multi-document summarization system based on statistics and linguistic treatment. Expert Syst Appl 41:5780–5787

  • Fleiss JL (1971) Measuring nominal scale agreement among many raters. Psychol Bull 76:378

  • Gambhir M, Gupta V (2017) Recent automatic text summarization techniques: a survey. Artif Intell Rev 47:1–66

  • Gupta V, Lehal GS (2010) A survey of text summarization extractive techniques. J Emerg Technol Web Intell 2:258–268

  • Gupta P, Tiwari R, Robert N (2016) Sentiment analysis and text summarization of online reviews: a survey. In: International conference on communication and signal processing (ICCSP). IEEE, pp 0241–0245

  • Hu M, Liu B (2004) Mining and summarizing customer reviews. In: Proceedings of the tenth ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 168–177

  • Hu Y-H, Chen Y-L, Chou H-L (2017) Opinion mining from online hotel reviews: a text summarization approach. Inf Process Manag 53:436–449

  • Hung C, Chen S-J (2016) Word sense disambiguation based sentiment lexicons for sentiment classification. Knowl Based Syst 110:224–232

  • Jaccard P (1912) The distribution of the flora in the alpine zone. New Phytol 11:37–50

  • Kabadjov M, Balahur A, Boldrini E (2009) Sentiment intensity: is it a good summary indicator? In: Language and technology conference. Springer, pp 203–212

  • Khan FH, Qamar U, Bashir S (2016) SWIMS: semi-supervised subjective feature weighting and intelligent model selection for sentiment analysis. Knowl Based Syst 100:97–111

  • Kim S, Calvo R (2011) Sentiment-oriented summarisation of peer reviews. In: Artificial intelligence in education. Springer, pp 491–493

  • Kiyoumarsi F (2015) Evaluation of automatic text summarizations based on human summaries. Procedia Soc Behav Sci 192:83–91

  • Kolchyna O, Souza TT, Treleaven P, Aste T (2015) Twitter sentiment analysis: Lexicon method, machine learning method and their combination. arXiv preprint: arXiv:1507.00955

  • Landauer TK (2002) On the computational basis of learning and cognition: arguments from LSA. Psychol Learn Motiv 41:43–84

  • Landis JR, Koch GG (1977) The measurement of observer agreement for categorical data. Biometrics 33:159–174

  • Li Y, McLean D, Bandar ZA, O’Shea JD, Crockett K (2006) Sentence similarity based on semantic nets and corpus statistics. IEEE Trans Knowl Data Eng 18:1138–1150

  • Lin C-Y (2004) Rouge: a package for automatic evaluation of summaries. In: Text summarization branches out: proceedings of the ACL-04 workshop, pp 74–81

  • Lin C-Y, Hovy E (1997) Identifying topics by position. In: Proceedings of the fifth conference on applied natural language processing. Association for Computational Linguistics, pp 283–290

  • Liu B (2012) Sentiment analysis and opinion mining. Synth Lect Hum Lang Technol 5:1–167

  • Lloret E, Saggion H, Palomar M (2010) Experiments on summary-based opinion classification. In: Proceedings of the NAACL HLT 2010 workshop on computational approaches to analysis and generation of emotion in text. Association for Computational Linguistics, pp 107–115

  • Mendoza M, Bonilla S, Noguera C, Cobos C, León E (2014) Extractive single-document summarization based on genetic operators and guided local search. Expert Syst Appl 41:4158–4169

  • Mishra R, Bian J, Fiszman M, Weir CR, Jonnalagadda S, Mostafa J, Del Fiol G (2014) Text summarization in the biomedical domain: a systematic review of recent research. J Biomed Inform 52:457–467

  • Mohammad SM, Kiritchenko S, Zhu X (2013) NRC-Canada: building the state-of-the-art in sentiment analysis of tweets. arXiv preprint: arXiv:1308.6242

  • Narayanan R, Liu B, Choudhary A (2009) Sentiment analysis of conditional sentences. In: Proceedings of the 2009 conference on empirical methods in natural language processing, vol 1. Association for Computational Linguistics, pp 180–189

  • Nielsen FÅ (2011) A new ANEW: evaluation of a word list for sentiment analysis in microblogs. arXiv preprint: arXiv:1103.2903

  • O’Connor B, Balasubramanyan R, Routledge BR, Smith NA (2010) From tweets to polls: linking text sentiment to public opinion time series. ICWSM 11:1–2

  • Pérez D, Gliozzo AM, Strapparava C, Alfonseca E, Rodríguez P, Magnini B (2005) Automatic assessment of students’ free-text answers underpinned by the combination of a BLEU-inspired algorithm and latent semantic analysis. In: FLAIRS conference, pp 358–363

  • Rana TA, Cheah Y-N (2016) Aspect extraction in sentiment analysis: comparative analysis and survey. Artif Intell Rev 46:459–483

  • Riloff E, Wiebe J (2003) Learning extraction patterns for subjective expressions. In: Proceedings of the 2003 conference on empirical methods in natural language processing. Association for Computational Linguistics, pp 105–112

  • Saggion H (2014) Creating summarization systems with SUMMA. In: LREC, pp 4157–4163

  • Saggion H, Poibeau T (2013) Automatic text summarization: past, present and future. In: Multi-source, multilingual information extraction and summarization. Springer, pp 3–21

  • Sankarasubramaniam Y, Ramanathan K, Ghosh S (2014) Text summarization using Wikipedia. Inf Process Manag 50:443–461

  • Sarker A, Mollá D, Paris C (2013) An approach for query-focused text summarisation for evidence based medicine. In: Artificial intelligence in medicine. Springer, pp 295–304

  • Laerd Statistics (2015) Wilcoxon signed-rank test using SPSS statistics. In: Statistical tutorials and software guides

  • Stone PJ, Hunt EB (1963) A computer approach to content analysis: studies using the general inquirer system. In: Proceedings of the spring joint computer conference, 21–23 May 1963. ACM, pp 241–256

  • Strapparava C, Valitutti A (2004) WordNet affect: an affective extension of WordNet. In: LREC. Citeseer, pp 1083–1086

  • Taboada M, Brooke J, Tofiloski M, Voll K, Stede M (2011) Lexicon-based methods for sentiment analysis. Comput Linguist 37:267–307

  • Tayal MA, Raghuwanshi MM, Malik LG (2017) ATSSC: development of an approach based on soft computing for text summarization. Comput Speech Lang 41:214–235

  • Teufel S, Moens M (1997) Sentence extraction as a classification task. In: Proceedings of the ACL, vol 1997, pp 58–65

  • Vani K, Gupta D (2014) Using K-means cluster based techniques in external plagiarism detection. In: International conference on contemporary computing and informatics (IC3I). IEEE, pp 1268–1273

  • Xia R, Xu F, Yu J, Qi Y, Cambria E (2016) Polarity shift detection, elimination and ensemble: a three-stage model for document-level sentiment analysis. Inf Process Manag 52:36–45

  • Yadav N, Chatterjee N (2016) Text summarization using sentiment analysis for DUC data. In: International conference on information technology (ICIT). IEEE, pp 229–234

  • Yadav CS, Sharan A (2015) Hybrid approach for single text document summarization using statistical and sentiment features. Int J Inf Retr Res (IJIRR) 5:46–70

  • Zhang J, Sun L, Zhou Q (2005) A cue-based hub-authority approach for multi-document text summarization. In: Proceedings of 2005 IEEE international conference on natural language processing and knowledge engineering. IEEE NLP-KE’05. IEEE, pp 642–645

Acknowledgements

The authors would like to thank the Research Management Centre (RMC), Universiti Teknologi Malaysia (UTM), for its support in R&D, and the UTM Big Data Centre (BDC) for the inspiration in making this study a success. The authors would also like to thank the anonymous reviewers, who have contributed enormously to this work.

Funding

This work is supported by the Ministry of Higher Education (MOHE) under grants “Q.J130000.21A2.03E53 - STATISTICAL MACHINE LEARNING METHODS TO TEXT SUMMARIZATIONS” and “13H82 (INTELLIGENT PREDICTIVE ANALYTICS FOR RETAIL INDUSTRY)”.

Author information

Corresponding author

Correspondence to Asad Abdi.

Ethics declarations

Conflict of interest

On behalf of all co-authors, I declare that all the authors agreed to submit the article exclusively to this journal and that there is no conflict of interest regarding the publication of this article.

Human and animal rights

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Communicated by V. Loia.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Abdi, A., Shamsuddin, S.M., Hasan, S. et al. Automatic sentiment-oriented summarization of multi-documents using soft computing. Soft Comput 23, 10551–10568 (2019). https://doi.org/10.1007/s00500-018-3653-4

  • DOI: https://doi.org/10.1007/s00500-018-3653-4
