Automatic sentiment-oriented summarization of multi-documents using soft computing

Abdi, Asad; Shamsuddin, Siti Mariyam; Hasan, Shafaatunnur; Piran, Jalil

doi:10.1007/s00500-018-3653-4

Methodologies and Application
Published: 17 December 2018

Volume 23, pages 10551–10568, (2019)

Soft Computing

Asad Abdi¹,
Siti Mariyam Shamsuddin¹,
Shafaatunnur Hasan¹ &
Jalil Piran²

Abstract

This paper presents ASMUS, a method for automatic sentiment-oriented summarization of multi-documents using soft computing. It integrates two main phases: sentiment analysis and sentiment summarization. The sentiment analysis phase combines multiple strategies to address three drawbacks: (1) the limited word coverage of an individual lexicon; (2) contextual polarity; and (3) sentence types. The sentiment summarization phase is a graph-based ranking model that integrates sentiment information with statistical and linguistic methods to improve sentence ranking. We found that current methods suffer from two problems: (1) when comparing two sentences that share a similar bag of words, they ignore semantic and syntactic information and therefore fail to capture meaning; and (2) the vocabulary mismatch problem (lexical gaps). ASMUS also accounts for content coverage and redundancy. We conduct experiments on the Document Understanding Conference (DUC) datasets. The results indicate that ASMUS performs well in sentiment-oriented summarization.
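As a rough illustration of what a graph-based ranking model that blends sentiment information with sentence similarity can look like, the sketch below runs a PageRank-style iteration over a sentence graph. It is not the ASMUS model: the Jaccard word-overlap edge weight, the sentiment-biased prior, the damping value, and the function names are all assumptions made for this example, and plain word overlap is exactly the kind of bag-of-words comparison the abstract says should be enriched with semantic and syntactic information.

```python
import re


def tokenize(sentence):
    """Lowercase a sentence and return its set of word tokens."""
    return set(re.findall(r"[a-z']+", sentence.lower()))


def jaccard(a, b):
    """Jaccard coefficient of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_sentences(sentences, sentiment, damping=0.85, iterations=50):
    """PageRank-style scores with a restart prior biased by |sentiment|.

    `sentiment` holds one hypothetical polarity score per sentence.
    """
    n = len(sentences)
    tokens = [tokenize(s) for s in sentences]
    # Edge weights: word overlap between every pair of distinct sentences.
    weights = [[jaccard(tokens[i], tokens[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in weights]
    total = sum(abs(s) for s in sentiment)
    prior = [abs(s) / total for s in sentiment] if total else [1.0 / n] * n
    scores = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            incoming = sum(scores[j] * weights[j][i] / out_sum[j]
                           for j in range(n) if out_sum[j] > 0)
            updated.append((1 - damping) * prior[i] + damping * incoming)
        scores = updated
    return scores


sentences = [
    "The battery life is great",
    "Battery life is great and the screen is sharp",
    "Shipping took two weeks",
]
scores = rank_sentences(sentences, sentiment=[0.8, 0.9, 0.0])
```

In this toy input the third sentence shares no words with the others and carries no sentiment, so it receives no score, while the two opinionated, overlapping sentences reinforce each other.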

Notes

http://dev.mysql.com/doc/refman/5.5/en/fulltext-stopwords.html.
http://jmlr.csail.mit.edu/papers/volume5/lewis04a/a11-smart-stop-list/english.stop.
http://www.cs.cmu.edu/~mccallum/bow/rainbow/.
We use the negation words collected in Kolchyna et al. (2015) as a basic set, such as “no”, “not”, “don’t”, “hardly”, “none”, “never”, “are not”, “was not”, “did not”, “seldom”, “nothing”, “isn’t”,….
“Not only”, “not wholly”, “not all”, “not just”, “not quite”, “not least”, “no question”, ….
Like “but”, “with the exception of”, “except that”, “except for”, “however”, “yet”, “unfortunately”,….
We use the negation words collected in Abdi et al. (2016) as a basic set, such as “therefore”, “thus”, “consequently”, “hence”, “as a result”, “to conclude”, “in conclusion”, “in short”, ….
https://en.wikipedia.org/wiki/Normalization_(statistics).
http://stackoverflow.com/questions/10364575/normalization-in-variable-range-x-y-in-matlab.
http://duc.nist.gov.
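
The notes above mention a set of negation words and the normalization of values into a chosen range. A minimal sketch of both ideas follows. The toy lexicon, the three-token negation window, and the function names are assumptions for illustration only; the negation set is a subset of the single-word examples listed in the notes, and multi-word cues are not handled.

```python
# Subset of the single-word negation cues listed in the notes.
NEGATIONS = {"no", "not", "never", "hardly", "none", "seldom", "nothing"}

# Hypothetical polarity lexicon, for illustration only.
TOY_LEXICON = {"good": 1.0, "great": 2.0, "bad": -1.0, "awful": -2.0}


def sentence_polarity(tokens, window=3):
    """Sum lexicon scores, flipping a word's sign when a negation cue
    appears within `window` tokens before it."""
    score = 0.0
    for i, tok in enumerate(tokens):
        value = TOY_LEXICON.get(tok, 0.0)
        if value and any(t in NEGATIONS for t in tokens[max(0, i - window):i]):
            value = -value
        score += value
    return score


def rescale(values, low=-1.0, high=1.0):
    """Min-max normalization of `values` into the range [low, high]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values equal: map everything to the midpoint of the range.
        return [(low + high) / 2 for _ in values]
    return [low + (v - lo) * (high - low) / (hi - lo) for v in values]


negated = sentence_polarity("the food was not good".split())
plain = sentence_polarity("great service".split())
scaled = rescale([-1.0, 0.0, 2.0])
```

Rescaling makes scores drawn from lexicons with different value ranges comparable before they are combined.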

References

Abdi A, Idris N (2014) Automated summarization assessment system: quality assessment without a reference summary. In: The international conference on advances in applied science and environmental engineering—ASEE 2014. IRED Press
Abdi SA, Idris N (2014b) An analysis on student-written summaries: automatic assessment of summary writing. Int J Enhanc Res Sci Technol Eng 3:466–472
Abdi A, Idris N, Alguliev RM, Aliguliyev RM (2015a) Automatic summarization assessment through a combination of semantic and syntactic information for intelligent educational systems. Inf Process Manag 51:340–358
Abdi A, Idris N, Alguliyev RM, Aliguliyev RM (2015b) Query-based multi-documents summarization using linguistic knowledge and content word expansion. Soft Comput. https://doi.org/10.1007/s00500-015-1881-4
Abdi A, Idris N, Alguliyev RM, Aliguliyev RM (2016) An automated summarization assessment algorithm for identifying summarizing strategies. PLoS ONE 11:e0145809
Abdi A, Shamsuddin SM, Aliguliyev RM (2018a) QMOS: query-based multi-documents opinion-oriented summarization. Inf Process Manag 54:318–338
Abdi A, Shamsuddin SM, Hasan S, Piran J (2018b) Machine learning-based multi-documents sentiment-oriented summarization using linguistic treatment. Expert Syst Appl 109:66–85
Alfaro C, Cano-Montero J, Gómez J, Moguerza JM, Ortega F (2016) A multi-stage method for content classification and opinion mining on weblog comments. Ann Oper Res 236:197–213
Alguliyev RM, Aliguliyev RM, Isazade NR (2015) An unsupervised approach to generating generic summaries of documents. Appl Soft Comput 34:236–250
Baccianella S, Esuli A, Sebastiani F (2010) SentiWordNet 3.0: an enhanced lexical resource for sentiment analysis and opinion mining. In: LREC, pp 2200–2204
Bahrainian S-A, Dengel A (2013) Sentiment analysis and summarization of twitter data. In: IEEE 16th international conference on computational science and engineering (CSE). IEEE, pp 227–234
Balahur A, Kabadjov M, Steinberger J, Steinberger R, Montoyo A (2012) Challenges and solutions in the opinion summarization of user-generated content. J Intell Inf Syst 39:375–398
Cambria E, Poria S, Bajpai R, Schuller BW (2016) SenticNet 4: a semantic resource for sentiment analysis based on conceptual primitives. In: COLING, pp 2666–2677
Chen T, Xu R, He Y, Wang X (2017) Improving sentiment analysis via sentence type classification using BiLSTM-CRF and CNN. Expert Syst Appl 72:221–230
Cohen J (1968) Weighted kappa: nominal scale agreement provision for scaled disagreement or partial credit. Psychol Bull 70:213
Deshwal A, Sharma SK (2016) Twitter sentiment analysis using various classification algorithms. In: 5th International conference on reliability, infocom technologies and optimization (trends and future directions) (ICRITO). IEEE, pp 251–257
Di Capua M, Petrosino A (2016) A deep learning approach to deal with data uncertainty in sentiment analysis. In: International workshop on fuzzy logic and applications. Springer, pp 172–184
Edmundson HP (1969) New methods in automatic extracting. J ACM (JACM) 16:264–285
Ferreira R, de Souza Cabral L, Freitas F, Lins RD, de França Silva G, Simske SJ, Favaro L (2014) A multi-document summarization system based on statistics and linguistic treatment. Expert Syst Appl 41:5780–5787
Fleiss JL (1971) Measuring nominal scale agreement among many raters. Psychol Bull 76:378
Gambhir M, Gupta V (2017) Recent automatic text summarization techniques: a survey. Artif Intell Rev 47:1–66
Gupta V, Lehal GS (2010) A survey of text summarization extractive techniques. J Emerg Technol Web Intell 2:258–268
Gupta P, Tiwari R, Robert N (2016) Sentiment analysis and text summarization of online reviews: a survey. In: International conference on communication and signal processing (ICCSP). IEEE, pp 0241–0245
Hu M, Liu B (2004) Mining and summarizing customer reviews. In: Proceedings of the tenth ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 168–177
Hu Y-H, Chen Y-L, Chou H-L (2017) Opinion mining from online hotel reviews: a text summarization approach. Inf Process Manag 53:436–449
Hung C, Chen S-J (2016) Word sense disambiguation based sentiment lexicons for sentiment classification. Knowl Based Syst 110:224–232
Jaccard P (1912) The distribution of the flora in the alpine zone. New Phytol 11:37–50
Kabadjov M, Balahur A, Boldrini E (2009) Sentiment intensity: is it a good summary indicator? In: Language and technology conference. Springer, pp 203–212
Khan FH, Qamar U, Bashir S (2016) SWIMS: semi-supervised subjective feature weighting and intelligent model selection for sentiment analysis. Knowl Based Syst 100:97–111
Kim S, Calvo R (2011) Sentiment-oriented summarisation of peer reviews. In: Artificial intelligence in education. Springer, pp 491–493
Kiyoumarsi F (2015) Evaluation of automatic text summarizations based on human summaries. Procedia Soc Behav Sci 192:83–91
Kolchyna O, Souza TT, Treleaven P, Aste T (2015) Twitter sentiment analysis: Lexicon method, machine learning method and their combination. arXiv preprint: arXiv:1507.00955
Landauer TK (2002) On the computational basis of learning and cognition: arguments from LSA. Psychol Learn Motiv 41:43–84
Landis JR, Koch GG (1977) The measurement of observer agreement for categorical data. Biometrics 33:159–174
Li Y, McLean D, Bandar ZA, O’Shea JD, Crockett K (2006) Sentence similarity based on semantic nets and corpus statistics. IEEE Trans Knowl Data Eng 18:1138–1150
Lin C-Y (2004) Rouge: a package for automatic evaluation of summaries. In: Text summarization branches out: proceedings of the ACL-04 workshop, pp 74–81
Lin C-Y, Hovy E (1997) Identifying topics by position. In: Proceedings of the fifth conference on applied natural language processing. Association for Computational Linguistics, pp 283–290
Liu B (2012) Sentiment analysis and opinion mining. Synth Lect Hum Lang Technol 5:1–167
Lloret E, Saggion H, Palomar M (2010) Experiments on summary-based opinion classification. In: Proceedings of the NAACL HLT 2010 workshop on computational approaches to analysis and generation of emotion in text. Association for Computational Linguistics, pp 107–115
Mendoza M, Bonilla S, Noguera C, Cobos C, León E (2014) Extractive single-document summarization based on genetic operators and guided local search. Expert Syst Appl 41:4158–4169
Mishra R, Bian J, Fiszman M, Weir CR, Jonnalagadda S, Mostafa J, Del Fiol G (2014) Text summarization in the biomedical domain: a systematic review of recent research. J Biomed Inform 52:457–467
Mohammad SM, Kiritchenko S, Zhu X (2013) NRC-Canada: building the state-of-the-art in sentiment analysis of tweets. arXiv preprint: arXiv:1308.6242
Narayanan R, Liu B, Choudhary A (2009) Sentiment analysis of conditional sentences. In: Proceedings of the 2009 conference on empirical methods in natural language processing, vol 1. Association for Computational Linguistics, pp 180–189
Nielsen FÅ (2011) A new ANEW: evaluation of a word list for sentiment analysis in microblogs. arXiv preprint: arXiv:1103.2903
O’Connor B, Balasubramanyan R, Routledge BR, Smith NA (2010) From tweets to polls: linking text sentiment to public opinion time series. ICWSM 11:1–2
Pérez D, Gliozzo AM, Strapparava C, Alfonseca E, Rodríguez P, Magnini B (2005) Automatic assessment of students’ free-text answers underpinned by the combination of a BLEU-inspired algorithm and latent semantic analysis. In: FLAIRS conference, pp 358–363
Rana TA, Cheah Y-N (2016) Aspect extraction in sentiment analysis: comparative analysis and survey. Artif Intell Rev 46:459–483
Riloff E, Wiebe J (2003) Learning extraction patterns for subjective expressions. In: Proceedings of the 2003 conference on empirical methods in natural language processing. Association for Computational Linguistics, pp 105–112
Saggion H (2014) Creating summarization systems with SUMMA. In: LREC, pp 4157–4163
Saggion H, Poibeau T (2013) Automatic text summarization: past, present and future. In: Multi-source, multilingual information extraction and summarization. Springer, pp 3–21
Sankarasubramaniam Y, Ramanathan K, Ghosh S (2014) Text summarization using Wikipedia. Inf Process Manag 50:443–461
Sarker A, Mollá D, Paris C (2013) An approach for query-focused text summarisation for evidence based medicine. In: Artificial intelligence in medicine. Springer, pp 295–304
Laerd Statistics (2015) Wilcoxon signed-rank test using SPSS statistics. In: Statistical tutorials and software guides
Stone PJ, Hunt EB (1963) A computer approach to content analysis: studies using the general inquirer system. In: Proceedings of the spring joint computer conference, 21–23 May 1963. ACM, pp 241–256
Strapparava C, Valitutti A (2004) WordNet affect: an affective extension of WordNet. In: LREC. Citeseer, pp 1083–1086
Taboada M, Brooke J, Tofiloski M, Voll K, Stede M (2011) Lexicon-based methods for sentiment analysis. Comput Linguist 37:267–307
Tayal MA, Raghuwanshi MM, Malik LG (2017) ATSSC: development of an approach based on soft computing for text summarization. Comput Speech Lang 41:214–235
Teufel S, Moens M (1997) Sentence extraction as a classification task. In: Proceedings of the ACL, vol 1997, pp 58–65
Vani K, Gupta D (2014) Using K-means cluster based techniques in external plagiarism detection. In: International conference on contemporary computing and informatics (IC3I). IEEE, pp 1268–1273
Xia R, Xu F, Yu J, Qi Y, Cambria E (2016) Polarity shift detection, elimination and ensemble: a three-stage model for document-level sentiment analysis. Inf Process Manag 52:36–45
Yadav N, Chatterjee N (2016) Text summarization using sentiment analysis for DUC data. In: International conference on information technology (ICIT). IEEE, pp 229–234
Yadav CS, Sharan A (2015) Hybrid approach for single text document summarization using statistical and sentiment features. Int J Inf Retr Res (IJIRR) 5:46–70
Zhang J, Sun L, Zhou Q (2005) A cue-based hub-authority approach for multi-document text summarization. In: Proceedings of 2005 IEEE international conference on natural language processing and knowledge engineering. IEEE NLP-KE’05. IEEE, pp 642–645

Acknowledgements

The authors would like to thank the Research Management Centre (RMC), Universiti Teknologi Malaysia (UTM), for its support in R&D, and the UTM Big Data Centre (BDC) for the inspiration in making this study a success. The authors would also like to thank the anonymous reviewers, who contributed enormously to this work.

Funding

This work is supported by the Ministry of Higher Education (MOHE) under “Q.J130000.21A2.03E53 - STATISTICAL MACHINE LEARNING METHODS TO TEXT SUMMARIZATIONS” and “13H82 (INTELLIGENT PREDICTIVE ANALYTICS FOR RETAIL INDUSTRY)”.

Author information

Authors and Affiliations

UTM Big Data Centre (BDC), Universiti Teknologi Malaysia, 81310, Skudai, Johor, Malaysia
Asad Abdi, Siti Mariyam Shamsuddin & Shafaatunnur Hasan
Department of Computer Science and Engineering, Sejong University, Seoul, South Korea
Jalil Piran

Corresponding author

Correspondence to Asad Abdi.

Ethics declarations

Conflict of interest

I hereby and on behalf of the co-authors declare that all the authors agreed to submit the article exclusively to this journal and also declare that there is no conflict of interests regarding the publication of this article.

Human and animal rights

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Communicated by V. Loia.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Abdi, A., Shamsuddin, S.M., Hasan, S. et al. Automatic sentiment-oriented summarization of multi-documents using soft computing. Soft Comput 23, 10551–10568 (2019). https://doi.org/10.1007/s00500-018-3653-4

Published: 17 December 2018
Issue Date: October 2019
DOI: https://doi.org/10.1007/s00500-018-3653-4
