Towards Explainable Summary of Crowdsourced Reviews Through Text Mining

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1601)

Abstract

With the ever-broader availability of technology, organizations and merchants can now collect large numbers of reviews online. Each review usually consists of an interval datum, such as a one-to-five rating, along with a text comment. While an aggregated numeric score commonly serves as an overall rating, a human user usually has to read the text comments manually to understand what lies behind it. The aim of this work is to generate an explainable summary computationally by mining all text comments with natural language processing (NLP). In this initial work, we derive an overall numeric score for the reviews through sentiment analysis. We further combine text summarization with document clustering to obtain an explainable text summary that is easy for human users to comprehend. As an experiment, our approach has produced an explainable summary from a publicly available Amazon review dataset.
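The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and everything named in it is an assumption made for illustration: the toy `LEXICON`, the linear mapping onto a one-to-five scale, the smoothed tf-idf weighting, and the helper names. Two simplifications are swapped in deliberately: grouping comments by the sign of their sentiment stands in for document clustering, and picking the comment most similar to the rest of its group stands in for text summarization.

```python
import math
import re
from collections import Counter

# Hypothetical toy lexicon; a real system would use a full lexicon or a trained model.
LEXICON = {"great": 1.0, "love": 1.0, "good": 0.5,
           "poor": -0.5, "bad": -0.5, "terrible": -1.0, "broke": -1.0}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def sentiment(text):
    """Mean lexicon score of the matched words, in [-1, 1]; 0.0 if none match."""
    hits = [LEXICON[t] for t in tokenize(text) if t in LEXICON]
    return sum(hits) / len(hits) if hits else 0.0


def to_five_point(score):
    """Linearly map a score in [-1, 1] onto a [1, 5] rating (assumed mapping)."""
    return 3.0 + 2.0 * score


def tfidf_vectors(docs):
    """Sparse tf-idf vectors using one common smoothed idf variant."""
    tokenized = [tokenize(d) for d in docs]
    df = Counter(t for toks in tokenized for t in set(toks))
    n = len(docs)
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: c * (math.log((1 + n) / (1 + df[t])) + 1.0)
                        for t, c in tf.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def representative(docs):
    """Return the comment with the highest total similarity to the others."""
    if len(docs) == 1:
        return docs[0]
    vecs = tfidf_vectors(docs)
    totals = [sum(cosine(vecs[i], vecs[j]) for j in range(len(docs)) if j != i)
              for i in range(len(docs))]
    return docs[max(range(len(docs)), key=totals.__getitem__)]


def summarize(reviews):
    """Return (overall rating, one representative comment per sentiment group)."""
    scores = [sentiment(r) for r in reviews]
    overall = to_five_point(sum(scores) / len(scores))
    groups = {"positive": [], "negative": [], "neutral": []}
    for review, score in zip(reviews, scores):
        label = "positive" if score > 0 else "negative" if score < 0 else "neutral"
        groups[label].append(review)
    summary = {label: representative(members)
               for label, members in groups.items() if members}
    return overall, summary
```

Calling `summarize` on a list of comment strings returns the aggregated rating together with one quoted comment per non-empty group, so the number comes with text that explains it. A fuller version would cluster the tf-idf vectors (for example with k-means) and run an extractive summarizer within each cluster.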

Notes

  1. Instead of Eq. (1), one may apply fuzzy memberships to categorize the sentiment of each \(d \in D\).
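As a rough illustration of the fuzzy alternative mentioned in this note, the sketch below assigns each comment a degree of membership in three sentiment categories rather than a single crisp label. Eq. (1) is not reproduced on this page, so the sketch does not depend on it. It assumes a sentiment score in [-1, 1] and triangular membership functions; the breakpoints and the function names are hypothetical choices, not taken from the paper.

```python
def triangular(x, left, peak, right):
    """Triangular membership: 0 at or outside left/right, rising to 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def fuzzy_sentiment(score):
    """Membership degrees for a sentiment score assumed to lie in [-1, 1].

    The outer triangles peak at -1 and +1 and extend past the score range,
    so the extreme scores receive full membership in their category.
    """
    return {
        "negative": triangular(score, -2.0, -1.0, 0.0),
        "neutral": triangular(score, -1.0, 0.0, 1.0),
        "positive": triangular(score, 0.0, 1.0, 2.0),
    }
```

With these breakpoints, a score of 0.5 belongs half to "neutral" and half to "positive", whereas a crisp threshold would force it into one category.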

Acknowledgment

This work is partially supported by the US National Science Foundation through the grant award NSF/OIA-1946391. The authors would also like to express their sincere appreciation to the contributors of high-quality Python software tools. By applying such publicly available tools, we were able to significantly improve the efficiency and effectiveness of this work.

Author information

Corresponding author

Correspondence to Chenyi Hu.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Moody, A., Hu, C., Zhan, H., Spurling, M., Sheng, V.S. (2022). Towards Explainable Summary of Crowdsourced Reviews Through Text Mining. In: Ciucci, D., et al. Information Processing and Management of Uncertainty in Knowledge-Based Systems. IPMU 2022. Communications in Computer and Information Science, vol 1601. Springer, Cham. https://doi.org/10.1007/978-3-031-08971-8_44

  • DOI: https://doi.org/10.1007/978-3-031-08971-8_44

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-08970-1

  • Online ISBN: 978-3-031-08971-8

  • eBook Packages: Computer Science, Computer Science (R0)
