DOI: 10.1145/3539618.3591986

Fairness for both Readers and Authors: Evaluating Summaries of User Generated Content

Published: 18 July 2023

Abstract

Summarization of textual content has many applications, ranging from summarizing long documents to recent efforts towards summarizing user-generated text (e.g., tweets, or Facebook and Reddit posts). Traditionally, the focus of summarization has been to generate summaries that best satisfy readers. In this work, we look at summarization of user-generated content as a two-sided problem in which the satisfaction of both readers and authors is crucial. Through three surveys, we show that for user-generated content, the traditional evaluation approach of measuring similarity between reference summaries and algorithmic summaries cannot capture author satisfaction. We propose an author-satisfaction-based evaluation metric, CROSSEM, which, we show empirically, can potentially complement the current evaluation paradigm. We further propose the idea of inequality in satisfaction, to account for individual fairness among readers and authors. To our knowledge, this is the first attempt at developing a fair summary-evaluation framework for user-generated content, and it is likely to spawn a substantial amount of future research in this space.
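The two quantities the abstract contrasts can be made concrete. Below is a minimal, hypothetical sketch (it is not the paper's CROSSEM metric): a unigram-overlap F1 in the spirit of ROUGE-1 stands in for the traditional reader-side similarity check, and a Gini coefficient over per-person satisfaction scores is one standard way to quantify "inequality in satisfaction". All function names and score definitions here are illustrative assumptions, not the paper's formulation.

```python
from collections import Counter

def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 between a reference and a candidate summary
    (the classic reader-side similarity signal, as in ROUGE-1)."""
    ref = reference.lower().split()
    cand = candidate.lower().split()
    overlap = sum((Counter(ref) & Counter(cand)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)

def gini(satisfaction: list[float]) -> float:
    """Gini coefficient over per-reader/per-author satisfaction scores:
    0 means everyone is equally satisfied; values near 1 mean the summary
    serves a few people at the expense of the rest."""
    xs = sorted(satisfaction)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Standard rank-weighted form: G = 2 * sum(i * x_i) / (n * total) - (n + 1) / n
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2 * weighted / (n * total) - (n + 1) / n

# Equal satisfaction gives Gini 0; one satisfied person out of four gives 0.75.
print(gini([1.0, 1.0, 1.0, 1.0]), gini([0.0, 0.0, 0.0, 1.0]))
```

The point of pairing the two is the paper's two-sided framing: a summary can score well on reference overlap for the average reader while the distribution of satisfaction across individual readers and authors remains highly unequal.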


Cited By

  • (2024) A Novel Summarization Framework based on Reference-Free Evaluation of Multiple Large Language Models. In 2024 IEEE International Conference on Metaverse Computing, Networking, and Applications (MetaCom), pp. 247-252. DOI: 10.1109/MetaCom62920.2024.00047. Online publication date: 12 Aug 2024.

      Published In

      SIGIR '23: Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval
      July 2023
      3567 pages
      ISBN:9781450394086
      DOI:10.1145/3539618

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. author satisfaction
      2. fair summarization
      3. summary evaluation

      Qualifiers

      • Short-paper

      Conference

      SIGIR '23

      Acceptance Rates

      Overall Acceptance Rate 792 of 3,983 submissions, 20%
