DOI: 10.1145/3469096.3469874
research-article

Ordering sentences and paragraphs with pre-trained encoder-decoder transformers and pointer ensembles

Published: 16 August 2021

ABSTRACT

Passage ordering aims to maximise discourse coherence in document generation or document modification tasks such as summarisation or storytelling. This paper extends the passage ordering task from sentences to paragraphs, i.e., passages with multiple sentences. Longer passages make the task more difficult. To account for this, we propose combining a pre-trained encoder-decoder Transformer model, namely BART, with variations of pointer networks. We empirically evaluate the proposed models for sentence and paragraph ordering. Our best model outperforms previous state-of-the-art methods by 0.057 Kendall's Tau on one of three sentence ordering benchmarks (arXiv, VIST, ROC-Story). For paragraph ordering, we construct two novel datasets from Wikipedia and CNN-DailyMail, on which we achieve 0.67 and 0.47 Kendall's Tau, respectively. The best model variation utilises multiple pointer networks in an ensemble-like fashion. We hypothesise that the use of multiple pointers better reflects the multitude of possible orders of paragraphs in more complex texts. Our code, data, and models are publicly available.
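The results above are reported in Kendall's Tau. As a rough illustration of how such a score can be computed for a predicted passage order, the sketch below assumes the definition commonly used in sentence-ordering work, tau = 1 - 2 * inversions / (n choose 2); the abstract itself does not spell out the formula. The function name `kendall_tau` and the integer passage ids are hypothetical and are not taken from the authors' code.

```python
from itertools import combinations


def kendall_tau(predicted, gold):
    """Kendall's Tau between a predicted and a gold ordering.

    Assumes both lists contain the same unique passage ids.
    Returns 1.0 for identical orders and -1.0 for fully reversed orders.
    """
    if sorted(predicted) != sorted(gold):
        raise ValueError("predicted and gold must contain the same passage ids")
    n = len(gold)
    if n < 2:
        # A single passage (or none) has only one possible order.
        return 1.0
    # Position of each passage id in the gold order.
    gold_rank = {pid: pos for pos, pid in enumerate(gold)}
    ranks = [gold_rank[pid] for pid in predicted]
    # An inversion is a pair that appears in the wrong relative order.
    inversions = sum(
        1 for i, j in combinations(range(n), 2) if ranks[i] > ranks[j]
    )
    total_pairs = n * (n - 1) / 2
    return 1.0 - 2.0 * inversions / total_pairs
```

For example, swapping only the first two of four passages gives one inversion out of six pairs, so `kendall_tau([1, 0, 2, 3], [0, 1, 2, 3])` is 1 - 2/6, roughly 0.67.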


Published in

DocEng '21: Proceedings of the 21st ACM Symposium on Document Engineering
August 2021
178 pages
ISBN: 9781450385961
DOI: 10.1145/3469096

Copyright © 2021 Owner/Author

This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery
New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 178 of 537 submissions, 33%