research-article

Question Headline Generation for News Articles

Published: 17 October 2018

Abstract

In this paper, we introduce and tackle the Question Headline Generation (QHG) task. The motivation comes from our investigation of a real-world news portal, where we find that news articles with question headlines often receive a much higher click-through ratio than those with non-question headlines. The QHG task can be viewed as a specific form of the Question Generation (QG) task, with the emphasis on creating a natural question from a given news article by taking the entire article as the answer. A good QHG model should thus be able to generate a question by summarizing the essential topics of an article. Based on this idea, we propose a novel dual-attention sequence-to-sequence model (DASeq2Seq) for the QHG task. Unlike traditional sequence-to-sequence models, which employ the attention mechanism only in the decoding phase for better generation, our DASeq2Seq further introduces a self-attention mechanism in the encoding phase to help generate a good summary of the article. We investigate two variants of the self-attention mechanism, namely global self-attention and distributed self-attention. In addition, we employ a vocabulary gate over both generic and question vocabularies to better capture question patterns. In offline experiments, we show that our approach significantly outperforms state-of-the-art question generation and headline generation models. Furthermore, we conduct an online evaluation with an A/B test to demonstrate the effectiveness of our approach.
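
The abstract names two components without giving their equations: self-attention over the encoder states and a vocabulary gate over two vocabularies. The following is a minimal sketch of how such components could work, not the DASeq2Seq formulation itself. It assumes dot-product scoring against a hypothetical learned `query` vector and a scalar `gate` that mixes two word distributions; all function names, vectors, and probabilities are illustrative.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention_summary(states, query):
    """Pool encoder states into one summary vector.

    states: list of hidden vectors, one per article unit (word or sentence).
    query:  hypothetical learned vector used to score each state (assumption).
    """
    # Score each state by its dot product with the query vector.
    scores = [sum(q * h for q, h in zip(query, state)) for state in states]
    weights = softmax(scores)
    dim = len(states[0])
    # The summary is the attention-weighted average of the states.
    summary = [
        sum(w * state[d] for w, state in zip(weights, states))
        for d in range(dim)
    ]
    return summary, weights


def gated_word_distribution(gate, p_question, p_generic):
    """Mix two vocabulary distributions with a scalar gate in [0, 1].

    A word missing from one vocabulary gets probability 0 from that side.
    """
    words = set(p_question) | set(p_generic)
    return {
        w: gate * p_question.get(w, 0.0) + (1.0 - gate) * p_generic.get(w, 0.0)
        for w in words
    }


# Toy encoder states and query (made-up numbers, for illustration only).
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
summary, weights = self_attention_summary(states, query=[0.5, 0.5])

# Toy distributions over a question vocabulary and a generic vocabulary.
mixed = gated_word_distribution(
    gate=0.7,
    p_question={"why": 0.6, "how": 0.4},
    p_generic={"market": 0.5, "how": 0.5},
)
```

In this sketch the third state receives the largest attention weight because it aligns best with the query, and the mixed distribution still sums to one because the gate forms a convex combination of two valid distributions. In a trained model the gate and query would be predicted or learned rather than fixed by hand.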

Published In

CIKM '18: Proceedings of the 27th ACM International Conference on Information and Knowledge Management
October 2018
2362 pages
ISBN: 9781450360142
DOI: 10.1145/3269206
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. question headline generation
  2. self-attention mechanism

Qualifiers

  • Research-article

Conference

CIKM '18

Acceptance Rates

CIKM '18 Paper Acceptance Rate 147 of 826 submissions, 18%;
Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Article Metrics

  • Downloads (last 12 months): 32
  • Downloads (last 6 weeks): 2
Reflects downloads up to 10 Feb 2025.

Cited By

  • (2024) How to generate popular post headlines on social media? AI Open 5 (1-9). DOI: 10.1016/j.aiopen.2023.12.002. Online publication date: 2024
  • (2024) Personalized EDM Subject Generation via Co-factored User-Subject Embedding. Advances in Knowledge Discovery and Data Mining (55-67). DOI: 10.1007/978-981-97-2253-2_5. Online publication date: 25-Apr-2024
  • (2023) General then Personal: Decoupling and Pre-training for Personalized Headline Generation. Transactions of the Association for Computational Linguistics 11 (1588-1607). DOI: 10.1162/tacl_a_00621. Online publication date: 14-Dec-2023
  • (2023) Put Your Voice on Stage: Personalized Headline Generation for News Articles. ACM Transactions on Knowledge Discovery from Data 18:3 (1-20). DOI: 10.1145/3629168. Online publication date: 9-Dec-2023
  • (2023) Semantic-Enhanced Differentiable Search Index Inspired by Learning Strategies. Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (4904-4913). DOI: 10.1145/3580305.3599903. Online publication date: 6-Aug-2023
  • (2023) Style-Driven Multi-Perspective Relevance Mining Model for Hotspot Reprint Paragraph Prediction. 2023 IEEE International Conference on Intelligence and Security Informatics (ISI) (01-06). DOI: 10.1109/ISI58743.2023.10297268. Online publication date: 2-Oct-2023
  • (2023) Fact-Preserved Personalized News Headline Generation. 2023 IEEE International Conference on Data Mining (ICDM) (1493-1498). DOI: 10.1109/ICDM58522.2023.00197. Online publication date: 1-Dec-2023
  • (2022) Matching news articles and wikipedia tables for news augmentation. Knowledge and Information Systems 65:4 (1713-1734). DOI: 10.1007/s10115-022-01815-0. Online publication date: 27-Dec-2022
  • (2021) Towards automatic generated content website based on content classification and auto-article generation. Proceedings of the 2021 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (436-438). DOI: 10.1145/3487351.3488414. Online publication date: 8-Nov-2021
  • (2021) Algorithmic copywriting: automated generation of health-related advertisements to improve their performance. Information Retrieval Journal. DOI: 10.1007/s10791-021-09392-6. Online publication date: 13-Apr-2021
