Learning to Extract Coherent Keyphrases from Online News

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7097)

Abstract

Keyphrases extracted from news articles can concisely represent the main content of news events. In this paper, we first present several criteria for high-quality news keyphrases. To integrate those criteria into the keyphrase extraction task, we then propose a novel formulation that converts the task into a learning-to-rank problem. Our approach involves two phases: selecting candidate keyphrases and ranking all possible sub-permutations of the candidates. Three feature sets (lexical, locality, and coherence) are introduced to rank the candidates, and the best sub-permutation provides the keyphrases. The proposed method is evaluated on a multi-news collection, and the experimental results verify that it is effective at extracting coherent news keyphrases.
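As a rough illustration of the two-phase idea in the abstract, the sketch below enumerates sub-permutations of a small candidate set and returns the highest-scoring one. It is not the paper's implementation: the three toy features, the function names (`sub_permutations`, `features`, `best_sub_permutation`), the sample sentences, and the hand-set weights are all assumptions made for illustration, and a fixed linear score stands in for a trained learning-to-rank model.

```python
from itertools import permutations


def sub_permutations(candidates, max_len):
    """Yield every ordered selection of 1..max_len distinct candidates."""
    for k in range(1, max_len + 1):
        yield from permutations(candidates, k)


def features(perm, sentences):
    """Toy stand-ins (assumptions) for lexical, locality and coherence features."""
    text = " ".join(sentences)
    n = max(len(text), 1)

    # Lexical (toy): average number of occurrences of each phrase in the text.
    lexical = sum(text.count(p) for p in perm) / len(perm)

    # Locality (toy): phrases that first appear earlier score closer to 1;
    # phrases that never appear score 0.
    def earliness(phrase):
        pos = text.find(phrase)
        return 1.0 - pos / n if pos >= 0 else 0.0

    locality = sum(earliness(p) for p in perm) / len(perm)

    # Coherence (toy): fraction of adjacent phrase pairs sharing a sentence.
    pairs = list(zip(perm, perm[1:]))
    shared = sum(any(a in s and b in s for s in sentences) for a, b in pairs)
    coherence = shared / len(pairs) if pairs else 0.0

    return (lexical, locality, coherence)


def best_sub_permutation(candidates, sentences, weights, max_len=2):
    """Score each sub-permutation linearly and return the best one."""
    def score(perm):
        return sum(w * f for w, f in zip(weights, features(perm, sentences)))

    return max(sub_permutations(candidates, max_len), key=score)


# Hypothetical input: candidates are assumed to come from phase one.
sentences = [
    "the central bank raised interest rates on tuesday",
    "analysts said the interest rates decision surprised markets",
]
candidates = ["central bank", "interest rates", "markets"]
weights = (1.0, 0.5, 1.0)  # hand-set here; a ranker would learn these

best = best_sub_permutation(candidates, sentences, weights)
print(best)
```

Exhaustive enumeration grows quickly with the number of candidates and the maximum length, so a sketch like this is only practical for small candidate sets.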

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ding, Z., Zhang, Q., Huang, X. (2011). Learning to Extract Coherent Keyphrases from Online News. In: Salem, M.V.M., Shaalan, K., Oroumchian, F., Shakery, A., Khelalfa, H. (eds) Information Retrieval Technology. AIRS 2011. Lecture Notes in Computer Science, vol 7097. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25631-8_43

  • DOI: https://doi.org/10.1007/978-3-642-25631-8_43

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-25630-1

  • Online ISBN: 978-3-642-25631-8

  • eBook Packages: Computer Science, Computer Science (R0)
