Automatic Keyword Extraction from Single-Sentence Natural Language Queries

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7458)

Abstract

This paper presents a novel algorithm for extracting keywords from single-sentence natural language queries in English. The algorithm applies a series of rules to a parsed query to pick out potential keywords based on their part of speech and the surrounding phrase structure. A supervised machine learning method for finding suitable rules is also explored; it has shown promising results when cross-validated with various training sets.
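
The rule-based idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the rule table, the light-verb stop list, the `Token` type, and the hand-tagged example query are all assumptions made for illustration. A real pipeline would obtain the part-of-speech tags and phrase labels from a parser or chunker rather than typing them in by hand.

```python
from typing import Dict, List, NamedTuple, Set


class Token(NamedTuple):
    word: str
    pos: str    # Penn Treebank part-of-speech tag, e.g. "NN"
    chunk: str  # label of the enclosing phrase, e.g. "NP"


# Hypothetical rules: for each phrase label, the POS tags accepted as keywords.
RULES: Dict[str, Set[str]] = {
    "NP": {"NN", "NNS", "NNP", "NNPS", "JJ"},
    "VP": {"VB", "VBD", "VBG", "VBN", "VBP", "VBZ"},
}

# Assumed stop list of verbs that rarely carry query content.
LIGHT_VERBS: Set[str] = {
    "be", "is", "are", "was", "were", "do", "does", "did", "have", "has",
}


def extract_keywords(tokens: List[Token]) -> List[str]:
    """Keep tokens whose POS tag is allowed for their enclosing phrase."""
    keywords = []
    for tok in tokens:
        allowed = RULES.get(tok.chunk, set())
        if tok.pos in allowed and tok.word.lower() not in LIGHT_VERBS:
            keywords.append(tok.word)
    return keywords


# Hand-tagged query: "Which rivers flow through the capital of France"
query = [
    Token("Which", "WDT", "NP"),
    Token("rivers", "NNS", "NP"),
    Token("flow", "VBP", "VP"),
    Token("through", "IN", "PP"),
    Token("the", "DT", "NP"),
    Token("capital", "NN", "NP"),
    Token("of", "IN", "PP"),
    Token("France", "NNP", "NP"),
]

print(extract_keywords(query))  # ['rivers', 'flow', 'capital', 'France']
```

Keeping the rules in a plain table, separate from the matching loop, means a learned rule set could in principle replace the hand-written one without changing `extract_keywords`.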

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wang, D.X., Gao, X., Andreae, P. (2012). Automatic Keyword Extraction from Single-Sentence Natural Language Queries. In: Anthony, P., Ishizuka, M., Lukose, D. (eds) PRICAI 2012: Trends in Artificial Intelligence. PRICAI 2012. Lecture Notes in Computer Science, vol 7458. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32695-0_56

  • DOI: https://doi.org/10.1007/978-3-642-32695-0_56

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-32694-3

  • Online ISBN: 978-3-642-32695-0

  • eBook Packages: Computer Science, Computer Science (R0)
