ABSTRACT
We present a novel approach to extracting keyphrases from any given article, set of notes on a particular subject, or other document source, using Augmented Transition Networks (ATNs) followed by statistical methods. Using ATNs eliminates the need for a background corpus when identifying potential keywords and keyphrases, and it greatly reduces the search space for the statistical methods. We have devised two new methods, relaxed statistical analysis and stringent statistical analysis, to determine whether a phrase can be separated into sub-phrases. This paper discusses the two-tier process in detail, illustrates it with examples, and briefly discusses its applications.
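To make the two-tier idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes the input is already part-of-speech tagged with Penn Treebank style tags, uses a hypothetical four-state noun-phrase network (the states, arcs, and the names `ARCS`, `extract_candidates`, and `rank_by_frequency` are illustrative), and substitutes a plain frequency count for the relaxed and stringent statistical analyses, which the abstract does not define. A full ATN would also allow recursive sub-network calls and tests on arcs; only a single register is modeled here.

```python
from collections import Counter

# Hypothetical noun-phrase network (an assumption, not the paper's grammar):
# optional determiner, any number of adjectives, then one or more nouns.
ARCS = {
    "START": {"DT": "DET", "JJ": "ADJ", "NN": "NOUN"},
    "DET": {"JJ": "ADJ", "NN": "NOUN"},
    "ADJ": {"JJ": "ADJ", "NN": "NOUN"},
    "NOUN": {"NN": "NOUN"},
}
ACCEPTING = {"NOUN"}


def extract_candidates(tagged):
    """Tier 1: walk the network over (word, tag) pairs and collect phrases."""
    candidates = []
    i = 0
    while i < len(tagged):
        state = "START"
        register = []        # ATN-style register holding the words kept so far
        last_accept = None   # (end index, phrase) of the longest accepted run
        j = i
        while j < len(tagged):
            word, tag = tagged[j]
            coarse = tag[:2]  # fold NNS/NNP into NN and JJR/JJS into JJ
            next_state = ARCS[state].get(coarse)
            if next_state is None:
                break
            state = next_state
            if coarse != "DT":  # the determiner is consumed but not stored
                register.append(word.lower())
            j += 1
            if state in ACCEPTING:
                last_accept = (j, " ".join(register))
        if last_accept is not None:
            end, phrase = last_accept
            candidates.append(phrase)
            i = end
        else:
            i += 1
    return candidates


def rank_by_frequency(candidates):
    """Tier 2 stand-in: rank candidates by raw count within the document."""
    return Counter(candidates).most_common()


tagged_text = [
    ("The", "DT"), ("augmented", "JJ"), ("transition", "NN"), ("network", "NN"),
    ("parses", "VBZ"), ("a", "DT"), ("sentence", "NN"), (".", "."),
    ("An", "DT"), ("augmented", "JJ"), ("transition", "NN"), ("network", "NN"),
    ("uses", "VBZ"), ("registers", "NNS"),
]
candidates = extract_candidates(tagged_text)
ranking = rank_by_frequency(candidates)
```

Because the network only ever proposes phrases that match its grammar, the counting step sees a handful of candidates per sentence instead of every possible n-gram, and no external corpus is consulted at any point.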
Index Terms
- A novel approach to keyphrase extraction using augmented transition networks and statistical tools