Chinese Patent Mining Based on Sememe Statistics and Key-Phrase Extraction

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4632)

Abstract

Key-phrase extraction from patent documents has recently received considerable attention. However, current statistical approaches to Chinese key-phrase extraction do not take semantics into account, which results in inaccurate and incomplete extraction. This study proposes a Chinese patent mining approach based on sememe statistics and key-phrase extraction to extract key-phrases from patent documents. The key-phrase extraction algorithm is based on the semantic knowledge structure of HowNet, and a statistical approach is used to calculate the chosen value of each phrase in the patent document. On an experimental data set, the proposed algorithm improved recall from 62% to 73% and precision from 72% to 81% compared with a term frequency statistics algorithm.
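
The abstract does not give the scoring formula, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a hypothetical toy lexicon (`TOY_SEMEMES`) standing in for HowNet, English tokens in place of segmented Chinese text, and an invented score (`phrase_score`, the mean document-level frequency of a phrase's sememes) in place of the paper's "chosen value". The `precision_recall` helper shows how figures such as those reported above are conventionally computed.

```python
from collections import Counter

# Hypothetical toy lexicon: each word maps to a few sememes (HowNet-style
# primitive meaning units). Real HowNet entries are far richer.
TOY_SEMEMES = {
    "engine": ["machine", "power"],
    "motor": ["machine", "power"],
    "turbine": ["machine", "rotate"],
    "method": ["abstract"],
    "blue": ["colour"],
}


def sememe_counts(tokens, lexicon):
    """Count how often each sememe occurs across all tokens of a document."""
    counts = Counter()
    for tok in tokens:
        counts.update(lexicon.get(tok, []))
    return counts


def phrase_score(phrase, counts, lexicon):
    """Assumed score: mean document-level frequency of the phrase's sememes."""
    sememes = [s for word in phrase for s in lexicon.get(word, [])]
    if not sememes:
        return 0.0
    return sum(counts[s] for s in sememes) / len(sememes)


def precision_recall(extracted, reference):
    """Standard set-based precision and recall for extracted key-phrases."""
    extracted, reference = set(extracted), set(reference)
    hits = len(extracted & reference)
    precision = hits / len(extracted) if extracted else 0.0
    recall = hits / len(reference) if reference else 0.0
    return precision, recall


# A pre-tokenized toy "document"; Chinese text would need word segmentation first.
tokens = ["engine", "motor", "turbine", "method", "blue", "engine"]
counts = sememe_counts(tokens, TOY_SEMEMES)

turbine_score = phrase_score(("turbine",), counts, TOY_SEMEMES)
blue_score = phrase_score(("blue",), counts, TOY_SEMEMES)
```

In this toy document "turbine" and "blue" each occur once, so plain term frequency cannot separate them. Because "turbine" shares the sememe "machine" with "engine" and "motor", its sememe-based score is higher, which illustrates why counting at the sememe level can surface phrases that term frequency alone would miss.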

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Jin, B., Teng, HF., Shi, YJ., Qu, FZ. (2007). Chinese Patent Mining Based on Sememe Statistics and Key-Phrase Extraction. In: Alhajj, R., Gao, H., Li, J., Li, X., Zaïane, O.R. (eds) Advanced Data Mining and Applications. ADMA 2007. Lecture Notes in Computer Science, vol 4632. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73871-8_48

  • DOI: https://doi.org/10.1007/978-3-540-73871-8_48

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-73870-1

  • Online ISBN: 978-3-540-73871-8

  • eBook Packages: Computer Science, Computer Science (R0)
