A Method of Collecting Four Character Medicine Effect Phrases in TCM Patents Based on Semi-supervised Learning

  • Conference paper
Complex, Intelligent, and Software Intensive Systems (CISIS 2019)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 993)

Abstract

Owing to historical reasons and writing habits, the effects of medicines in Traditional Chinese Medicine (TCM) patents are often described using four-character phrases. Chinese word segmentation systems do not easily identify these phrases, which greatly affects the results of patent analysis and mining. This paper proposes a semi-supervised learning method to collect four-character effect phrases from the abstract texts of TCM patents; the collected phrases can enrich the lexicon of a Chinese word segmentation system and support semantic patent retrieval and analysis. The experimental results show the validity of the method.

Acknowledgments

This research is supported by the National Key Research and Development Program of China under grant number 2017YFC1405403, the National Natural Science Foundation of China under grant number 61075059, the Green Industry Technology Leading Project (product development category) of Hubei University of Technology under grant number CPYF2017008, the Natural Science Foundation of Anhui Province under grant number 1708085MF161, and the Key Project of Natural Science Research of Universities in Anhui under grant number KJ2015A236.

Author information

Corresponding author

Correspondence to Chen Xu.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Na, D. et al. (2020). A Method of Collecting Four Character Medicine Effect Phrases in TCM Patents Based on Semi-supervised Learning. In: Barolli, L., Hussain, F., Ikeda, M. (eds) Complex, Intelligent, and Software Intensive Systems. CISIS 2019. Advances in Intelligent Systems and Computing, vol 993. Springer, Cham. https://doi.org/10.1007/978-3-030-22354-0_41
