research-article

A Semi-Supervised Learning Framework for TRIZ-Based Chinese Patent Classification

Published: 20 August 2020

ABSTRACT

Automatic patent classification based on the TRIZ inventive principles is essential for patent management and industrial analysis. However, acquiring the labels that deep learning methods need is extraordinarily difficult and costly. This paper proposes a new two-stage semi-supervised learning framework called TRIZ-ESSL (Enhanced Semi-Supervised Learning for TRIZ), which uses both labeled and unlabeled data to improve prediction performance. TRIZ-ESSL combines semi-supervised sequence learning with a mixed objective function, a combination of cross-entropy, entropy minimization, adversarial, and virtual adversarial loss functions. In the first stage, TRIZ-ESSL trains a recurrent language model on unlabeled data. In the second stage, it initializes the weights of an LSTM-based model with the pre-trained recurrent language model and then trains the text classification model with the mixed objective function on both the labeled and unlabeled sets. On three TRIZ-based classification tasks, TRIZ-ESSL outperforms currently popular semi-supervised training methods and BERT in terms of accuracy.
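
The mixed objective named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: a tiny linear softmax classifier stands in for the LSTM, perturbations are applied to raw input vectors rather than word embeddings, and the names `mixed_objective`, `eps`, `xi`, `lam_em`, `lam_at`, and `lam_vat`, as well as the choice of which data subset each term uses, are assumptions made for illustration only.

```python
import math
import random

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def predict(x, W):
    # Linear softmax classifier (hypothetical stand-in for the LSTM); W is d x k.
    k = len(W[0])
    return softmax([sum(x[i] * W[i][j] for i in range(len(x))) for j in range(k)])

def entropy(p):
    return -sum(v * math.log(v) for v in p if v > 0.0)

def kl(p, q):
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0.0)

def scaled(g, eps):
    # Rescale g to L2 norm eps (zero vector if g is zero).
    norm = math.sqrt(sum(v * v for v in g))
    return [eps * v / norm for v in g] if norm > 0.0 else [0.0] * len(g)

def input_grad(W, delta):
    # For linear logits, d(loss)/dx = W @ d(loss)/d(logits).
    return [sum(W[i][j] * delta[j] for j in range(len(delta))) for i in range(len(W))]

def shifted(x, r):
    return [a + b for a, b in zip(x, r)]

def adversarial_loss(x, y, W, eps):
    # Cross-entropy at x + r, where r follows the gradient of the supervised loss.
    p = predict(x, W)
    delta = [p[j] - (1.0 if j == y else 0.0) for j in range(len(p))]
    r = scaled(input_grad(W, delta), eps)
    return -math.log(predict(shifted(x, r), W)[y])

def virtual_adversarial_loss(x, W, eps, rng, xi=0.1):
    # KL between predictions at x and at x + r; r is found without a label
    # by one gradient step of the KL from a small random direction.
    p = predict(x, W)
    d = scaled([rng.gauss(0.0, 1.0) for _ in x], xi)
    q = predict(shifted(x, d), W)
    r = scaled(input_grad(W, [b - a for a, b in zip(p, q)]), eps)
    return kl(p, predict(shifted(x, r), W))

def mixed_objective(labeled, unlabeled, W, eps=0.5,
                    lam_em=1.0, lam_at=1.0, lam_vat=1.0, seed=0):
    rng = random.Random(seed)
    n = len(labeled)
    ce = sum(-math.log(predict(x, W)[y]) for x, y in labeled) / n
    at = sum(adversarial_loss(x, y, W, eps) for x, y in labeled) / n
    # Assumption: entropy minimization on unlabeled inputs, VAT on all inputs.
    em = sum(entropy(predict(x, W)) for x in unlabeled) / len(unlabeled)
    all_x = [x for x, _ in labeled] + list(unlabeled)
    vat = sum(virtual_adversarial_loss(x, W, eps, rng) for x in all_x) / len(all_x)
    total = ce + lam_em * em + lam_at * at + lam_vat * vat
    return {"ce": ce, "em": em, "at": at, "vat": vat, "total": total}

# Toy data: 3 features, 2 classes (values are arbitrary).
W = [[0.5, -0.3], [-0.2, 0.8], [0.1, 0.4]]
labeled = [([1.0, 0.0, 2.0], 0), ([0.0, 1.0, 1.0], 1)]
unlabeled = [[0.5, 0.5, 0.0], [2.0, -1.0, 0.5]]
losses = mixed_objective(labeled, unlabeled, W)
```

In a real training loop the `total` value would be minimized with respect to the model weights; here it is only evaluated, to show how the four terms draw on labeled and unlabeled examples.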

    • Published in

      ICCAI '20: Proceedings of the 2020 6th International Conference on Computing and Artificial Intelligence
      April 2020
      563 pages
      ISBN: 9781450377089
      DOI: 10.1145/3404555

      Copyright © 2020 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Qualifiers

      • research-article
      • Research
      • Refereed limited