ABSTRACT
Automatic patent classification based on the TRIZ inventive principles is essential for patent management and industrial analysis. However, acquiring the labels that deep learning methods require is difficult and costly. This paper proposes a new two-stage semi-supervised learning framework called TRIZ-ESSL (Enhanced Semi-Supervised Learning for TRIZ), which uses both labeled and unlabeled data to improve prediction performance. TRIZ-ESSL combines semi-supervised sequence learning with a mixed objective function, a combination of cross-entropy, entropy minimization, adversarial, and virtual adversarial loss functions. In the first stage, TRIZ-ESSL trains a recurrent language model on unlabeled data. In the second stage, it initializes the weights of the LSTM-based model with the pre-trained recurrent language model and then trains the text classification model with the mixed objective function on both the labeled and unlabeled sets. On three TRIZ-based classification tasks, TRIZ-ESSL outperforms current popular semi-supervised training methods and BERT in terms of accuracy.
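The mixed objective named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a small linear softmax classifier stands in for the LSTM, perturbations are applied directly to the input vector rather than to word embeddings, and every name, the equal loss weights, the perturbation size `eps`, and the toy data are assumptions made for illustration. The sketch also assumes that the cross-entropy and adversarial terms use labeled examples only, while the entropy minimization and virtual adversarial terms use all examples.

```python
import math
import random


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(W, x):
    # Linear softmax classifier; a toy stand-in for the LSTM classifier.
    return softmax([sum(w * v for w, v in zip(row, x)) for row in W])


def cross_entropy(p, y):
    return -math.log(p[y])


def entropy(p):
    return -sum(pk * math.log(pk) for pk in p if pk > 0.0)


def kl(p, q):
    return sum(pk * math.log(pk / qk) for pk, qk in zip(p, q) if pk > 0.0)


def rescale(g, eps):
    # Rescale g to L2 norm eps (returns zeros if g is the zero vector).
    norm = math.sqrt(sum(v * v for v in g))
    return [eps * v / norm for v in g] if norm > 0.0 else [0.0] * len(g)


def input_grad(W, q, target):
    # For softmax(W x), the gradient of CE or KL w.r.t. x is W^T (q - target).
    return [sum(W[k][j] * (q[k] - target[k]) for k in range(len(W)))
            for j in range(len(W[0]))]


def adversarial_loss(W, x, y, eps):
    # Cross-entropy at the input shifted along the loss gradient.
    p = predict(W, x)
    onehot = [1.0 if k == y else 0.0 for k in range(len(p))]
    r = rescale(input_grad(W, p, onehot), eps)
    return cross_entropy(predict(W, [a + b for a, b in zip(x, r)]), y)


def virtual_adversarial_loss(W, x, eps, rng, xi=0.1):
    # Label-free: one power-iteration step from a small random direction,
    # then KL between predictions at x and at the perturbed input.
    p = predict(W, x)
    d = rescale([rng.gauss(0.0, 1.0) for _ in x], xi)
    q = predict(W, [a + b for a, b in zip(x, d)])
    r = rescale(input_grad(W, q, p), eps)
    return kl(p, predict(W, [a + b for a, b in zip(x, r)]))


def mixed_objective(W, labeled, unlabeled, eps=0.5,
                    weights=(1.0, 1.0, 1.0), seed=0):
    rng = random.Random(seed)
    w_em, w_at, w_vat = weights  # hypothetical weights, not from the paper
    ce = sum(cross_entropy(predict(W, x), y) for x, y in labeled) / len(labeled)
    at = sum(adversarial_loss(W, x, y, eps) for x, y in labeled) / len(labeled)
    inputs = [x for x, _ in labeled] + list(unlabeled)
    em = sum(entropy(predict(W, x)) for x in inputs) / len(inputs)
    vat = sum(virtual_adversarial_loss(W, x, eps, rng)
              for x in inputs) / len(inputs)
    return ce + w_em * em + w_at * at + w_vat * vat


# Toy data: 3 classes, 2 features (values are arbitrary).
W = [[1.0, -0.5], [-0.3, 0.8], [0.2, 0.1]]
labeled = [([1.0, 0.0], 0), ([0.0, 1.0], 1)]
unlabeled = [[0.5, 0.5], [-1.0, 0.3]]
loss = mixed_objective(W, labeled, unlabeled)
print(f"mixed objective: {loss:.4f}")
```

In a real training loop this scalar would be minimized with respect to the model weights by an automatic differentiation framework; the closed-form input gradient used here holds only for the linear softmax stand-in.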
Index Terms
- A Semi-Supervised Learning Framework for TRIZ-Based Chinese Patent Classification