Abstract
We propose an efficient training method for Confidence Weighted Learning (CWL) algorithms on semi-structured text, and apply it to Adaptive Regularization of Weight Vectors (AROW), a CWL algorithm. CWL algorithms are online learning algorithms that combine large-margin training with confidence weighting of features. Because CWL algorithms learn a confidence weight for each feature, it is difficult to apply kernel methods, which expand features only implicitly; expanding all features in advance instead leads to increased memory usage. To solve this problem, we propose a training method that dynamically extracts features from semi-structured text. In addition, we propose a pruning method that improves training speed by skipping training samples that have been classified correctly at least a certain number of times. We compared our method, using word strings as semi-structured texts, with an AROW that expands all features in advance. Experimental results on text classification tasks over an Amazon data set show that our training method reduces memory usage and trains two to three times faster, while maintaining accuracy, when learning longer n-grams.
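The combination described in the abstract — lazy n-gram extraction, a diagonal AROW update, and count-based pruning — can be illustrated with a minimal Python sketch. This is not the paper's implementation: the names (`ngram_features`, `DynamicAROW`, `train`) are illustrative, plain dictionaries stand in for any efficient trie-based feature storage, and resetting the pruning counter on a mistake is one possible interpretation of "classified correctly a certain number of times".

```python
import random
from collections import defaultdict

def ngram_features(words, max_n):
    """Extract word n-grams (n = 1..max_n) from a token list on the fly."""
    feats = defaultdict(int)
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            feats[" ".join(words[i:i + n])] += 1
    return feats

class DynamicAROW:
    """Diagonal AROW whose feature weights are materialized lazily."""
    def __init__(self, r=1.0):
        self.r = r                              # regularization parameter
        self.mu = defaultdict(float)            # mean weight per feature
        self.sigma = defaultdict(lambda: 1.0)   # diagonal covariance

    def score(self, feats):
        return sum(self.mu[f] * v for f, v in feats.items())

    def update(self, feats, y):
        """Standard diagonal AROW update; returns the pre-update margin y*m."""
        m = self.score(feats)
        if y * m >= 1.0:                        # no hinge loss: no update
            return y * m
        v = sum(self.sigma[f] * val * val for f, val in feats.items())
        beta = 1.0 / (v + self.r)
        alpha = max(0.0, 1.0 - y * m) * beta
        for f, val in feats.items():
            self.mu[f] += alpha * y * self.sigma[f] * val
            self.sigma[f] -= beta * self.sigma[f] ** 2 * val * val
        return y * m

def train(samples, epochs=10, max_n=3, prune_at=3):
    """Train with pruning: skip samples already classified correctly
    prune_at times in a row (counter resets on a mistake; an assumption)."""
    model = DynamicAROW()
    correct = [0] * len(samples)
    for _ in range(epochs):
        order = list(range(len(samples)))
        random.shuffle(order)                   # fresh order each epoch
        for i in order:
            if correct[i] >= prune_at:          # pruning: sample is "easy"
                continue
            words, y = samples[i]               # y in {+1, -1}
            feats = ngram_features(words, max_n)
            margin = model.update(feats, y)
            correct[i] = correct[i] + 1 if margin > 0 else 0
    return model
```

The key point of the dynamic approach is that `ngram_features` is called per sample inside the loop, so n-grams are never stored for the whole corpus at once; only features that actually receive an update occupy memory in `mu` and `sigma`.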
Notes
- 1.
In our implementation, we randomly shuffled the training samples at the beginning and then used each of them in the shuffled order. After processing all the shuffled training samples, we shuffled them again and repeated the pass. Each training sample was therefore used 10 times.
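The shuffled-epoch procedure in this note can be sketched as follows; the helper name `epoch_iterator` and the fixed seed are illustrative, not from the paper.

```python
import random

def epoch_iterator(samples, epochs=10, seed=0):
    """Yield samples in a freshly shuffled order each epoch,
    so every sample is visited exactly `epochs` times."""
    rng = random.Random(seed)
    for _ in range(epochs):
        order = list(range(len(samples)))
        rng.shuffle(order)          # reshuffle before every pass
        for i in order:
            yield samples[i]
```

Reshuffling before every pass, rather than fixing one order, avoids the learner repeatedly seeing samples in the same sequence across epochs.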
Copyright information
© 2017 Springer International Publishing AG
Cite this paper
Iwakura, T. (2017). Efficient Training of Adaptive Regularization of Weight Vectors for Semi-structured Text. In: Kim, J., Shim, K., Cao, L., Lee, J.-G., Lin, X., Moon, Y.-S. (eds.) Advances in Knowledge Discovery and Data Mining. PAKDD 2017. Lecture Notes in Computer Science, vol. 10235. Springer, Cham. https://doi.org/10.1007/978-3-319-57529-2_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-57528-5
Online ISBN: 978-3-319-57529-2
eBook Packages: Computer Science, Computer Science (R0)