ABSTRACT
Semi-supervised learning is an essential approach to classification when the available labeled data is insufficient and we also need to make use of unlabeled data in the learning process. Numerous research efforts have focused on designing algorithms that improve the F1 score, but these algorithms lack any mechanism to control precision or recall individually. However, many applications have precision/recall preferences. For instance, an email spam classifier may require a precision of 0.9 so that useful emails are rarely dismissed as spam. In this paper, we propose a method that allows users to specify a precision/recall preference while maximising the F1 score. Our key idea is to divide the semi-supervised learning process into multiple rounds of supervised learning, where the classifier learned in each round is calibrated on a subset of the labeled dataset before it is applied to the unlabeled dataset to enlarge the training dataset. Our idea is applicable to a number of learning models, such as Support Vector Machines (SVMs), Bayesian networks and neural networks; we focus our research and implementation on SVMs. We conduct extensive experiments to validate the effectiveness of our method. The experimental results show that our method can train classifiers with a precision/recall preference, while the popular semi-supervised SVM training algorithm (which we use as the baseline) cannot. When the precision preference and the recall preference are set to be equal, which amounts to maximising the F1 score alone as the baseline does, our method achieves F1 scores better than or similar to those of the baseline. An additional advantage of our method is that it converges much faster than the baseline.
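The abstract gives no implementation, so the following is only an illustrative sketch of the round-based idea, not the authors' code. It assumes scikit-learn's LinearSVC as the supervised learner, binary labels in {0, 1}, a held-out labeled subset (X_cal, y_cal) reserved for calibration, and hypothetical names such as calibrated_self_training, precision_target, and margin. Each round fits an SVM on the current labeled pool, shifts the decision threshold so the calibration-set precision reaches the target, pseudo-labels unlabeled points far from the shifted boundary, and enlarges the training pool with them.

```python
# Illustrative sketch (not the paper's method): self-training with an
# SVM whose threshold is calibrated to a precision target each round.
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.metrics import precision_recall_curve

def calibrated_self_training(X_lab, y_lab, X_cal, y_cal, X_unlab,
                             precision_target=0.9, rounds=5, margin=1.0):
    X_train, y_train = X_lab.copy(), y_lab.copy()
    clf, threshold = None, 0.0
    for _ in range(rounds):
        # One round of supervised learning on the current labeled pool.
        clf = LinearSVC().fit(X_train, y_train)

        # Calibrate on the held-out labeled subset: pick the smallest
        # score threshold whose precision reaches the target, which
        # leaves recall as high as the precision constraint permits.
        scores = clf.decision_function(X_cal)
        prec, _, thr = precision_recall_curve(y_cal, scores)
        ok = np.flatnonzero(prec[:-1] >= precision_target)
        threshold = thr[ok[0]] if ok.size else 0.0

        if len(X_unlab) == 0:
            break
        # Pseudo-label unlabeled points far from the shifted boundary
        # and move them into the training pool for the next round.
        u = clf.decision_function(X_unlab) - threshold
        confident = np.abs(u) >= margin
        if not confident.any():
            break
        X_train = np.vstack([X_train, X_unlab[confident]])
        y_train = np.concatenate([y_train,
                                  (u[confident] >= 0).astype(int)])
        X_unlab = X_unlab[~confident]
    return clf, threshold
```

Choosing the smallest threshold that satisfies the precision target keeps recall as high as the constraint allows; a recall preference could be enforced symmetrically by thresholding on the recall curve instead.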