Abstract
Positive and unlabeled learning (PU learning) addresses the situation in which only positive and unlabeled examples are available. Most previous work has been devoted to identifying negative examples within the unlabeled data, so that supervised learning approaches can then be applied to build a classifier. However, the remaining unlabeled data are either excluded from the learning phase or forced to belong to a class, and this limits the performance of PU learning. In addition, previous PU methods assume that the training data and the test data have the same feature representation. In practice, however, we can often collect features that are available for the training data but not for the test data; such features are called privileged information. In this paper, we propose a new similarity-based method for positive and unlabeled learning with privileged information (SPUPIL), which consists of two steps. SPUPIL first applies the k-nearest-neighbor (KNN) method to generate similarity weights; these weights and the privileged information are then incorporated into a Ranking-SVM-based learning model to build a more accurate classifier. We also use the Lagrangian method to transform the original model into its dual problem, which we solve to obtain the classifier. Extensive experiments on real data sets show that SPUPIL outperforms state-of-the-art PU learning methods.
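The first step of the method, weighting unlabeled examples by their similarity to the positive class via KNN, can be illustrated with a minimal sketch. The weighting formula below (an inverse of the mean distance to the k nearest positives) is a hypothetical stand-in for the paper's exact scheme, which is not specified in the abstract:

```python
import numpy as np

def knn_similarity_weights(positives, unlabeled, k=3):
    """Assign each unlabeled example a weight in (0, 1] from its mean
    distance to its k nearest positive examples. Illustrative only;
    not the exact SPUPIL weighting formula."""
    weights = []
    for u in unlabeled:
        # distances from this unlabeled point to every positive example
        dists = np.sort(np.linalg.norm(positives - u, axis=1))[:k]
        # closer to the positive class -> weight nearer 1
        weights.append(1.0 / (1.0 + dists.mean()))
    return np.array(weights)

P = np.array([[0.0, 0.0], [0.1, 0.1], [0.2, 0.0]])   # positive examples
U = np.array([[0.05, 0.05], [5.0, 5.0]])             # unlabeled examples
w = knn_similarity_weights(P, U, k=2)
```

Here the unlabeled point lying among the positives receives a weight close to 1, while the distant point receives a weight close to 0; such weights can then enter the Ranking-SVM objective as per-example confidences.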
Acknowledgement
The authors would like to thank the anonymous referees for their insightful comments and suggestions. This work was supported in part by the Natural Science Foundation of China under Grants 62076074, 61876044, and 61672169; in part by the Guangdong Basic and Applied Basic Research Foundation under Grants 2020A1515010670 and 2020A1515011501; and in part by the Science and Technology Planning Project of Guangzhou under Grant 202002030141.
Appendix A
By introducing Lagrange multipliers \(\alpha_{a} \ge 0\), \(\alpha_{b} \ge 0\), \(\alpha_{c} \ge 0\), \(\beta_{a} \ge 0\), \(\beta_{b} \ge 0\), \(\beta_{c} \ge 0\), \(\lambda_{a} \ge 0\), \(\lambda_{b} \ge 0\), and \(\lambda_{c} \ge 0\), we obtain the following Lagrange function L for the objective function (2):
Setting the partial derivatives of L with respect to w, \(w^{*}\), \(\xi_{a}^{*}\), \(\xi_{b}^{*}\), and \(\xi_{c}^{*}\) equal to zero, respectively, we obtain:
Substituting (9)-(13) into (8) yields the dual problem (3). This completes the proof.
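The derivation above follows the standard Lagrangian-dual pattern for SVM-type objectives. As a point of reference only, using the textbook soft-margin SVM rather than the SPUPIL objective (2), the same steps read:

```latex
\begin{aligned}
L(w, b, \xi, \alpha, \mu)
  &= \tfrac{1}{2}\|w\|^{2} + C\sum_{i}\xi_{i}
     - \sum_{i}\alpha_{i}\bigl[y_{i}(w^{\top}x_{i}+b) - 1 + \xi_{i}\bigr]
     - \sum_{i}\mu_{i}\xi_{i},\\
\frac{\partial L}{\partial w} = 0
  &\;\Rightarrow\; w = \sum_{i}\alpha_{i}y_{i}x_{i}, \qquad
\frac{\partial L}{\partial b} = 0
   \;\Rightarrow\; \sum_{i}\alpha_{i}y_{i} = 0, \qquad
\frac{\partial L}{\partial \xi_{i}} = 0
   \;\Rightarrow\; \alpha_{i} = C - \mu_{i}.
\end{aligned}
```

Substituting these conditions back into L eliminates w, b, and \(\xi\) and gives the dual \(\max_{\alpha} \sum_{i}\alpha_{i} - \tfrac{1}{2}\sum_{i,j}\alpha_{i}\alpha_{j}y_{i}y_{j}x_{i}^{\top}x_{j}\) subject to \(0 \le \alpha_{i} \le C\) and \(\sum_{i}\alpha_{i}y_{i} = 0\); the proof of (3) applies the identical substitution to the multipliers \(\alpha\), \(\beta\), \(\lambda\) of objective (2).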
Cite this article
Liu, B., Liu, Q. & Xiao, Y. A new method for positive and unlabeled learning with privileged information. Appl Intell 52, 2465–2479 (2022). https://doi.org/10.1007/s10489-021-02528-7