Abstract
To protect data privacy, a new transfer group probability Naive Bayes algorithm, TrGNB, is proposed. TrGNB applies to scenarios in which the source domain contains a large amount of labeled data, while the target domain offers only a small amount of group probability information about its unlabeled data. TrGNB integrates the ideas of transfer learning and group probability information into the Naive Bayes model, which not only improves classification performance on the target-domain learning task but also protects data privacy. TrGNB was verified on the 20-Newsgroups, Reuters-21578 and Email spam datasets. The experimental results show that TrGNB significantly improves classification accuracy compared with the benchmark algorithms.
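The abstract does not give TrGNB's update rules, so the following is only a toy sketch of the setting it describes, not the authors' algorithm. It assumes a multinomial Naive Bayes model trained on labeled source data and then refined on target data for which only per-group class proportions are known. The function names (`fit_nb`, `posterior`, `match_group_probs`, `transfer_nb_with_group_probs`), the rescaling heuristic, and the tiny arrays are all hypothetical choices made for illustration.

```python
import numpy as np

def fit_nb(X, resp, alpha=1.0):
    """Fit multinomial Naive Bayes from soft class weights resp (n x k)."""
    prior = resp.sum(axis=0) + alpha
    log_prior = np.log(prior / prior.sum())
    counts = resp.T @ X + alpha                      # k x d smoothed word counts
    log_lik = np.log(counts / counts.sum(axis=1, keepdims=True))
    return log_prior, log_lik

def posterior(X, log_prior, log_lik):
    """Class posteriors P(c | x) for each row of X."""
    joint = X @ log_lik.T + log_prior
    joint = joint - joint.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(joint)
    return p / p.sum(axis=1, keepdims=True)

def match_group_probs(post, groups, group_probs):
    """Assumed heuristic: rescale posteriors so that each group's mean
    posterior moves toward that group's known class proportions."""
    out = post.copy()
    for g in range(group_probs.shape[0]):
        idx = groups == g
        if not idx.any():
            continue
        mean = out[idx].mean(axis=0)
        out[idx] *= group_probs[g] / np.maximum(mean, 1e-12)
        out[idx] /= out[idx].sum(axis=1, keepdims=True)
    return out

def transfer_nb_with_group_probs(Xs, ys, Xt, groups, group_probs, n_iter=10):
    """Start from source-domain NB, then alternate between estimating soft
    target labels (constrained by group proportions) and refitting NB."""
    k = group_probs.shape[1]
    resp_s = np.eye(k)[ys]                            # one-hot source labels
    log_prior, log_lik = fit_nb(Xs, resp_s)
    for _ in range(n_iter):
        resp_t = match_group_probs(
            posterior(Xt, log_prior, log_lik), groups, group_probs)
        log_prior, log_lik = fit_nb(
            np.vstack([Xs, Xt]), np.vstack([resp_s, resp_t]))
    return log_prior, log_lik

# Hypothetical toy data: 3 "words", 2 classes, 2 target groups.
Xs = np.array([[3, 0, 1], [2, 1, 0], [0, 3, 1], [1, 2, 0]])
ys = np.array([0, 0, 1, 1])
Xt = np.array([[2, 0, 2], [3, 1, 1], [0, 2, 2], [1, 3, 1]])
groups = np.array([0, 0, 1, 1])
group_probs = np.array([[0.9, 0.1], [0.1, 0.9]])     # class proportions per group

log_prior, log_lik = transfer_nb_with_group_probs(Xs, ys, Xt, groups, group_probs)
post = posterior(Xt, log_prior, log_lik)
pred = post.argmax(axis=1)
print(pred)
```

In this sketch no individual target label is ever read: the target domain contributes only `group_probs`, which is the privacy argument the abstract makes. Any real comparison with TrGNB would require the paper's actual objective and estimation procedure.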
Acknowledgements
This work was supported by the National Key Research and Development Plan of China under Grant No. 2016YFB0801004.
Cite this article
Li, J., Wu, W. & Xue, D. Transfer Naive Bayes algorithm with group probabilities. Appl Intell 50, 61–73 (2020). https://doi.org/10.1007/s10489-019-01512-6
DOI: https://doi.org/10.1007/s10489-019-01512-6