A new constrained maximum margin approach to discriminative learning of Bayesian classifiers

Guo, Ke; Liu, Xia-bi; Guo, Lun-hao; Li, Zong-jie; Geng, Zeng-min

doi:10.1631/FITEE.1700007

Published: 16 July 2018

Frontiers of Information Technology & Electronic Engineering, Volume 19, pages 639–650 (2018)

Ke Guo¹ (ORCID: orcid.org/0000-0002-9278-4046), Xia-bi Liu¹, Lun-hao Guo¹, Zong-jie Li¹ & Zeng-min Geng²

Abstract

We propose a novel discriminative learning approach for Bayesian pattern classification, called ‘constrained maximum margin (CMM)’. We define the margin between two classes as the difference between the minimum decision value for positive samples and the maximum decision value for negative samples. The learning problem is to maximize this margin under the constraint that each training pattern is classified correctly. The resulting nonlinear programming problem is solved using the sequential unconstrained minimization technique. We applied the proposed CMM approach to learn Bayesian classifiers based on Gaussian mixture models and conducted experiments on 10 UCI datasets. The performance of our approach was compared with that of the expectation-maximization algorithm, the support vector machine, and other state-of-the-art approaches. The experimental results demonstrated the effectiveness of our approach.
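
The margin definition in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `threshold` decision rule, and the quadratic exterior penalty are assumptions made for illustration, since the abstract does not state the decision function or which penalty or barrier form the sequential unconstrained minimization technique uses in the paper.

```python
from typing import Sequence


def margin(decision_values: Sequence[float], is_positive: Sequence[bool]) -> float:
    """Minimum decision value over positives minus maximum over negatives."""
    positives = [d for d, p in zip(decision_values, is_positive) if p]
    negatives = [d for d, p in zip(decision_values, is_positive) if not p]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    return min(positives) - max(negatives)


def penalized_objective(
    decision_values: Sequence[float],
    is_positive: Sequence[bool],
    threshold: float = 0.0,
    penalty_weight: float = 1.0,
) -> float:
    """Unconstrained surrogate: negative margin plus a penalty per misclassified sample.

    Assumption (not from the paper): a sample counts as correctly classified
    when a positive scores above `threshold` and a negative scores below it,
    and violations are penalised quadratically.
    """
    violation = 0.0
    for d, p in zip(decision_values, is_positive):
        # Positive slack means the sample lies on the wrong side of the threshold.
        slack = (threshold - d) if p else (d - threshold)
        violation += max(0.0, slack) ** 2
    return -margin(decision_values, is_positive) + penalty_weight * violation
```

Under these assumptions, a sequential scheme would minimise `penalized_objective` over the classifier parameters for an increasing sequence of `penalty_weight` values, reusing each solution as the starting point for the next subproblem.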

Author information

Authors and Affiliations

Beijing Laboratory of Intelligent Information Technology, School of Computer Science, Beijing Institute of Technology, Beijing, 100081, China
Ke Guo, Xia-bi Liu, Lun-hao Guo & Zong-jie Li
Computer Information Center, Beijing Institute of Fashion Technology, Beijing, 100029, China
Zeng-min Geng

Corresponding author

Correspondence to Zeng-min Geng.

Additional information

Project supported by the National Natural Science Foundation of China (Nos. 60973059 and 81171407) and the Program for New Century Excellent Talents in University, China (No. NCET-10-0044).

About this article

Cite this article

Guo, K., Liu, Xb., Guo, Lh. et al. A new constrained maximum margin approach to discriminative learning of Bayesian classifiers. Frontiers Inf Technol Electronic Eng 19, 639–650 (2018). https://doi.org/10.1631/FITEE.1700007

Received: 04 January 2017
Accepted: 21 August 2017
Published: 16 July 2018
Issue Date: May 2018
DOI: https://doi.org/10.1631/FITEE.1700007

Key words

CLC number

TP391
