Abstract
Probably approximately correct (PAC) learnability is a crucial property of classification models in machine learning. Many classification algorithms have been introduced and validated only on benchmark data, with no further discussion of the conditions under which they are guaranteed to be learned successfully, because such PAC-learning questions are generally hard to address. Investigating the PAC learnability of a classification model becomes even more important when the model is applied to special data, such as gene microarray data. The rough hypercuboid classifier (RHC) is a novel classifier introduced for classification based on gene microarray data. By analyzing the VC-dimension and the time complexity of RHC, this paper proves that RHC is a PAC-learnable model. This result supports further applications of RHC to classifying cancers based on gene microarray data.
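The abstract's argument rests on the classical link between a finite VC-dimension and PAC learnability: once the VC-dimension of a hypothesis class is bounded, a sufficient training-sample size follows. As a minimal illustration (not taken from the paper, and not specific to RHC), the sufficient sample size from the classical theorem of Blumer, Ehrenfeucht, Haussler, and Warmuth can be sketched in Python; the function name `pac_sample_bound` is our own choice:

```python
import math

def pac_sample_bound(vc_dim: int, epsilon: float, delta: float) -> int:
    """Sufficient sample size for PAC learning a hypothesis class of
    VC-dimension vc_dim, per the classical bound of Blumer et al. (1989):
    m >= max((4/eps) * log2(2/delta), (8 * vc_dim / eps) * log2(13/eps)),
    guaranteeing error at most eps with probability at least 1 - delta."""
    term_confidence = (4.0 / epsilon) * math.log2(2.0 / delta)
    term_dimension = (8.0 * vc_dim / epsilon) * math.log2(13.0 / epsilon)
    return math.ceil(max(term_confidence, term_dimension))
```

Under this bound, proving a finite VC-dimension for RHC (as the paper does) immediately yields a polynomial sample complexity in 1/ε and 1/δ, which together with RHC's polynomial running time gives PAC learnability.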
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yu, T., Wei, J.M., Li, J. (2012). PAC Learnability of Rough Hypercuboid Classifier. In: Huang, D.S., Ma, J., Jo, K.H., Gromiha, M.M. (eds) Intelligent Computing Theories and Applications. ICIC 2012. Lecture Notes in Computer Science, vol 7390. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31576-3_82
DOI: https://doi.org/10.1007/978-3-642-31576-3_82
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-31575-6
Online ISBN: 978-3-642-31576-3