Abstract
We derive an upper and a lower bound on the sample size needed for PAC-learning a concept class in the presence of one-sided classification noise. The upper bound is achieved by the strategy "Minimum One-sided Disagreement". It matches the lower bound (which holds for any learning strategy) up to a logarithmic factor. Although "Minimum One-sided Disagreement" often leads to NP-hard combinatorial problems, we show that it can be implemented quite efficiently for some simple concept classes such as unions of intervals, axis-parallel rectangles, and TREE(2,n,2,k), a broad subclass of 2-level decision trees. For the first class, there is an easy algorithm with time bound O(m log m). For the second (resp. the third), we design an algorithm that applies the well-known UNION-FIND data structure and has an almost quadratic time bound (resp. time bound O(n² m log m)).
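The abstract invokes the UNION-FIND (disjoint-set) data structure for the rectangle and decision-tree algorithms. The following is a minimal generic sketch of that structure with path compression and union by rank, not the paper's specific application; the class name and interface are illustrative choices, not taken from the article.

```python
class UnionFind:
    """Disjoint-set forest with path compression and union by rank.

    A generic sketch of the structure the abstract refers to; with both
    heuristics, a sequence of m operations on n elements runs in
    near-linear time O(m alpha(n)).
    """

    def __init__(self, n: int) -> None:
        self.parent = list(range(n))  # each element starts as its own root
        self.rank = [0] * n           # upper bound on subtree height

    def find(self, x: int) -> int:
        # Path halving: every visited node is pointed at its grandparent,
        # flattening the tree for future queries.
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, x: int, y: int) -> bool:
        # Merge the sets containing x and y; return False if already merged.
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False
        # Union by rank: attach the shallower tree under the deeper root.
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return True
```

For example, after union(0, 1) and union(1, 2), elements 0 and 2 share a root while element 3 remains in its own singleton set.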
Cite this article
Simon, H.U. PAC-learning in the presence of one-sided classification noise. Ann Math Artif Intell 71, 283–300 (2014). https://doi.org/10.1007/s10472-012-9325-7