Abstract
We advocate analyzing the average-case complexity of learning problems and introduce an appropriate framework for this purpose. Based on it, we consider the problem of learning monomials, and the special case of learning monotone monomials, in the limit and for on-line prediction in two variants: from positive data only, and from both positive and negative examples. The well-known Wholist algorithm is completely analyzed, in particular its average-case behavior with respect to the class of binomial distributions. We consider different complexity measures: the number of mind changes, the number of prediction errors, and the total learning time. Tight bounds are obtained, implying that worst-case bounds are too pessimistic; on average, learning can be achieved exponentially faster.
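For concreteness, the following is a minimal sketch (in Python) of the classical Wholist strategy restricted to monotone monomials learned from positive examples only; the function name, interface, and explicit mind-change counter are illustrative additions and are not taken verbatim from the paper.

def wholist(positive_examples, n):
    # Minimal sketch of the Wholist strategy for monotone monomials over {0,1}^n.
    # Start with the conjunction of all n variables; on every positive example,
    # delete each variable that the example sets to 0.  Each step in which the
    # hypothesis actually shrinks counts as one mind change.
    hypothesis = set(range(n))      # indices of variables still in the monomial
    mind_changes = 0
    for x in positive_examples:     # x is a 0/1 sequence of length n labeled positive
        to_delete = {i for i in hypothesis if x[i] == 0}
        if to_delete:
            hypothesis -= to_delete
            mind_changes += 1
    return hypothesis, mind_changes

# Example: target x0 AND x2 over n = 4 variables.
# wholist([(1, 1, 1, 0), (1, 0, 1, 1)], 4) returns ({0, 2}, 2).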
Furthermore, we study a new learning model, stochastic finite learning, in which, in contrast to PAC learning, some information about the underlying distribution is given and the goal is to find a correct (not merely approximately correct) hypothesis. We develop techniques for obtaining good bounds for stochastic finite learning from a precise average-case analysis of strategies for learning in the limit, and we illustrate our approach for the case of learning monomials.
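The general idea can be illustrated roughly as follows: a limit learner is run on a sample whose size is computed from the available distributional information so that, with confidence at least 1 - delta, convergence has already occurred, and the resulting hypothesis is output once and never revised. The sketch below uses a generic Markov-style sample-size bound purely for illustration; the paper derives much sharper tail bounds for the Wholist algorithm under binomial distributions, and all names and parameters here are assumptions.

import math

def stochastic_finite_learner(draw_example, update, expected_stage, delta):
    # Illustrative-only wrapper: turn a limit learner into a stochastic finite
    # learner, given an upper bound 'expected_stage' on the expected convergence
    # stage.  By Markov's inequality, the probability that convergence needs more
    # than expected_stage / delta examples is at most delta, so the hypothesis
    # returned below is correct with probability at least 1 - delta.
    sample_size = math.ceil(expected_stage / delta)
    hypothesis = None
    for _ in range(sample_size):
        hypothesis = update(hypothesis, draw_example())
    return hypothesis               # output exactly once; no further mind changes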
Part of this work was performed while visiting the Department of Informatics at Kyushu University, supported by the Japan Society for the Promotion of Science under Grant JSPS 29716102.
Supported by the Grant-in-Aid for Scientific Research in Fundamental Areas from the Japanese Ministry of Education, Science, Sports, and Culture under Grant No. 10558047.
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
Cite this paper
Reischuk, R., Zeugmann, T. (1999). A Complete and Tight Average-Case Analysis of Learning Monomials. In: Meinel, C., Tison, S. (eds) STACS 99. STACS 1999. Lecture Notes in Computer Science, vol 1563. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-49116-3_39
DOI: https://doi.org/10.1007/3-540-49116-3_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65691-3
Online ISBN: 978-3-540-49116-3