Abstract
Highly parameterized deep neural networks are known to have strong data-memorization capability, but does this ability to memorize random data also extend to simple, standard learning methods with few parameters? Following recent work exploring memorization in deep learning, we investigate memorization in standard non-neural learning models through the label recorder method, which uses a model's training accuracy on randomized data to estimate its memorization ability, yielding a distribution- and regularization-dependent label recording score. Label recording scores can be used to measure how capacity changes in response to regularization and other hyperparameter choices. The method is fully empirical, easy to implement, and applicable to any black-box classification method. The label recording score supplements existing theoretical measures of model capacity, such as Rademacher complexity and Vapnik-Chervonenkis (VC) dimension, while agreeing with conventional intuitions about statistical learning. We find that memorization ability is not limited to over-parameterized models; rather, it exists on a continuum and is present, to some degree, even in simple learning models with few parameters.
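The core procedure is simple to sketch: train a model on randomly assigned labels and report its training accuracy, averaged over several random labelings. The following pure-Python illustration (an assumption-laden sketch, not the paper's implementation; the function names, synthetic data, and the choice of k-nearest-neighbour classifiers are all hypothetical) shows the continuum the abstract describes: an unregularized 1-NN model memorizes any labeling perfectly, while increasing k, a form of regularization, lowers the score.

```python
import random

def knn_train_accuracy(X, y, k):
    """Training accuracy of a k-nearest-neighbour classifier on (X, y)."""
    correct = 0
    for i, xi in enumerate(X):
        # Indices of all training points, sorted by squared distance to xi
        # (xi itself is included, at distance zero).
        order = sorted(range(len(X)),
                       key=lambda j: sum((a - b) ** 2 for a, b in zip(xi, X[j])))
        votes = [y[j] for j in order[:k]]
        pred = max(set(votes), key=votes.count)  # majority vote of k neighbours
        correct += (pred == y[i])
    return correct / len(X)

def label_recording_score(k, n=200, d=3, n_trials=5, seed=0):
    """Mean training accuracy on uniformly random binary labels."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_trials):
        X = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(n)]
        y = [rng.randrange(2) for _ in range(n)]  # fully randomized labels
        scores.append(knn_train_accuracy(X, y, k))
    return sum(scores) / len(scores)

print(label_recording_score(k=1))   # 1.0: 1-NN memorizes any labeling
print(label_recording_score(k=25))  # well below 1.0: regularization limits capacity
```

With k=1 each point is its own nearest neighbour, so the score is exactly 1.0; with k=25 the vote is dominated by random neighbouring labels, and the score sits only modestly above chance. The same black-box recipe applies to any classifier exposing fit/score behavior.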
K. Rong and A. Khant—These authors contributed equally.
Copyright information
© 2022 Springer Nature Switzerland AG
Cite this paper
Rong, K., Khant, A., Flores, D., Montañez, G.D. (2022). The Label Recorder Method. In: Nicosia, G., et al. Machine Learning, Optimization, and Data Science. LOD 2021. Lecture Notes in Computer Science(), vol 13163. Springer, Cham. https://doi.org/10.1007/978-3-030-95467-3_42
Print ISBN: 978-3-030-95466-6
Online ISBN: 978-3-030-95467-3