ABSTRACT
In this article, we propose a way to quantify the sensitivity of a data model w.r.t. its training data at a given input, using a derived model that takes a single data point as an artificial parameter, and we relate this notion of sensitivity to the complexity of the data model. For linear regression and ridge regression we give an explicit expression for the sensitivity so defined and study its properties. Moreover, we discuss the numerical approximation of sensitivity w.r.t. training data for neural networks and apply this approximation to remote sensing in the environmental sciences.
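The abstract does not reproduce the explicit expression, but the idea can be illustrated numerically. The sketch below assumes (as one plausible reading) that sensitivity at a query input is measured as the change in the model's prediction there when a single training target is perturbed; for ridge regression, whose solution is linear in the targets, this finite difference coincides with the exact derivative. All names (`ridge_fit`, `sensitivity`) are illustrative, not from the article.

```python
import numpy as np

def ridge_fit(X, y, lam):
    # Closed-form ridge regression: w = (X^T X + lam * I)^{-1} X^T y
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def sensitivity(X, y, lam, x_query, i, eps=1e-4):
    # Finite-difference sensitivity of the prediction at x_query
    # w.r.t. a perturbation of the i-th training target y[i].
    w0 = ridge_fit(X, y, lam)
    y_pert = y.copy()
    y_pert[i] += eps
    w1 = ridge_fit(X, y_pert, lam)
    return (x_query @ w1 - x_query @ w0) / eps

# Synthetic data: 50 samples, 3 features, known linear signal plus noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=50)

s = sensitivity(X, y, lam=1.0, x_query=np.ones(3), i=0)
print(s)
```

Because the ridge solution depends linearly on `y`, the same quantity is given in closed form by `x_query @ (X.T X + lam I)^{-1} x_i`, which makes the finite-difference approximation exact up to rounding; for neural networks no such closed form exists, which is where a numerical approximation of this kind becomes necessary.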