Dataset Distillation by Automatic Training Trajectories

  • Conference paper
Computer Vision – ECCV 2024 (ECCV 2024)

Abstract

Dataset distillation is used to create a concise yet informative synthetic dataset that can replace the original dataset for training purposes. Some leading methods in this domain prioritize long-range matching, unrolling training trajectories for a fixed number of steps (\(N_{S}\)) on the synthetic dataset to align them with various expert training trajectories. However, traditional long-range matching methods suffer from an overfitting-like problem: the fixed step count \(N_{S}\) forces the synthetic dataset to conform distortedly to the expert training trajectories seen during distillation, resulting in a loss of generality, especially toward trajectories from unencountered architectures. We refer to this as the Accumulated Mismatching Problem (AMP) and propose a new approach, Automatic Training Trajectories (ATT), which dynamically and adaptively adjusts the trajectory length \(N_{S}\) to address the AMP. Our method outperforms existing methods, particularly in cross-architecture tests. Moreover, owing to its adaptive nature, it exhibits enhanced stability in the face of parameter variations. Our source code is publicly available at https://github.com/NiaLiu/ATT.
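
To make the matching objective concrete, the following is a minimal PyTorch sketch of long-range trajectory matching with an adaptively selected trajectory length, in the spirit of ATT. It is an illustration under stated assumptions rather than the authors' implementation (see the repository above for that): the tiny stand-in network, the randomly perturbed "expert" target, and helper names such as `att_matching_loss` and `flat` are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.func import functional_call


def flat(params):
    # Flatten a dict of parameter tensors into a single vector.
    return torch.cat([p.reshape(-1) for p in params.values()])


def att_matching_loss(net, syn_x, syn_y, theta_start, theta_target,
                      n_max=20, lr_inner=0.01):
    """Unroll up to n_max SGD steps on the synthetic data and return the
    matching loss at the step whose parameters land closest to the expert
    target, rather than always at a fixed step count N_S (hypothetical
    sketch, not the released ATT code)."""
    params = {k: v.clone().requires_grad_(True) for k, v in theta_start.items()}
    target = flat(theta_target)
    # Normalizer: distance the expert itself traveled over the segment.
    denom = (flat(theta_start) - target).pow(2).sum().detach()

    best = None
    for _ in range(n_max):
        out = functional_call(net, params, (syn_x,))
        grads = torch.autograd.grad(F.cross_entropy(out, syn_y),
                                    list(params.values()), create_graph=True)
        # Differentiable inner SGD step on the synthetic batch.
        params = {k: p - lr_inner * g
                  for (k, p), g in zip(params.items(), grads)}
        loss = (flat(params) - target).pow(2).sum() / denom
        # Keep the best-matching unrolled step; torch.minimum stays
        # differentiable through whichever argument is smaller.
        best = loss if best is None else torch.minimum(best, loss)
    return best  # differentiable w.r.t. syn_x


# Usage: optimize the synthetic images by descending the matching loss.
net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
theta_start = {k: v.detach().clone() for k, v in net.named_parameters()}
# A perturbed copy stands in for the expert's later checkpoint here.
theta_target = {k: v + 0.01 * torch.randn_like(v)
                for k, v in theta_start.items()}

syn_x = torch.randn(10, 3, 32, 32, requires_grad=True)  # learnable images
syn_y = torch.arange(10)                                 # one image per class
opt = torch.optim.SGD([syn_x], lr=0.1)

opt.zero_grad()
loss = att_matching_loss(net, syn_x, syn_y, theta_start, theta_target)
loss.backward()
opt.step()
print(f"matching loss: {loss.item():.4f}")
```

Instead of always taking the matching loss after a fixed \(N_{S}\) unrolled steps, the sketch keeps the loss at the unrolled step that lands closest to the expert checkpoint, which is the adaptive behavior the abstract describes.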

Acknowledgment

This work is funded by the Bayerische Forschungsstiftung under the research grant "Von der Edge zur Cloud und zurück: Skalierbare und Adaptive Sensordatenverarbeitung" ("From the Edge to the Cloud and Back: Scalable and Adaptive Sensor Data Processing", AZ-1468-20), and supported by AI systems hosted and operated by the Leibniz-Rechenzentrum (LRZ) der Bayerischen Akademie der Wissenschaften. Furthermore, part of the results were obtained on systems in the test environment BEAST (Bavarian Energy Architecture & Software Testbed) at the Leibniz Supercomputing Centre.

Author information

Corresponding author

Correspondence to Dai Liu.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (PDF 7265 KB)

Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Liu, D., Gu, J., Cao, H., Trinitis, C., Schulz, M. (2025). Dataset Distillation by Automatic Training Trajectories. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15145. Springer, Cham. https://doi.org/10.1007/978-3-031-73021-4_20

  • DOI: https://doi.org/10.1007/978-3-031-73021-4_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-73020-7

  • Online ISBN: 978-3-031-73021-4

  • eBook Packages: Computer Science, Computer Science (R0)
