Abstract
Dataset distillation is used to create a concise yet informative synthetic dataset that can replace the original dataset for training purposes. Some leading methods in this domain prioritize long-range matching: they unroll training trajectories with a fixed number of steps (\(N_{S}\)) on the synthetic dataset to align with various expert training trajectories. However, traditional long-range matching methods suffer from an overfitting-like problem: the fixed step size \(N_{S}\) forces the synthetic dataset to conform distortedly to the seen expert training trajectories, resulting in a loss of generality, especially toward trajectories from unencountered architectures. We refer to this as the Accumulated Mismatching Problem (AMP) and propose a new approach, Automatic Training Trajectories (ATT), which dynamically and adaptively adjusts the trajectory length \(N_{S}\) to address the AMP. Our method outperforms existing methods, particularly in cross-architecture tests. Moreover, owing to its adaptive nature, it exhibits enhanced stability in the face of parameter variations. Our source code is publicly available at https://github.com/NiaLiu/ATT.
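The abstract's core mechanism, unrolling a student trajectory on the synthetic dataset and matching it against an expert trajectory while choosing \(N_{S}\) adaptively rather than fixing it, can be illustrated with a toy sketch. The Python snippet below is not the authors' ATT implementation: the linear model, the squared-error distance to the expert endpoint, and the rule of picking the step count that lands closest to that endpoint are all simplifying assumptions made purely for illustration.

```python
# Minimal sketch of trajectory matching with an adaptive number of synthetic
# steps, in the spirit of the abstract. This is NOT the authors' ATT code:
# the toy linear model, the squared-error matching distance, and the rule for
# picking the step count are illustrative assumptions only.
import numpy as np

rng = np.random.default_rng(0)

def grad_step(theta, X, y, lr):
    """One gradient-descent step for a linear least-squares model."""
    grad = X.T @ (X @ theta - y) / len(y)
    return theta - lr * grad

# "Expert" trajectory segment: start and end parameters recorded while
# training on a (toy) real dataset.
X_real = rng.normal(size=(256, 8))
theta_true = rng.normal(size=8)
y_real = X_real @ theta_true + 0.1 * rng.normal(size=256)

theta_start = rng.normal(size=8)
theta_expert = theta_start.copy()
for _ in range(20):                      # expert trains 20 steps on real data
    theta_expert = grad_step(theta_expert, X_real, y_real, lr=0.05)

# Toy synthetic dataset (held fixed here; in dataset distillation these
# samples would be the variables being optimized).
X_syn = rng.normal(size=(16, 8))
y_syn = X_syn @ theta_true

def match_with_adaptive_steps(theta_start, theta_target, X_syn, y_syn,
                              max_steps=40, lr=0.05):
    """Unroll student steps on synthetic data and return the step count whose
    parameters land closest to the expert target, instead of a fixed N_S
    (an assumed selection rule used only for this sketch)."""
    theta = theta_start.copy()
    best_dist, best_n = np.linalg.norm(theta - theta_target), 0
    for n in range(1, max_steps + 1):
        theta = grad_step(theta, X_syn, y_syn, lr)
        dist = np.linalg.norm(theta - theta_target)
        if dist < best_dist:
            best_dist, best_n = dist, n
    return best_n, best_dist

n_s, dist = match_with_adaptive_steps(theta_start, theta_expert, X_syn, y_syn)
print(f"adaptive N_S = {n_s}, distance to expert endpoint = {dist:.4f}")
```

In the full method, the synthetic samples themselves would be updated through such an unrolled matching loss; they are held fixed here only to keep the sketch short.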
Acknowledgment
This work is funded by the Bayerische Forschungsstiftung under the research grant "Von der Edge zur Cloud und zurück: Skalierbare und Adaptive Sensordatenverarbeitung" (From the Edge to the Cloud and Back: Scalable and Adaptive Sensor Data Processing, AZ-1468-20), and supported by AI systems hosted and operated by the Leibniz-Rechenzentrum (LRZ) der Bayerischen Akademie der Wissenschaften. Further, part of the results were obtained on systems in the test environment BEAST (Bavarian Energy Architecture & Software Testbed) at the Leibniz Supercomputing Centre.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Liu, D., Gu, J., Cao, H., Trinitis, C., Schulz, M. (2025). Dataset Distillation by Automatic Training Trajectories. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15145. Springer, Cham. https://doi.org/10.1007/978-3-031-73021-4_20
DOI: https://doi.org/10.1007/978-3-031-73021-4_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-73020-7
Online ISBN: 978-3-031-73021-4
eBook Packages: Computer Science, Computer Science (R0)