Abstract
Detecting the edits applied to an image from just a pair of input-output images is an interesting task for artificial intelligence. If the possible image transformations are known, the task can be solved by brute-force enumeration, yet this becomes infeasible for long sequences. There are several state-of-the-art approaches, mostly in the field of image forensics, which aim to detect those transformations; however, all related research focuses on detecting single transformations rather than a sequence of them. In this work, we present the Image Transformation Sequence Retrieval (ITSR) problem and describe a first attempt to solve it with existing technology. Our results demonstrate how difficult it is to obtain good performance (in some cases worse than a random guess) and the need to develop specific solutions for ITSR.
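The brute-force baseline mentioned above can be sketched on a toy example. This is a minimal illustration, not the method evaluated in the paper: the four transformations, the tuple-of-rows image representation, and the names `brute_force_retrieve` and `search_space_size` are all assumptions made for this sketch.

```python
from itertools import product

# Toy "images": tuples of rows of 8-bit grey values (an assumption for illustration).
def flip_h(img):
    return tuple(tuple(reversed(row)) for row in img)

def flip_v(img):
    return tuple(reversed(img))

def rot90(img):
    # Rotate 90 degrees clockwise.
    return tuple(zip(*reversed(img)))

def invert(img):
    return tuple(tuple(255 - p for p in row) for row in img)

# Hypothetical transformation vocabulary; the paper's actual set may differ.
TRANSFORMS = {"flip_h": flip_h, "flip_v": flip_v, "rot90": rot90, "invert": invert}

def apply_sequence(img, names):
    """Apply the named transformations to img, in order."""
    for name in names:
        img = TRANSFORMS[name](img)
    return img

def brute_force_retrieve(source, target, max_len):
    """Return the first shortest sequence mapping source to target, or None."""
    for length in range(max_len + 1):
        for names in product(TRANSFORMS, repeat=length):
            if apply_sequence(source, names) == target:
                return list(names)
    return None

def search_space_size(num_transforms, max_len):
    """Number of candidate sequences of length 0..max_len."""
    return sum(num_transforms ** n for n in range(max_len + 1))

source = ((0, 10, 20), (30, 40, 50))
target = apply_sequence(source, ["rot90", "invert"])
found = brute_force_retrieve(source, target, max_len=3)
print(found, search_space_size(4, 3), search_space_size(4, 10))
```

With four transformations the candidate count grows from 85 sequences at a maximum length of 3 to 1,398,101 at a maximum length of 10, which illustrates why enumeration stops being practical for long sequences. Note also that several sequences can produce the same output (here, inverting before or after rotating), so a retrieved sequence need not match the one originally applied.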
This paper is part of the project I+D+i PID2020-118447RA-I00, funded by MCIN/AEI/10.13039/501100011033. The second author is supported by grant ACIF/2021/356 from “Programa I+D+i de la Generalitat Valenciana”.
Notes
1. For the sake of reproducibility, we release all the code related to these experiments in the following repository: https://github.com/emascandela/itsr.
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Mas-Candela, E., Ríos-Vila, A., Calvo-Zaragoza, J. (2022). A First Approach to Image Transformation Sequence Retrieval. In: Pinho, A.J., Georgieva, P., Teixeira, L.F., Sánchez, J.A. (eds) Pattern Recognition and Image Analysis. IbPRIA 2022. Lecture Notes in Computer Science, vol 13256. Springer, Cham. https://doi.org/10.1007/978-3-031-04881-4_26
DOI: https://doi.org/10.1007/978-3-031-04881-4_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-04880-7
Online ISBN: 978-3-031-04881-4
eBook Packages: Computer Science, Computer Science (R0)