ABSTRACT
In recent years, content-based image retrieval has largely benefited from representations extracted from deeper and more complex convolutional neural networks, which have become more effective but also more computationally demanding. Despite hardware acceleration, query processing times can easily be saturated by deep feature extraction in high-throughput or real-time embedded scenarios, and a trade-off between efficiency and effectiveness usually has to be accepted. In this work, we experiment with the recently proposed continuous neural networks defined by parametric ordinary differential equations, dubbed ODE-Nets, for the adaptive extraction of image representations. Since the hidden state of the network evolves continuously, we propose to approximate the exact feature extraction by taking an earlier, "near-in-time" hidden state as features at a reduced computational cost. To understand the potential and the limits of this approach, we also evaluate an ODE-only architecture in which we minimize the number of classical layers, delegating most of the representation learning process, and thus the feature extraction process, to the continuous part of the model. Preliminary experiments on standard benchmarks show that we can dynamically control the trade-off between efficiency and effectiveness of feature extraction at inference time by controlling the evolution of the continuous hidden state. Although ODE-only networks provide the finest-grained control over the effectiveness-efficiency trade-off, we observed that mixed architectures perform better than or comparably to standard residual networks in both the image classification and retrieval setups, while using fewer parameters and retaining the controllability of the trade-off.
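The "near-in-time" idea above can be sketched in a few lines: if the hidden state h(t) is obtained by numerically integrating h'(t) = f(h, t) from t = 0 to t = 1, stopping the solver early at some t' < 1 yields an approximate feature vector at a proportionally lower cost. The toy dynamics and fixed-step Euler solver below are illustrative assumptions, not the paper's learned network or adaptive solver.

```python
import numpy as np

def f(h, t):
    # Placeholder dynamics standing in for a learned convolutional block.
    return np.tanh(h) * (1.0 - t)

def ode_features(h0, t_end=1.0, n_steps=100):
    """Euler-integrate h'(t) = f(h, t) from t=0 to t_end.

    Compute cost grows linearly with n_steps, so truncating the
    integration interval directly reduces extraction time.
    """
    h = h0.astype(float)
    dt = t_end / n_steps
    t = 0.0
    for _ in range(n_steps):
        h = h + dt * f(h, t)
        t += dt
    return h

h0 = np.ones(4)
exact = ode_features(h0, t_end=1.0, n_steps=100)   # full integration
approx = ode_features(h0, t_end=0.5, n_steps=50)   # early "near-in-time" state

# The truncated state is cheaper (half the function evaluations) and
# stays close to the exact features, which is what makes it usable
# as an approximate representation for retrieval.
err = np.linalg.norm(exact - approx) / np.linalg.norm(exact)
```

In an actual ODE-Net, f would be a trained network block and the integration would typically use an adaptive solver, but the efficiency-effectiveness control knob is the same: how far along t the hidden state is evolved before being read out as features.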
Continuous ODE-defined Image Features for Adaptive Retrieval