Abstract
Forward-looking sonar (FLS) image registration is a key step in many underwater applications, such as target detection, ocean observation, and mapping. However, the low resolution and low signal-to-noise ratio of FLS images, together with the complex nonlinear transformation between images taken from two different viewpoints, make registration very challenging. To address this challenge, we propose a global perspective and local flow registration (GPLFR) method for FLS images. GPLFR consists of two networks: a regression correction network (RCNet) and a deformable network with iterative refinement of the residual (IRRDNet). For a given pair of FLS images, RCNet first estimates the global transformation parameters to achieve global registration, and IRRDNet then estimates the deformation (flow) field to achieve local alignment. Experimental results on real FLS image and 2D facial expression image registration tasks demonstrate the effectiveness and robustness of the proposed method.
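The two-stage idea (a global transformation followed by a local flow correction) can be sketched on pixel coordinates as follows. This is only an illustrative sketch, not the authors' implementation: `H_global` and `local_flow` are hypothetical stand-ins for the outputs of RCNet and IRRDNet, and mapping coordinates forward (rather than warping an image by backward sampling) is an assumption made to keep the example short.

```python
def apply_homography(H, x, y):
    """Map pixel (x, y) through a 3x3 perspective matrix H (nested lists)."""
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return u / w, v / w


def register_point(H, flow, x, y):
    """Coarse-to-fine mapping: global perspective first, then a local displacement."""
    gx, gy = apply_homography(H, x, y)
    dx, dy = flow(gx, gy)  # flow: any callable returning a displacement (dx, dy)
    return gx + dx, gy + dy


# Hypothetical stand-ins for the two network outputs.
H_global = [[1.0, 0.0, 5.0], [0.0, 1.0, -3.0], [0.0, 0.0, 1.0]]  # pure translation


def local_flow(x, y):
    return 0.5, -0.25  # constant residual displacement


print(register_point(H_global, local_flow, 10.0, 20.0))  # (15.5, 16.75)
```

The global stage removes the bulk of the misalignment, so the flow stage only has to model small residual displacements.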








References
Zitova B, Flusser J (2003) Image registration methods: a survey. Image Vis Comput 21(11):977–1000
Liu J, Gong J, Guo B, Zhang W (2017) A novel adjustment model for mosaicking low-overlap sweeping images. IEEE Trans Geosci Remot Sens 55(7):4089–4097
Goshtasby AA, Nikolov S (2007) Image fusion: advances in the state of the art. Inf Fusion 8(2):114–118
Zanetti M, Bruzzone L (2017) A theoretical framework for change detection based on a compound multiclass statistical model of the difference image. IEEE Trans Geosci Remot Sens 56(2):1129–1143
Vakalopoulou M, Karantzalos K, Komodakis N, Paragios N (2015) Simultaneous registration and change detection in multitemporal, very high resolution remote sensing data. In: Proceedings of the IEEE conference on computer vision and pattern recognition workshops, pp 61–69
Negahdaripour S, Firoozfam P, Sabzmeydani P (2005) On processing and registration of forward-scan acoustic video imagery. In: The 2nd Canadian conference on computer and robot vision (CRV’05), IEEE, pp 452–459
Li H, Dong Y, He X, Xie S, Luo J (2014) A sonar image mosaicing algorithm based on improved sift for usv. In: 2014 IEEE International conference on mechatronics and automation, IEEE, pp 1839–1843
Negahdaripour S, Aykin M, Sinnarajah S (2011) Dynamic scene analysis and mosaicing of benthic habitats by fs sonar imaging-issues and complexities. In: OCEANS’11 MTS/IEEE KONA, IEEE, pp 1–7
Yang Z, Dan T, Yang Y (2018) Multi-temporal remote sensing image registration using deep convolutional features. IEEE Access 6:38544–38555
Balakrishnan G, Zhao A, Sabuncu MR, Guttag J, Dalca AV (2019) Voxelmorph: a learning framework for deformable medical image registration. IEEE Trans Medical Imag 38(8):1788–1800
Zhao S, Dong Y, Chang EI, Xu Y, et al. (2019) Recursive cascaded networks for unsupervised medical image registration. In: Proceedings of the IEEE international conference on computer vision, pp 10600–10610
de Vos BD, Berendsen FF, Viergever MA, Staring M, Išgum I (2017) End-to-end unsupervised deformable image registration with a convolutional neural network. In: Deep learning in medical image analysis and multimodal learning for clinical decision support, Springer, pp 204–212
Galceran E, Djapic V, Carreras M, Williams DP (2012) A real-time underwater object detection algorithm for multi-beam forward looking sonar. IFAC Proceed Vol 45(5):306–311
Quidu I, Jaulin L, Bertholom A, Dupas Y (2012) Robust multitarget tracking in forward-looking sonar image sequences using navigational data. IEEE J Ocean Eng 37(3):417–430
Clark DE, Bell J (2005) Bayesian multiple target tracking in forward scan sonar images using the phd filter. IEE Proceed-Radar, Sonar Navigat 152(5):327–334
Petillot Y, Ruiz IT, Lane DM (2001) Underwater vehicle obstacle avoidance and path planning using a multi-beam forward looking sonar. IEEE J Ocean Eng 26(2):240–251
Hurtós N, Ribas D, Cufí X, Petillot Y, Salvi J (2015) Fourier-based registration for robust forward-looking sonar mosaicing in low-visibility underwater environments. J Field Robot 32(1):123–151
Hurtós N, Nagappa S, Cufí X, Petillot Y, Salvi J (2013) Evaluation of registration methods on two-dimensional forward-looking sonar imagery. In: 2013 MTS/IEEE OCEANS-Bergen, IEEE, pp 1–8
Hurtós N, Petillot Y, Salvi J, et al. (2012) Fourier-based registrations for two-dimensional forward-looking sonar image mosaicing. In: 2012 IEEE/RSJ International conference on intelligent robots and systems, IEEE, pp 5298–5305
Zhang J, Sohel F, Bian H, Bennamoun M, An S (2016) Forward-looking sonar image registration using polar transform. In: OCEANS 2016 MTS/IEEE Monterey, IEEE, pp 1–6
Aykin M, Negahdaripour S (2012) On feature extraction and region matching for forward scan sonar imaging. In: 2012 Oceans, IEEE, pp 1–9
Sekkati H, Negahdaripour S (2007) 3-d motion estimation for positioning from 2-d acoustic video imagery. In: Iberian conference on Pattern Recognition and Image Analysis, Springer, pp 80–88
Hurtós Vilarnau N, et al. (2014) Forward-looking sonar mosaicing for underwater environments
Hurtós N, Palomeras N, Nagappa S, Salvi J (2013) Automatic detection of underwater chain links using a forward-looking sonar. In: 2013 MTS/IEEE OCEANS-Bergen, IEEE, pp 1–7
Guo Y, Wei L, Xu X (2020) A sonar image segmentation algorithm based on quantum-inspired particle swarm optimization and fuzzy clustering. Neural Comput Appl 32(22):16775–16782
Zhao S, Lau T, Luo J, Eric I, Chang C, Xu Y (2019) Unsupervised 3d end-to-end medical image registration with volume tweening network. IEEE J Biomed Health Infor 24(5):1394–1404
Hur J, Roth S (2019) Iterative residual refinement for joint optical flow and occlusion estimation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 5754–5763
Brown LG (1992) A survey of image registration techniques. ACM Comput Surveys (CSUR) 24(4):325–376
Lowe DG (1999) Object recognition from local scale-invariant features. In: Proceedings of the seventh IEEE international conference on computer vision, IEEE, vol 2, pp 1150–1157
Bay H, Tuytelaars T, Van Gool L (2006) Surf: Speeded up robust features. In: European conference on computer vision, Springer, pp 404–417
Rublee E, Rabaud V, Konolige K, Bradski G (2011) Orb: An efficient alternative to sift or surf. In: 2011 International conference on computer vision, IEEE, pp 2564–2571
Fischler MA, Bolles RC (1981) Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Communications ACM 24(6):381–395
Moisan L, Moulon P, Monasse P (2012) Automatic homographic registration of a pair of images, with a contrario elimination of outliers. Image Process Line 2:56–73
Raguram R, Chum O, Pollefeys M, Matas J, Frahm JM (2012) Usac: a universal framework for random sample consensus. IEEE Trans Patt Anal Mach Intell 35(8):2022–2038
Tao W, Zhao J, Liu J, Zhang H (2010) Study on the side-scan sonar image matching navigation based on surf. In: 2010 International conference on electrical and control engineering, IEEE, pp 2181–2184
Gai S, Xu X, Xiong B (2020) Paper currency defect detection algorithm using quaternion uniform strength. Neural Comput Appl, pp 1–18
Viola P, Wells WM III (1997) Alignment by maximization of mutual information. International J Comput Vis 24(2):137–154
Maes F, Collignon A, Vandermeulen D, Marchal G, Suetens P (1997) Multimodality image registration by maximization of mutual information. IEEE Transact Med Imag 16(2):187–198
Wang G, Xu X, Jiang X, Ding S (2016) Medical image registration based on self-adapting pulse-coupled neural networks and mutual information. Neur Comput Appl 27(7):1917–1926
Briechle K, Hanebeck UD (2001) Template matching using fast normalized cross correlation. Optical Pattern Recognition XII. Int Soci Optic Phot 4387:95–102
Sarvaiya JN, Patnaik S, Bombaywala S (2009) Image registration by template matching using normalized cross-correlation. In: 2009 International conference on advances in computing, control, and telecommunication technologies, IEEE, pp 819–822
Das A, Bhattacharya M (2011) Affine-based registration of CT and MR modality images of human brain using multiresolution approaches: comparative study on genetic algorithm and particle swarm optimization. Neural Comput Appl 20(2):223–237
Song S, Herrmann JM, Si B, Liu K, Feng X (2017) Two-dimensional forward-looking sonar image registration by maximization of peripheral mutual information. Int J Adv Robot Sys 14(6):1729881417746270
Valdenegro-Toro M (2017) Improving sonar image patch matching via deep learning. In: 2017 European conference on mobile robots (ECMR), IEEE, pp 1–6
Sarnel H, Senol Y (2011) Accurate and robust image registration based on radial basis neural networks. Neural Comput Appl 20(8):1255–1262
Ot P, dos Santos MM, Drews PLJ, da Costa Botelho SS, et al. (2017) Forward looking sonar scene matching using deep learning. In: 2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA), IEEE, pp 574–579
Cheng X, Zhang L, Zheng Y (2018) Deep similarity learning for multimodal medical images. Comput Method Biomech Biomed Eng: Imag Visual 6(3):248–252
DeTone D, Malisiewicz T, Rabinovich A (2016) Deep image homography estimation. arXiv:1606.03798
Chee E, Wu Z (2018) Airnet: Self-supervised affine registration for 3d medical images using neural networks. arXiv:1810.02583
Sokooti H, De Vos B, Berendsen F, Lelieveldt BP, Išgum I, Staring M (2017) Nonrigid image registration using multi-scale 3d convolutional neural networks. In: International conference on medical image computing and computer-assisted intervention, Springer, pp 232–239
Sokooti H, de Vos B, Berendsen F, Ghafoorian M, Yousefi S, Lelieveldt BP, Isgum I, Staring M (2019) 3d convolutional neural networks image registration based on efficient supervised learning from artificial deformations. arXiv:1908.10235
Fu Y, Lei Y, Wang T, Curran WJ, Liu T, Yang X (2020) Deep learning in medical image registration: a review. Phys Medic Biol 65:20TR01
Jaderberg M, Simonyan K, Zisserman A, et al. (2015) Spatial transformer networks. In: Advances in neural information processing systems, pp 2017–2025
Ronneberger O, Fischer P, Brox T (2015) U-net: Convolutional networks for biomedical image segmentation. In: International conference on medical image computing and computer-assisted intervention, Springer, pp 234–241
Zou W, Luo Y, Cao W, He Z, He Z (2021) A cascaded registration network rcinet with segmentation mask. Neural Comput Appl, pp 1–17
Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556
Von Gioi RG, Jakubowicz J, Morel JM, Randall G (2012) Lsd: a line segment detector. Image Process Line 2:35–55
Liu R, Lehman J, Molino P, Petroski Such F, Frank E, Sergeev A, Yosinski J (2018) An intriguing failing of convolutional neural networks and the coordconv solution. Adv Neural Infor Process Sys 31:9605–9616
Handa A, Bloesch M, Pătrăucean V, Stent S, McCormac J, Davison A (2016) gvnn: Neural network library for geometric computer vision. In: European conference on computer vision, Springer, pp 67–82
Cheng H, Gupta K (1989) An historical note on finite rotations. J Appl Mech 56(1):139–145
Gallego G, Yezzi A (2015) A compact formula for the derivative of a 3-d rotation in exponential coordinates. J Math Imag Vis 51(3):378–384
Langner O, Dotsch R, Bijlstra G, Wigboldus DH, Hawk ST, Van Knippenberg A (2010) Presentation and validation of the radboud faces database. Cognit Emot 24(8):1377–1388
Kingma DP, Ba J (2014) Adam: A method for stochastic optimization. arXiv:1412.6980
Chen J, Li Y, Du Y, Frey EC (2020) Generating anthropomorphic phantoms using fully unsupervised deformable image registration with convolutional neural networks. Med Phys 47(12):6366–6380
Saad ZS, Glen DR, Chen G, Beauchamp MS, Desai R, Cox RW (2009) A new method for improving functional-to-structural MRI alignment using local pearson correlation. Neuroimage 44(3):839–848
Wang Z, Bovik AC, Sheikh HR, Simoncelli EP (2004) Image quality assessment: from error visibility to structural similarity. IEEE Trans Image process 13(4):600–612
Guo X, Xu Z, Lu Y, Pang Y (2005) An application of fourier-mellin transform in image registration. In: The Fifth international conference on computer and information technology (CIT’05), IEEE, pp 619–623
Chen X, Meng Y, Zhao Y, Williams R, Vallabhaneni SR, Zheng Y (2021) Learning unsupervised parameter-specific affine transformation for medical images registration. In: International conference on medical image computing and computer-assisted intervention, Springer, pp 24–34
Mok TC, Chung AC (2020) Large deformation diffeomorphic image registration with laplacian pyramid networks. In: International conference on medical image computing and computer-assisted intervention, Springer, pp 211–221
Kim B, Han I, Ye JC (2021) Diffusemorph: Unsupervised deformable image registration along continuous trajectory using diffusion models. arXiv:2112.05149
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
Fig. 9 shows the projection model of the sonar in the sonar-based spherical coordinate system. The FLS projects a 3D scene point \({\mathbf {P}}\) onto the 2D image plane at the projection point \({\mathbf {P}}_{\mathrm {s}}\). The point \({\mathbf {P}}\) is defined as:
where \(\left[ \begin{array}{lll}x&y&z\end{array}\right] ^{T}\) are the Cartesian coordinates of point \({\mathbf {P}}\), and \(\left[ \begin{array}{lll}R&\theta&\varphi \end{array}\right] ^{T}\) are its range, azimuth, and elevation angle in the spherical coordinate system.
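As a concrete illustration of how the two coordinate sets relate, the sketch below converts between range, azimuth, elevation and Cartesian coordinates. The axis convention used here (y along the sonar boresight, z up, so that x = R cos φ sin θ, y = R cos φ cos θ, z = R sin φ) is an assumption that is common in the FLS literature, not something recoverable from the text above, and the function names are hypothetical.

```python
import math


def spherical_to_cartesian(R, theta, phi):
    """Range R, azimuth theta, elevation phi (radians) -> (x, y, z).

    Assumed convention: y is the boresight axis, z points up.
    """
    x = R * math.cos(phi) * math.sin(theta)
    y = R * math.cos(phi) * math.cos(theta)
    z = R * math.sin(phi)
    return x, y, z


def cartesian_to_spherical(x, y, z):
    """Inverse of spherical_to_cartesian for R > 0 and |phi| < pi/2."""
    R = math.sqrt(x * x + y * y + z * z)
    theta = math.atan2(x, y)  # azimuth is measured from the y (boresight) axis
    phi = math.asin(z / R)
    return R, theta, phi


x, y, z = spherical_to_cartesian(12.0, 0.4, -0.2)
print(cartesian_to_spherical(x, y, z))  # recovers (12.0, 0.4, -0.2) up to rounding
```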
The projection point \({\mathbf {P}}_{\mathrm {s}}\) is defined as:
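A small numerical sketch of the projection follows. It assumes the usual FLS imaging model, in which the image retains range and azimuth but not elevation, so that the image point is [R sin θ, R cos θ]; this convention and the helper names are assumptions made for illustration. The example shows that two scene points with the same range and azimuth but different elevation land on the same image point.

```python
import math


def point_at(R, theta, phi):
    """3D point at range R, azimuth theta, elevation phi (assumed axis convention)."""
    return (R * math.cos(phi) * math.sin(theta),
            R * math.cos(phi) * math.cos(theta),
            R * math.sin(phi))


def project_to_image(x, y, z):
    """Keep range and azimuth, discard elevation: returns (R sin(theta), R cos(theta))."""
    R = math.sqrt(x * x + y * y + z * z)
    theta = math.atan2(x, y)
    return R * math.sin(theta), R * math.cos(theta)


low = project_to_image(*point_at(10.0, 0.3, 0.0))    # on the zero-elevation plane
high = project_to_image(*point_at(10.0, 0.3, 0.15))  # same range/azimuth, raised
print(low, high)  # the two image points coincide up to rounding
```

This loss of elevation is why a single image point does not determine a unique 3D point.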
Now suppose that the sonar undergoes rigid-body motion. The coordinates \({\mathbf {P}}\) and \({\mathbf {P}}^{\prime }\) of the same scene point in the two views then satisfy the following transformation relationship:
\[{\mathbf {P}}^{\prime } = {\mathbf {R}}{\mathbf {P}} + {\mathbf {t}}\]
where \({\mathbf {R}}\) is the \(3 \times 3\) rotation matrix and \({\mathbf {t}}\) is the 3D translation vector.
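The rigid-motion relation \({\mathbf {P}}^{\prime } = {\mathbf {R}}{\mathbf {P}} + {\mathbf {t}}\) can be checked numerically with a few lines of plain Python. The rotation (10 degrees about the z axis), the translation, and the two scene points below are arbitrary illustrative values; the check confirms that such a motion preserves the distance between points.

```python
import math


def rot_z(angle):
    """Rotation matrix for `angle` radians about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rigid_transform(Rm, t, P):
    """Return P' = Rm P + t for 3-vectors given as lists."""
    return [sum(Rm[i][j] * P[j] for j in range(3)) + t[i] for i in range(3)]


def distance(A, B):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(A, B)))


Rm, t = rot_z(math.radians(10.0)), [0.5, -0.2, 0.1]  # illustrative motion
P1, P2 = [1.0, 8.0, -2.0], [3.0, 6.0, -1.5]          # illustrative scene points
Q1, Q2 = rigid_transform(Rm, t, P1), rigid_transform(Rm, t, P2)
print(distance(P1, P2), distance(Q1, Q2))  # equal up to rounding
```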
Let \({\mathbf {n}}=\left[ n_{x}, n_{y}, n_{z}\right] ^{T}\) be the scaled normal vector derived from the plane equation \(Z=Z_{o}+\zeta _{x} X+\zeta _{y} Y\), scaled so that \({\mathbf {n}} \cdot {\mathbf {P}}=1\) for every point \({\mathbf {P}}\) on the plane; then [22]
where
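The role of the condition \({\mathbf {n}} \cdot {\mathbf {P}}=1\) can be seen numerically: for points on the plane, \({\mathbf {R}}{\mathbf {P}}+{\mathbf {t}} = {\mathbf {R}}{\mathbf {P}}+{\mathbf {t}}({\mathbf {n}} \cdot {\mathbf {P}}) = ({\mathbf {R}}+{\mathbf {t}}{\mathbf {n}}^{T}){\mathbf {P}}\), so the rigid motion folds into a single \(3 \times 3\) matrix. In the sketch below, the closed form of \({\mathbf {n}}\) follows from rearranging the plane equation (it requires \(Z_{o} \ne 0\)); the plane parameters, motion, and function names are illustrative assumptions.

```python
import math


def plane_normal(z0, zeta_x, zeta_y):
    """Scaled normal n of Z = z0 + zeta_x*X + zeta_y*Y with n . P = 1 (needs z0 != 0)."""
    return [-zeta_x / z0, -zeta_y / z0, 1.0 / z0]


def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]


def plane_induced_matrix(Rm, t, n):
    """R + t n^T: the rigid motion folded into one matrix for points on the plane."""
    return [[Rm[i][j] + t[i] * n[j] for j in range(3)] for i in range(3)]


z0, zx, zy = 4.0, 0.1, -0.2            # illustrative plane parameters
n = plane_normal(z0, zx, zy)
X, Y = 1.0, 2.0
P = [X, Y, z0 + zx * X + zy * Y]       # a point on the plane

c, s = math.cos(0.2), math.sin(0.2)    # illustrative rotation about z
Rm = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
t = [0.3, -0.1, 0.05]

direct = [r + ti for r, ti in zip(matvec(Rm, P), t)]  # R P + t
folded = matvec(plane_induced_matrix(Rm, t, n), P)    # (R + t n^T) P
print(direct, folded)  # identical up to rounding
```

The identity holds only for points on the plane; off-plane points have \({\mathbf {n}} \cdot {\mathbf {P}} \ne 1\), and the two expressions differ.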
Using Eq. 18, we obtain
which can be rewritten as:
Using Eq. 20, we obtain
and then
where the projection point of point \({\mathbf {P}}^{\prime }\) on the 2D image plane is:
Finally, sonar image points satisfy the transformation [6]:
where
Although this has the form of an affine model, the elements of \({\mathbf {H}}\) vary across the image because they depend on the elevation angles \(\varphi \) and \(\varphi ^{\prime }\). Sonar images from two different viewpoints are therefore related by a complex nonlinear transformation [22].
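The dependence on elevation can be illustrated with a short numerical experiment. Two scene points that share range and azimuth, and therefore share an image point in the first view, are moved by the same sonar motion and reprojected; they end up at different image points in the second view, so no single fixed 2D matrix acting on image coordinates alone can describe the motion. The axis convention, the projection \([R \sin \theta , R \cos \theta ]\), and the 1 m vertical translation are assumptions chosen for illustration.

```python
import math


def point_at(R, theta, phi):
    """3D point at range R, azimuth theta, elevation phi (assumed axis convention)."""
    return [R * math.cos(phi) * math.sin(theta),
            R * math.cos(phi) * math.cos(theta),
            R * math.sin(phi)]


def image_point(P):
    """Sonar image point: range and azimuth are kept, elevation is lost."""
    x, y, z = P
    R = math.sqrt(x * x + y * y + z * z)
    theta = math.atan2(x, y)
    return (R * math.sin(theta), R * math.cos(theta))


heave = [0.0, 0.0, 1.0]  # hypothetical pure translation along z, no rotation

flat = point_at(10.0, 0.3, 0.0)    # elevation 0
raised = point_at(10.0, 0.3, 0.3)  # same range and azimuth, elevation 0.3 rad

before = (image_point(flat), image_point(raised))
after = tuple(image_point([p + d for p, d in zip(P, heave)]) for P in (flat, raised))

print(before)  # both points share one image location in the first view
print(after)   # they separate in the second view
```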
Cite this article
Huang, P., Guo, C., Fu, X. et al. GPLFR—Global perspective and local flow registration-for forward-looking sonar images. Neural Comput & Applic 34, 12663–12679 (2022). https://doi.org/10.1007/s00521-022-07113-8
DOI: https://doi.org/10.1007/s00521-022-07113-8