Abstract
In object detection, the Intersection over Union (\({\mathrm{IoU}}\)) is the most popular criterion for validating the performance of an object detector on a test dataset, or for comparing the performance of several object detectors on a common dataset. Computing this criterion requires determining the overlapping area between two bounding boxes. If the boxes are axis-aligned (horizontal), the exact calculation of their overlapping area is simple. If they are rotated (oriented), the exact calculation is laborious. Many rotated-object detectors have therefore been developed using heuristics to approximate the \({\mathrm{IoU}}\) between two rotated bounding boxes. We show, through counterexamples, that these heuristics are unreliable in the sense that they can lead to false positive or false negative detections, which can bias comparative studies between object detectors. In this paper, we develop a method to calculate the exact value of \({{\mathrm{IoU}}}\) between two rotated bounding boxes. Moreover, we present an \((\epsilon ,\alpha )\)-estimator \(\widehat{{\mathrm{IoU}}}\) of \({{\mathrm{IoU}}}\) that satisfies \({\mathbf {Pr}} (|\widehat{{\mathrm{IoU}}} -{\mathrm{IoU}}| \le {\mathrm{IoU}}\epsilon )\ge 1-\alpha \). We also generalize the exact computation method and the \((\epsilon ,\alpha )\)-estimator of \({{\mathrm{IoU}}}\) to three-dimensional bounding boxes. Finally, we carry out numerical experiments in \({\mathbb {R}}^2\) and \({\mathbb {R}}^3\) to test the exact method of calculating the \({{\mathrm{IoU}}}\) and to compare the efficiency of the \((\epsilon ,\alpha )\)-estimator with that of heuristic estimates of \({{\mathrm{IoU}}}\). The numerical study shows that the \((\epsilon ,\alpha )\)-estimator is distinguished by both precision and simplicity of implementation, while the exact calculation method is distinguished by both precision and speed.
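To make the two quantities in the abstract concrete, the following Python sketch computes the \({\mathrm{IoU}}\) of two rotated rectangles in \({\mathbb {R}}^2\) in two ways. The abstract does not describe the paper's algorithms, so this is not the author's method: the exact value is obtained here with Sutherland-Hodgman convex polygon clipping and the shoelace area formula, and the approximate value with a plain fixed-sample Monte Carlo estimate, which carries no \((\epsilon ,\alpha )\) guarantee. The box format `(cx, cy, w, h, theta)` and all function names are assumptions made for illustration.

```python
import math
import random

def rect_corners(cx, cy, w, h, theta):
    """Corners (counter-clockwise) of a w-by-h box centred at (cx, cy), rotated by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    half = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    return [(cx + x * c - y * s, cy + x * s + y * c) for x, y in half]

def cross(a, b, p):
    """Signed area test: >= 0 when p lies on or to the left of the directed edge a -> b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def polygon_area(poly):
    """Shoelace formula; positive for counter-clockwise vertex order."""
    n = len(poly)
    return 0.5 * sum(poly[i][0] * poly[(i + 1) % n][1]
                     - poly[(i + 1) % n][0] * poly[i][1] for i in range(n))

def clip(subject, clipper):
    """Sutherland-Hodgman: clip 'subject' by each half-plane of the convex, CCW 'clipper'."""
    output = subject
    n = len(clipper)
    for i in range(n):
        a, b = clipper[i], clipper[(i + 1) % n]
        current, output = output, []
        for j in range(len(current)):
            p, q = current[j], current[(j + 1) % len(current)]
            sp, sq = cross(a, b, p), cross(a, b, q)
            if sp >= 0:                      # p is inside this half-plane
                output.append(p)
            if sp * sq < 0:                  # edge p -> q crosses the clip line
                t = sp / (sp - sq)
                output.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
        if not output:                       # nothing left: the boxes do not overlap
            return []
    return output

def rotated_iou(box_a, box_b):
    """Exact IoU (up to floating-point error) of two rotated boxes (cx, cy, w, h, theta)."""
    pa, pb = rect_corners(*box_a), rect_corners(*box_b)
    inter_poly = clip(pa, pb)
    inter = abs(polygon_area(inter_poly)) if len(inter_poly) >= 3 else 0.0
    return inter / (polygon_area(pa) + polygon_area(pb) - inter)

def monte_carlo_iou(box_a, box_b, n_samples=50_000, seed=0):
    """Fixed-sample Monte Carlo estimate of IoU (illustrative; no accuracy guarantee)."""
    rng = random.Random(seed)
    pa, pb = rect_corners(*box_a), rect_corners(*box_b)
    xs = [x for x, _ in pa + pb]
    ys = [y for _, y in pa + pb]

    def inside(poly, p):
        return all(cross(poly[i], poly[(i + 1) % len(poly)], p) >= 0
                   for i in range(len(poly)))

    inter = union = 0
    for _ in range(n_samples):
        p = (rng.uniform(min(xs), max(xs)), rng.uniform(min(ys), max(ys)))
        in_a, in_b = inside(pa, p), inside(pb, p)
        inter += in_a and in_b
        union += in_a or in_b
    return inter / union if union else 0.0
```

For example, a square of side 2 and the same square rotated by 45 degrees about its centre overlap in a regular octagon, so `rotated_iou((0, 0, 2, 2, 0), (0, 0, 2, 2, math.pi / 4))` should be close to \(1/\sqrt{2}\), and `monte_carlo_iou` on the same pair should agree to roughly two decimal places.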
Ethics declarations
Conflict of interest
The author declares that there is no conflict of interest.
About this article
Cite this article
Zaïdi, A. Accurate IoU computation for rotated bounding boxes in \({\mathbb {R}}^2\) and \({\mathbb {R}}^3\). Machine Vision and Applications 32, 114 (2021). https://doi.org/10.1007/s00138-021-01238-x
DOI: https://doi.org/10.1007/s00138-021-01238-x