DOI: 10.1145/3595916.3626456
Research article

Monocular 3D Pose Estimation of Very Small Airplane in the Air

Published: 01 January 2024

Abstract

This paper proposes a novel pose estimation algorithm designed specifically for maneuvering airplanes in flight. The algorithm has two stages. The first stage performs semantic segmentation on a monocular input image of a flying airplane; the entire captured area serves as the feature points for the airplane, which is typically small in the image. The second stage estimates the 3D pose from the segmented image using projective registration. Because airplanes have unique characteristics and airplane-specific datasets are scarce, a custom dataset is generated for the experiments using Unreal Engine 4, a 3D computer graphics game engine known for its realistic simulations. Experimental results demonstrate that the algorithm is suitable for 3D pose estimation of airplanes and that it provides valuable information for studying autonomous control of airplanes.
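
As a rough illustration of what a registration stage of this kind might look like, the following Python sketch projects a point-cloud stand-in for an airplane model through a pinhole camera and picks, from a coarse grid of candidate yaw angles, the one whose projected silhouette best overlaps a given segmentation mask (scored by intersection over union). This is not the authors' implementation: the abstract does not specify the registration procedure, and the model shape, camera parameters, single-angle search, and all function names (`airplane_points`, `project_silhouette`, `iou`, `estimate_yaw`) are assumptions made for illustration.

```python
import math

# Assumed pinhole intrinsics (hypothetical values, not taken from the paper).
FOCAL, CX, CY = 500.0, 320.0, 240.0
# Assumed fixed model position in camera coordinates (x right, y down, z forward).
TRANSLATION = (0.0, 0.0, 60.0)
STEP = 0.25  # spacing between sampled model points


def airplane_points():
    """Coarse point-cloud stand-in for a 3D airplane model (illustrative only)."""
    fuselage = [(-4.0 + i * STEP, 0.0, 0.0) for i in range(33)]
    wings = [(0.5, 0.0, -5.0 + i * STEP) for i in range(41)]
    tail_fin = [(-4.0, -i * STEP, 0.0) for i in range(7)]
    return fuselage + wings + tail_fin


def project_silhouette(points, yaw_deg):
    """Rotate points about the camera's vertical axis, translate, and project to pixels."""
    a = math.radians(yaw_deg)
    c, s = math.cos(a), math.sin(a)
    tx, ty, tz = TRANSLATION
    pixels = set()
    for x, y, z in points:
        xc = c * x + s * z + tx
        yc = y + ty
        zc = -s * x + c * z + tz  # stays positive because tz exceeds the model extent
        u = int(round(FOCAL * xc / zc + CX))
        v = int(round(FOCAL * yc / zc + CY))
        pixels.add((u, v))
    return pixels


def iou(mask_a, mask_b):
    """Intersection over union of two pixel sets."""
    union = mask_a | mask_b
    return len(mask_a & mask_b) / len(union) if union else 0.0


def estimate_yaw(mask, points, step_deg=10):
    """Exhaustive search over candidate yaw angles; returns (best_angle, best_iou)."""
    best_angle, best_score = None, -1.0
    for k in range(0, 360, step_deg):
        score = iou(mask, project_silhouette(points, float(k)))
        if score > best_score:
            best_angle, best_score = float(k), score
    return best_angle, best_score


if __name__ == "__main__":
    model = airplane_points()
    observed = project_silhouette(model, 30.0)  # synthetic stand-in for a segmentation mask
    angle, score = estimate_yaw(observed, model)
    print(angle, score)
```

A real system would search all six degrees of freedom and would likely refine the best grid candidate with a continuous optimizer; the exhaustive one-angle search here only shows the project-and-compare loop.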


Published In

MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia
December 2023
745 pages
ISBN: 9798400702051
DOI: 10.1145/3595916

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

1. 3D pose estimation
2. Dataset
3. Monocular camera
4. Small airplane

Qualifiers

• Research-article
• Research
• Refereed limited

Conference

MMAsia '23: ACM Multimedia Asia
December 6 - 8, 2023
Tainan, Taiwan

Acceptance Rates

Overall Acceptance Rate: 59 of 204 submissions, 29%
