research-article

An Effective Imaging System for 3D Detection of Occluded Objects

Authors:

Kaile Zhang

ICIGP '21: Proceedings of the 2021 4th International Conference on Image and Graphics Processing

Pages 20 - 30

https://doi.org/10.1145/3447587.3447591

Published: 04 June 2021

Abstract

Detecting occluded objects is a challenging task in computer vision. To address this problem, this paper proposes an effective light field imaging system for 3D detection of occluded objects, which integrates digital refocusing methods that image occluded objects with a deep-learning-based method that locates object positions using defocus cues. A camera-array-based integral imaging system can provide focal stack images, which render occluded objects more clearly and attenuate foreground occlusion. Based on the observation that recognition probability is related to object clarity, as well as to the focal length of the images, recognition-probability-based defocus cues are proposed to locate object depth. A hierarchical object localization process is applied to the refocused image stacks to coarsely locate object depth from the detected probabilities; a subsequent gradient-based, fine-grained defocus response process further refines the depth accuracy. With the depths from the defocus cues and the detected locations from the neural model, the proposed algorithm achieves 3D object detection under partial occlusion. Furthermore, a parallel computation framework is proposed to accelerate the whole detection process. Real-world experiments show the robust performance of the proposed 3D occluded object detection algorithm.
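
The coarse-to-fine depth localization described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the `detect_prob` callable (a stand-in for the neural detector), the `coarse_step` parameter, and the choice of mean squared gradient magnitude as the focus response are all assumptions made for illustration.

```python
import numpy as np


def gradient_focus(patch):
    """Mean squared gradient magnitude of a patch; larger means sharper.

    Assumed focus measure, chosen only for illustration.
    """
    gy, gx = np.gradient(patch.astype(float))
    return float(np.mean(gx ** 2 + gy ** 2))


def locate_depth(stack, depths, detect_prob, box, coarse_step=4):
    """Coarse-to-fine depth search over a refocused image stack.

    stack:       array of shape (N, H, W), one refocused image per depth
    depths:      array of shape (N,), focal depth of each slice
    detect_prob: hypothetical callable, image -> detection probability
    box:         (y0, y1, x0, x1) region of the detected object
    """
    n = len(stack)

    # Coarse pass: score every coarse_step-th slice with the detector
    # and keep the slice with the highest detection probability.
    coarse_idx = list(range(0, n, coarse_step))
    probs = [detect_prob(stack[i]) for i in coarse_idx]
    best = coarse_idx[int(np.argmax(probs))]

    # Fine pass: evaluate the gradient-based focus response on every
    # slice in the neighbourhood of the coarse estimate.
    lo = max(0, best - coarse_step)
    hi = min(n, best + coarse_step + 1)
    y0, y1, x0, x1 = box
    focus = [gradient_focus(stack[i, y0:y1, x0:x1]) for i in range(lo, hi)]
    fine = lo + int(np.argmax(focus))
    return float(depths[fine])
```

In this sketch the detector is evaluated only on a sparse subset of slices, and the cheaper gradient response is used for the dense search around the best coarse slice; the depth returned is the focal depth of the sharpest slice in that neighbourhood.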

Cited By

Chang S, Siu M, Li H (2023). Development of a Fuzzy Logic Controller for Autonomous Navigation of Building Inspection Robots in Unknown Environments. Journal of Computing in Civil Engineering 37:4. Online publication date: Jul-2023. https://doi.org/10.1061/JCCEE5.CPENG-5060
Kaur J, Singh W (2023). A systematic review of object detection from images using deep learning. Multimedia Tools and Applications 83:4 (12253-12338). Online publication date: 24-Jun-2023. https://dl.acm.org/doi/10.1007/s11042-023-15981-y

Published In

ICIGP '21: Proceedings of the 2021 4th International Conference on Image and Graphics Processing

January 2021

231 pages

ISBN:9781450389105

DOI:10.1145/3447587

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

Shanghai Rising Stars of Medical Talent Youth Development Program
Youth Program of Zhejiang Provincial Natural Science Foundation of China
Shanghai Jiao Tong University Biomedical Engineering Cross Research Foundation
National Key Research Development Program of China
National Natural Science Fund of China

Conference

ICIGP 2021

ICIGP 2021: 2021 The 4th International Conference on Image and Graphics Processing

January 1 - 3, 2021

Sanya, China
