Article

Encoder-Decoder based Neural Network for Perspective Estimation

Authors:
Yutong Wang, Sungkyunkwan University and Northwestern Polytechnical University, Republic of Korea
Qi Zhang, Northwestern Polytechnical University, China
Joongkyu Kim, Sungkyunkwan University, Republic of Korea
Huifang Li, Northwestern Polytechnical University, China

IPMV '21: Proceedings of the 2021 3rd International Conference on Image Processing and Machine Vision
May 2021
Pages 42–46
https://doi.org/10.1145/3469951.3469967

Published: 21 August 2021

Published in

IPMV '21: Proceedings of the 2021 3rd International Conference on Image Processing and Machine Vision
May 2021
87 pages
ISBN:9781450390040
DOI:10.1145/3469951

Copyright © 2021 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 21 August 2021
Author Tags
Perspective distortion
Perspective estimation network
Perspective value
Qualifiers
- Article
- Research
- Refereed limited

Article Metrics
- Total Citations: 0
- Total Downloads: 59
- Downloads (Last 12 months): 11
- Downloads (Last 6 weeks): 0
