Skip to main content

Advertisement

Log in

On automatic camera shooting systems via PTZ control and DNN-based visual sensing

  • Original Research Paper
  • Published:
Intelligent Service Robotics Aims and scope Submit manuscript

Abstract

In this paper, we propose a framework of pan-tilt-zoom control and DNN-based visual sensing to establish an automatic camera shooting system that can partially be capable to autonomously execute photograph and videoing tasks in the real-world application scenarios, including human figure photography in commercial and entertaining sites. Based on the technique of real-time visual detection and tracking with the techniques of Kalman filter and re-identification, we propose a camera control system with a continuous composition of lens, based on the photographic and videoing strategies, including the atomic rules of shots, and the trajectory planning of the camera, to generate the proportional–integral–derivative controller to the pan-tilt. We design and produce an artificial intelligence (AI) automatic camera for lively photography and clip videoing in open circumstances. We illustrate the efficiency of the proposed methods by simulation of photograph cropping and emulation with the models of AI automatic camera that we developed.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig.6
Fig.7
Fig. 8
Fig.9
Fig. 10
Fig. 11
Fig. 12
Fig.13
Fig. 14
Fig. 15

Similar content being viewed by others

References

  1. Joubert N, Goldman DB, Berthouzoz F, Roberts M, Landay JA, Hanrahan P (2016) Towards a drone cinematographer: guiding quadrotor cameras using visual composition principles. In: SIGGRAPH Asia

  2. Levinson J, Thrun S (2013) Automatic online calibration of cameras and lasers. Robot Sci Syst 2(7)

  3. Lino C, Christie M (2015) Intuitive and efficient camera control with the toric space. ACM Trans Gr 34(4):1–12. https://doi.org/10.1145/2766965

    Article  Google Scholar 

  4. Xie K, Yang H, Huang S (2018) Creating and chaining camera moves for quadrotor videography. ACM Trans Gr 37(4):1–13. https://doi.org/10.1145/3197517.3201284

    Article  Google Scholar 

  5. Dinh T, Yu Q, Medioni G (2009) Real time tracking using an active Pan-Tilt-Zoom network camera. In: IEEE/RSJ International Conference on IEEE, pp. 3786–3793 (2009) https://doi.org/10.1109/IROS.2009.5353915

  6. Joubert N, Roberts M, Troung A et al (2015) An interactive tool for designing quadrotor camera shots. SIGGRAGH Asia 34(6):238. https://doi.org/10.1145/2816795.2818106

    Article  Google Scholar 

  7. Sampedro C, Martinez C, Chauhan A, Campoy P (2014) A supervised approach to electric tower detection and classification for power line inspection. In: 2014 international joint conference on neural networks (IJCNN). pp. 1970–1977 (2014) https://doi.org/10.1109/IJCNN.2014.6889836

  8. Yan N, Zhou T, Gu C, et al (2020) Instance segmentation model for substation equipment based on Mask R-CNN. In: International conference on electrical engineering and control technologies, pp. 192–198. https://doi.org/10.1109/CEECT50755.2020.9298600

  9. Zhang Y, Yuan X, Fang Y, Chen S (2017) UAV low altitude photogrammetry for power line inspection. Int J Geo-Inf. https://doi.org/10.3390/ijgi6010014

    Article  Google Scholar 

  10. He LW, Cohen MF, Salesin DH (1996) The virtual cinematographer: a paradigm for automatic real-time camera control and directing. In: Proceedings of the 23rd annual conference on Computer graphics and interactive techniques, pp. 217–224 (1996) https://doi.org/10.1145/237170.237259

  11. Girshick R (2015) Fast R-CNN. In: 2015 IEEE international conference on computer vision (ICCV), pp. 1440–1449 (2015) https://doi.org/10.1109/ICCV.2015.169

  12. He K, Gkioxari G, Dollár P et al (2017) Mask R-CNN. IEEE Trans Pattern Anal Mach Intell. https://doi.org/10.1109/TPAMI.2018.2844175

    Article  Google Scholar 

  13. Huang Z, Huang L, Gong Y, et al (2019) Mask scoring R-CNN. CVPR, pp. 6409–6418

  14. Liu W, Anguelov D, Erhan D, et al (2016) SSD: single shot multibox detector. In: Computer Vision – ECCV, 9905:21–37. https://doi.org/10.1007/978-3-319-46448-02

  15. Redmon J, Divvala S, Girshick R, et al (2016) You only look once: unified, real-time object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 779–788. https://doi.org/10.1109/CVPR.2016.91

  16. Redmon J, Farhadi A (2017) YOLO9000: better, faster, stronger. In: CVPR, pp. 7263–7271 https://doi.org/10.1109/CVPR.2017.690

  17. Redmon J, Farhadi A (2018) YOLOv3: an incremental improvement. In: Computer Science

  18. Ren S, He K, Girshick R et al (2017) Faster R-CNN: towards real-Time object detection with region proposal networks. IEEE Trans Pattern Anal Mach Intell 39(6):1137–1149. https://doi.org/10.1109/TPAMI.2016.2577031

    Article  Google Scholar 

  19. Bertinetto L, Valmadre J, Henriques JF, et al (2016) Fully-convolutional siamese networks for object tracking. In: Computer Vision – ECCV 2016 Workshops. 9914:860–865 (2016) https://doi.org/10.1007/978-3-319-48881-3_56

  20. Bo L, Yan J, Wu W (2018) High performance visual tracking with siamese region proposal Network. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 8971–8980. https://doi.org/10.1109/CVPR.2018.00935

  21. Bochinski E, Eiselein V, Sikora T (2017) High-Speed tracking-by-detection without using image information. In: IEEE international conference on advanced video and signal based surveillance. https://doi.org/10.1109/AVSS.2017.8078516

  22. Chen L, Ai H, Zhuang Z, Shang C (2018) Real-time multiple people tracking with deeply learned candidate selection and person re-Identification. In: IEEE international conference on multimedia and expo. https://doi.org/10.1109/ICME.2018.8486597

  23. Feng W, Hu Z, Wu W, et al (2019) Multi-Object tracking with multiple cues and Switcher-Aware classification. https://arxiv.org/abs/1901.06129

  24. Tiantian Y, Guodong Y, Junzhi Y (2017) Feature fusion based insulator detection for aerial inspection. In: Chinese control conference CCC, pp. 10972–10979. https://doi.org/10.23919/ChiCC.2017.8029108

  25. Assa J, Cohen-Or D, Yeh IC et al (2008) Motion overview of human actions. ACM Trans Gr 27(5):1–10. https://doi.org/10.1145/1409060.1409068

    Article  Google Scholar 

  26. Rhodes C, Morari M, Tsimring LS et al (1997) Data-based control trajectory planning for nonlinear systems. Phys Rev E 56(3):2398–2406. https://doi.org/10.1103/PhysRevE.56.2398

    Article  Google Scholar 

  27. Yang C, Li Z, Li J (2013) Trajectory planning and optimized adaptive control for a class of wheeled inverted pendulum vehicle models. IEEE Trans Cybern 43(1):24–36. https://doi.org/10.1109/TSMCB.2012.2198813

    Article  Google Scholar 

  28. Kyrkou C (2021) C3NET: end-to-end deep learning for efficient real-time visual active camera control. https://arxiv.org/pdf/2107.13233.pdf

  29. Kyrkou C (2020) Imitation-based active camera control with deep convolutional neural network. In: IEEE 4th international conference on image processing, applications and systems. https://doi.org/10.1109/IPAS50080.2020.9334958

  30. Brady DJ, Fang L, Ma Z (2020) Deep learning for camera data acquisition, control, and image estimation. Adv Optics Photonics 12(4):787–846. https://doi.org/10.1364/AOP.398263

    Article  Google Scholar 

  31. Fleck S, Straßer W (2008) Smart camera based monitoring system and its application to assisted living. Proc IEEE 96(10):1698–1714

    Article  Google Scholar 

  32. Chen X, Fang H, Lin TY, et al (2015) Microsoft COCO captions: data collection and evaluation server. https://arxiv.org/abs/1504.00325

  33. Bourdev L, Brandt J (2005) Robust object detection via soft cascade. CVPR. https://doi.org/10.1109/CVPR.2005.310

    Article  Google Scholar 

  34. Pham MT, Cham TJ (2007) Fast training and selection of haar features using statistics in boosting-based face detection. In: 2007 IEEE 11th international conference on computer vision. IEEE. https://doi.org/10.1109/ICCV.2007.4409038

  35. Liao S, Jain AK, Li SZ (2016) A fast and accurate unconstrained face detector. PAMI. https://doi.org/10.1109/TPAMI.2015.2448075

    Article  Google Scholar 

  36. Yan J, Zhen L, Wen L, Li SZ (2014) The fastest deformable part model for object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp.2497–2504 (2014) https://doi.org/10.1109/CVPR.2014.320

  37. Zhang S, Zhu X, Lei Z et al (2019) FaceBoxes: a CPU real-time face detector with high accuracy. Neurocomputing 364:297–309. https://doi.org/10.1109/BTAS.2017.8272675

    Article  Google Scholar 

  38. Kalman RE (1960) A new approach to linear filtering and prediction problems. J Basic Eng 82(1):35–45. https://doi.org/10.1115/1.3662552

    Article  MathSciNet  Google Scholar 

  39. Wang N, Yeung DY (2013) Learning a deep compact image representation for visual tracking. In: NIPS. Curran Associates Inc

  40. Christianson D (1996) Declarative camera control for automatic cinematography. In: Proceedings os AAAI'96, Volume 1

  41. Wang J, Sun A, Zheng C, Wang J (2010) Research on a new crawler type inspection robot for power transmission lines. In: 2010 1st international conference on applied robotics for the power industry CARPI, pp. 1–5. https://doi.org/10.1109/CARPI.2010.5624471

  42. Alhassan AB, Zhang X, Shen H, Xu H (2020) Power transmission line inspection robots: a review, trends and challenges for future research. Int J Electr Power Energy Syst 118:105862. https://doi.org/10.1016/j.ijepes.2020.105862

    Article  Google Scholar 

  43. Deng C, Liu JY, Liu YB, Tan YY (2016) Real time autonomous transmission line following system for quadrotor helicopters. In: Int Conf Smart Grid Clean Energy Technol ICSGCE, pp. 61–64 (2016) https://doi.org/10.1109/ICSGCE.2016.7876026

  44. Katrašnik J, Pernuš F, Likar B (2010) A climbing-flying robot for power line inspection. In: InTech, pp. 95–110

  45. Patel AR, Patel MA, Vyas DR (2012) Modeling and analysis of quadrotor using sliding mode control. In: 44th IEEE southeast symposium system theory, pp. 111–114. https://doi.org/10.1109/SSST.2012.6195140

  46. Wronkowicz A (2016) Vision diagnostics of power transmission lines: approach to recognition of insulators. In: Proc 9th Int Conf Comput Recognit Syst CORES 2015, Advances in Intelligent Systems and Computing, 403:431–440

  47. Oluwatosin OP, Syed SA, Apis O (2021) Application of computer vision in pipeline inspection robot. In: Proceedings of the 11th annual international conference on industrial engineering and operations management, Singapore

  48. Huang J, Wang J, Tan Y, Wu D, Cao Y (2020) An automatic analog instrument reading system using computer vision and inspection robot. IEEE Trans Instrum Measurement 69(9):6322–6335. https://doi.org/10.1109/TIM.2020.2967956

    Article  Google Scholar 

  49. Huang Y, Xiong S, Liao Y (2021) Research on fire inspection robot based on computer vision. IOP Conf Ser Earth Environ Sci. https://doi.org/10.1088/1755-1315/632/5/052066

    Article  Google Scholar 

  50. Dinh TH, Ha QP, La HM (2016) Computer vision-based method for concrete crack detection. In: 2016 14th international conference on control, automation, robotics and vision (ICARCV), pp. 1–6. https://doi.org/10.1109/ICARCV.2016.7838682

  51. Oh J, Jang G, Oh S et al (2009) Bridge inspection robot system with machine vision. Autom Constr 18(7):929–941. https://doi.org/10.1016/j.autcon.2009.04.003

    Article  Google Scholar 

  52. Pflugfelder R, Mičušík B (2010) Self-calibrating cameras in video surveillance. In: Belbachir A (ed) Smart cameras. Springer, Berlin

    Google Scholar 

  53. Maggiani L, Salvadori C, Petracca M, Pagano P, Saletti R (2013) Reconfigurable FPGA architecture for computer vision applications in Smart Camera Networks. In: 2013 seventh international conference on distributed smart cameras (ICDSC), pp. 1–6 (2013) https://https://doi.org/10.1109/ICDSC.2013.6778212

  54. Magno M, Tombari F, Brunelli D, Di Stefano L, Benini L (2013) Multimodal video analysis on self-powered resource-limited wireless smart camera. IEEE J Emerg Sel Top Circuits Syst 3(2):223–235

    Article  Google Scholar 

  55. Senouci B, Charfi I, Heyrman B et al (2016) Fast prototyping of a SoC-based smart-camera: a real-time fall detection case study. J Real-Time Image Proc 12:649–662

    Article  Google Scholar 

  56. Liu G, Shi H, Kiani A et al (2022) Smart traffic monitoring system using computer vision and edge computing. IEEE Trans Intell Transp Syst 23(8):12027–12038. https://doi.org/10.1109/TITS.2021.3109481

    Article  Google Scholar 

  57. Amato G, Bolettieri P, Moroni D, et al (2018) A wireless smart camera network for parking monitoring. In: IEEE Globecom Workshops (GC Wkshps), pp. 1–6, (2018) https://doi.org/10.1109/GLOCOMW.2018.8644226

  58. Arijon D (1976) Grammar of the film language. Communication Arts Books, Hastings House Publishers, New York

    Google Scholar 

  59. Wonham WM (1968) On the separation theorem of stochastic control. SIAM Journal on Control 6(2):312–326. https://doi.org/10.1137/0306023

    Article  MathSciNet  MATH  Google Scholar 

Download references

Funding

The presented work was supported by the Shanghai Municipal Science and Technology Major Project (No. 2018SHZDZX01) and the ZHANGJIANG LAB, Support of Science and Technology Project (No. 52094022004W) by State grid Shanghai Electric Power Company.

Author information

Authors and Affiliations

Authors

Contributions

YR: conceptualization, methodology, programming, computation, model analysis, writing. NY, XY, FT, QT and YW: technique and resources, coding, investigation. WL: conceptualization, methodology, writing and editing, supervision, funding acquisition.

Corresponding author

Correspondence to Wenlian Lu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This paper does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (MP4 24563 KB)

Supplementary file2 (MP4 78291 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Ren, Y., Yan, N., Yu, X. et al. On automatic camera shooting systems via PTZ control and DNN-based visual sensing. Intel Serv Robotics 16, 265–285 (2023). https://doi.org/10.1007/s11370-023-00462-w

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11370-023-00462-w

Keywords

Navigation