Pedestrian-Aware Panoramic Video Stitching Based on a Structured Camera Array

Published: 12 November 2021

Abstract

A panorama stitching system is an indispensable module in applications such as surveillance and space exploration. Such a system lets the viewer understand the surroundings at a glance by aligning the surrounding images on a plane and fusing them naturally. The bottleneck of existing systems lies mainly in the alignment of adjacent images and the naturalness of the transitions between them. When facing dynamic foregrounds, these systems may produce outputs with misaligned semantic objects, an artifact that is conspicuous because human perception is sensitive to it. We address three key issues in the existing workflow that affect its efficiency and the quality of the resulting panoramic video, and we present Pedestrian360, a panoramic video system based on a structured camera array (a spatial surround-view camera system). First, to obtain a geometrically aligned 360° view in the horizontal direction, we build a unified multi-camera coordinate system via a novel refinement approach that jointly optimizes the camera poses. Second, to eliminate the brightness and color differences between images taken by different cameras, we design a photometric alignment approach that adds a bias term to the baseline linear adjustment model and solves it with two-step least squares. Third, considering that the human visual system is more sensitive to high-level semantic objects, such as pedestrians and vehicles, we integrate the results of instance segmentation into the dynamic programming framework of the seam-cutting step. To our knowledge, we are the first to introduce instance segmentation to the seam-cutting problem, which can ensure the integrity of the salient objects in a panorama. Specifically, in our surveillance-oriented system, we choose the most significant target, pedestrians, as the seam-avoidance target, which accounts for the name Pedestrian360. To validate the effectiveness and efficiency of Pedestrian360, we establish a large-scale dataset composed of videos with pedestrians in five scenes. The test results on this dataset demonstrate the superiority of Pedestrian360 over its competitors. Experimental results show that Pedestrian360 can stitch videos at 12 to 26 fps, depending on the number of objects in the scene and how frequently they move. To make our reported results reproducible, the relevant code and collected data are publicly available at https://cslinzhang.github.io/Pedestrian360-Homepage/.
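
To illustrate the third contribution, the following minimal sketch shows one way an instance-segmentation mask could be folded into a dynamic-programming seam search over the overlap of two aligned images. It is not the authors' implementation: the function name `find_seam`, the additive penalty, the per-pixel difference cost, and the one-column-per-row connectivity are all assumptions made for this example.

```python
import numpy as np


def find_seam(cost, avoid_mask, penalty=1e6):
    """Return one column index per row for a minimal-cost vertical seam.

    cost       : (H, W) array, e.g. per-pixel difference of the two images
                 in their overlap region (assumed cost, not from the paper).
    avoid_mask : (H, W) boolean array, True where a segmented instance
                 (for example a pedestrian) lies.
    penalty    : hypothetical additive cost that discourages the seam
                 from crossing masked pixels.
    """
    energy = cost.astype(np.float64) + penalty * avoid_mask.astype(np.float64)
    h, w = energy.shape
    acc = energy.copy()                      # accumulated minimal cost
    back = np.zeros((h, w), dtype=np.int64)  # best predecessor column

    # Forward pass: each pixel extends the cheapest of its (up to) three
    # neighbours in the previous row.
    for y in range(1, h):
        for x in range(w):
            lo, hi = max(x - 1, 0), min(x + 1, w - 1)
            prev = lo + int(np.argmin(acc[y - 1, lo:hi + 1]))
            back[y, x] = prev
            acc[y, x] += acc[y - 1, prev]

    # Backward pass: start from the cheapest end point and follow the links.
    seam = np.empty(h, dtype=np.int64)
    seam[-1] = int(np.argmin(acc[-1]))
    for y in range(h - 1, 0, -1):
        seam[y - 1] = back[y, seam[y]]
    return seam


if __name__ == "__main__":
    # Toy overlap: column 2 is the cheapest path, but a "pedestrian"
    # occupies rows 2-3 of columns 1-3, so the seam has to detour.
    cost = np.ones((6, 5))
    cost[:, 2] = 0.0
    mask = np.zeros((6, 5), dtype=bool)
    mask[2:4, 1:4] = True
    print(find_seam(cost, mask))
```

The penalty is added to the energy rather than applied as a hard constraint, so a seam is still returned when an object spans the whole overlap; whether Pedestrian360 handles that case this way is not stated in the abstract.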



Published In

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 17, Issue 4
November 2021
529 pages
ISSN: 1551-6857
EISSN: 1551-6865
DOI: 10.1145/3492437

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 12 November 2021
Accepted: 01 April 2021
Revised: 01 March 2021
Received: 01 November 2020
Published in TOMM Volume 17, Issue 4

Author Tags

  1. Panoramic video stitching
  2. extrinsic calibration
  3. photometric alignment
  4. seam-cutting
  5. instance segmentation

Qualifiers

  • Research-article
  • Refereed

Funding Sources

  • National Key Research and Development Project
  • National Natural Science Foundation of China
  • Shanghai Science and Technology Innovation Plan
  • Shanghai Municipal Science and Technology Major Project
  • Fundamental Research Funds for the Central Universities

