ABSTRACT
Recent advances in smart city infrastructure have supported wider adoption of surveillance cameras as a mainstream crime prevention measure. However, the resulting massive deployment raises privacy concerns among citizens. In this paper, we present VR-Surv, a VR-based, privacy-aware surveillance system for large-scale urban environments. Our concept is based on conveying only the semantics of the scene, without revealing the identity of individuals or contextual details that might violate the privacy of the entities present in the surveillance area. To this end, we create a virtual replica of the areas of interest in real time by combining procedurally generated environments with markerless motion capture models. The results of our preliminary evaluation showed that our system successfully conceals privacy-sensitive data while preserving the semantics of the scene. Furthermore, participants in our user study expressed higher acceptance of being surveilled through the proposed system.