Metadata based need-to-know view in large-scale video surveillance systems
Introduction
Large-scale video surveillance systems (VSS) deploy a large number of video cameras in public places to observe different events and objects of interest. Video cameras are excellent multi-sensors, so different observers (an observer is either a monitoring human, a program, or a system) may use the same video feed to accomplish different tasks such as smart police patrolling, public safety management, traffic operations management, infrastructure maintenance, or, during the recent pandemic, social-distance inspection (Baran et al., 2016). Most of these services require continuous monitoring of public places because the information they are interested in is not pre-defined. Usually no one knows the exact form an unwanted incident may take or when a particular event will occur. For instance, a traffic-monitoring observer (TM) does not know when and where a vehicle will be speeding, and a patrolling officer (PO) is generally unaware of where and when two individuals may start a fight. To record a particular activity when it happens, and then use that specific piece of data according to the particular requirements of the observer, data needs to be collected and analyzed continuously.
This huge amount of VSS data, gathered at mass scale from various public places, contains a lot of information about individuals, their routine activities, and their associations, much of which may be considered personal. Therefore, although VSS provides many benefits, it is also very invasive because it continuously records citizens’ activities, which may pose a threat to their freedom and privacy if misused by observers (Van den Hoven et al., 2019).
Countries across the globe have introduced legislation for the installation and operation of VSS to preserve their citizens’ privacy whenever a VSS collects or processes personal data (Rajpoot and Jensen, 2015). Legislation such as the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (Garlie, 2020) requires VSS owners (either public or private) to have a valid legal basis for deployment. It also requires owners to state an explicit purpose for their data usage and to confirm that video data will not be subject to secondary use, i.e., that it will only be used for the consented primary purpose. Most data-protection legislation allows informed individual consent as a legal basis for recording personal data, which reduces legal uncertainty. However, due to the continuous presence of VSS in the public space, it is generally not possible to obtain consent from every individual every time a camera records them. Therefore, VSS deployment often relies on “public interest” as a legal basis, as different public administrative services and authorities use the data (broadly) for multiple purposes such as public safety and traffic management. Individuals do not have a right to erasure or data portability under this legal basis, but they do retain a right to object in some cases. Hence, VSS data collected under “public interest” supports many purposes that are beneficial for citizens, but citizens have fewer rights and are often expected to trust observers with their data. Nonetheless, observers holding individuals’ information have at times misused it by exploiting ambiguous purposes and ill-defined legal bases, causing a high number of privacy-invasion incidents such as voyeurism, blackmail, and profiling (Campbell, 2019; Snowden, 2019; Froomkin, 2015).
To preserve citizens’ privacy, it is imperative to limit observers’ views to only those parts of the VSS data that are relevant to their job. There are two likely approaches to achieve this: (1) limit the time and location of the cameras to which the observer has complete access, and (2) limit what can be seen in the individual frames (the content) of the video stream. In the first case, observers are restricted based on the VSS physical context of time and space (Rajpoot and Jensen, 2015). For example, a PO can only view cameras (live stream or recording) that are installed within a one-mile radius of her ‘location’, or whose requested recording has a ‘timestamp’ within her ‘duty-hours’. In this case, observers are allowed to view all the cameras within their allowed physical context, irrespective of whether they need to view more or fewer cameras to complete their tasks. This limits the exposure of the observer, as they have access to a limited number of cameras in terms of both location and time. The second approach is to limit the observer’s view based on the high-level semantic information obtained from the content of the video stream. For example, a TM can be limited to viewing only the frames of video streams in which ‘speeding vehicles’ are detected, or a PO can be restricted to viewing video streams in which a detected ‘face’ is similar to that of a person of interest. In this case, observers do not have indefinite access to some cameras’ feeds (as in the first approach); they are restricted to the specific portions of recordings (from any location or time) whose semantic content they are ‘allowed’ to view. The second approach, based on semantic content, is more flexible and observer-oriented than one based only on the physical context of time and space. However, observers are still exposed to some irrelevant data, because even the allowed portion of a recording can contain information that the observer does not need.
For instance, suppose a TM is allowed to view a recording in which the camera caught a ‘speeding’ (event of interest) ‘vehicle’ (object of interest). Here the TM is only interested in viewing the ‘license plate’ of the ‘vehicle’ to find the owner’s information via automatic number plate recognition (ANPR). Yet the TM can view all the persons (drivers and passengers) in that vehicle, and all the other vehicles and persons on the road at that ‘time’ or ‘location’, even though only one of those vehicles was caught in a violation. It is important to note that these two approaches are not mutually exclusive and can be combined in different ways, creating a hybrid approach that further limits the exposure of irrelevant information. Ideally, the TM should only be allowed to view the ‘license plate’ of the ‘vehicle’ from the camera installed at the specific ‘location’ and at the ‘time’ when the ‘traffic violation’ was detected. Similarly, the PO could be limited to viewing only ‘humans’ with specific descriptive or biometric features at the time of an ‘accident’. A hybrid approach that considers both physical and semantic context can help limit the information exposed to observers according to their requirements while preserving privacy.
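The hybrid check described above can be illustrated with a minimal Python sketch. It is not the paper's model: the `Frame` and `Observer` classes, their field names, the one-mile default radius, and the choice to require both the location and the duty-hours condition at the same time are assumptions made only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points, in miles."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


@dataclass
class Frame:
    """Hypothetical per-frame metadata record."""
    camera_lat: float
    camera_lon: float
    timestamp: datetime
    events: frozenset  # semantic labels detected in the frame, e.g. {"speeding"}


@dataclass
class Observer:
    """Hypothetical observer attributes used for the access decision."""
    lat: float
    lon: float
    duty_start: datetime
    duty_end: datetime
    allowed_events: frozenset  # events this observer is permitted to view


def physical_ok(obs, frame, radius_miles=1.0):
    """Physical context: camera is nearby AND the frame falls in duty hours (assumed policy)."""
    near = distance_miles(obs.lat, obs.lon, frame.camera_lat, frame.camera_lon) <= radius_miles
    on_duty = obs.duty_start <= frame.timestamp <= obs.duty_end
    return near and on_duty


def semantic_ok(obs, frame):
    """Semantic context: the frame contains at least one event the observer may view."""
    return bool(obs.allowed_events & frame.events)


def hybrid_ok(obs, frame):
    """Hybrid approach: grant access only if both contexts are satisfied."""
    return physical_ok(obs, frame) and semantic_ok(obs, frame)
```

A frame passes `hybrid_ok` only when it is both inside the observer's physical context and tagged with a permitted event, so a nearby camera with no relevant event, or a relevant event far outside the observer's area, is rejected.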
Thus, for VSS to enforce a dynamic need-to-know view for different observers, it is essential to understand their data requirements, i.e., what they need to view in order to complete a particular task, referred to as the ‘purpose’ of the observer. The purpose should explicitly state two points: first, the contextual (physical or semantic) properties that limit the sequence or duration (series of frames) of the recording available for viewing; second, the type of information from the selected video sequence that the observer needs to complete the authorized task. Here, we assume that all observers are authorized and have a valid legal basis to support their purposes. It is also imperative that observers’ purposes be described in terms similar to the type of information derived from the VSS data, so that the two can be mapped to or compared against each other. This way, one part of the surveillance data (‘events’ or ‘objects’ identified in the content) can be used to enforce restrictions on the other part of the data (objects of interest), and irrelevant data can be hidden (Sultan and Jensen, 2021). In this paper, we propose to extend the Attributes Enhanced Role-Based Access Control (AERBAC) model with content-based restrictions in a way that gives observers explicit access to the content relevant to their purposes without compromising privacy. The proposed model takes into account both the physical and semantic context to control the amount of information released from video recordings based on the ‘purpose’ of the observer. This will limit the possibility of prejudiced interpretation by observers and enforce fine-grained need-to-know permissions, allowing multiple observers to achieve their authorized purposes without compromising individuals’ privacy (Eckhoff and Wagner, 2018). The rest of this paper is organized as follows.
Section 2 presents a smart-city video surveillance example that is used to identify VSS privacy requirements and to illustrate important points throughout the paper. Section 3 investigates VSS data properties and how they can help access control mechanisms limit the exposure of information to the observer. Section 4 presents and analyzes the extended AERBAC model, which enforces a context-based need-to-know view for observers. Finally, the last sections survey related work and conclude the paper.
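To make the two-part ‘purpose’ from the introduction more concrete, the following Python sketch shows one way frame metadata could be matched against a purpose so that only the needed objects remain visible. It is an illustration under stated assumptions, not the extended AERBAC model itself: the `Purpose`, `FrameMetadata`, and `DetectedObject` classes, their field names, and the label strings are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DetectedObject:
    """Hypothetical object detection produced by video analytics."""
    kind: str        # e.g. "vehicle", "license_plate", "person"
    object_id: int


@dataclass
class FrameMetadata:
    """Hypothetical metadata attached to one frame of a recording."""
    frame_no: int
    events: set                                   # event labels detected in this frame
    objects: list = field(default_factory=list)   # DetectedObject instances


@dataclass(frozen=True)
class Purpose:
    """Two-part purpose: which frames to select, and what to reveal in them."""
    name: str
    trigger_events: frozenset   # point 1: context that selects the frames
    visible_kinds: frozenset    # point 2: information the observer needs


def need_to_know_view(frames, purpose):
    """Return {frame_no: (visible, hidden)} for frames matching the purpose."""
    view = {}
    for frame in frames:
        if not (purpose.trigger_events & frame.events):
            continue  # frame is outside the purpose's context: not shown at all
        visible = [o for o in frame.objects if o.kind in purpose.visible_kinds]
        hidden = [o for o in frame.objects if o.kind not in purpose.visible_kinds]
        view[frame.frame_no] = (visible, hidden)
    return view
```

For the traffic-monitoring example, a purpose with `trigger_events={"speeding"}` and `visible_kinds={"license_plate"}` would select only frames where speeding was detected and mark every other detected object in those frames as hidden, which a rendering step could then blur or mask.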
Section snippets
Smart-city video surveillance systems (SC-VSS)
As our motivating example, we use smart-city video surveillance systems (SC-VSS), a modern case of large-scale VSS, where multiple observers with different needs can access the aggregated VSS data from all over the city. Here we analyse a specific scenario to see what type of information is of interest to different observers, and what is irrelevant based on their requirements, under a specific purpose of ‘public safety’. A set of adjacent cameras installed on the highway detects a
Surveillance data
In order to implement an access control mechanism based on the information obtained from the content of the VSS, it is important to know the different types and amounts of information that can be extracted from it. VSS was initially designed to allow a human observer to look for any incident/event of interest in the camera recordings in real time, or to store these recordings so that they can later be used as evidence. With time, VSS has become an extensively used surveillance technology, making
Metadata-based access control framework
An access control mechanism (ACM) has three main components: users (observers), resource-objects (video recordings), and a reference monitor. The observer is an authorized user who has distinct attributes (user properties) by which she identifies herself to the system, and who then requests access to a certain resource (recording). Role-based access control (RBAC) and attribute-based access control (ABAC) are two of the most commonly used ACMs for large-scale information systems. In RBAC, observers
Related work
Our solution is based on the key idea that information in video content can be utilized to regulate access control in large-scale dynamic systems, such as SC-VSS. This idea raises two main questions. First, how much information can be extracted from the video content, and is it possible to obtain data about the activities happening in it? This was already discussed in Section 3. Second, how different types of access control mechanisms regulate access based on that
Conclusion
Mass-scale video surveillance has become a universal tool for accomplishing administrative tasks ranging from ensuring public safety to real-time traffic management and many more. The persistent and continuous recording of video data (collecting different types of personal information) from a large number of locations at a city or national level raises serious privacy concerns about data usage. Though different data protection laws provide guidelines for privacy-aware video surveillance, it
Author statements
All persons who meet authorship criteria are listed as authors, and all authors certify that they have participated sufficiently in the work to take public responsibility for the content, including participation in the concept, design, analysis, writing, or revision of the manuscript. Furthermore, each author certifies that this material or similar material has not been and will not be submitted to or published in any other publication before its appearance in the Hong Kong Journal of
Declaration of Competing Interest
The Author(s) declare(s) that there is no conflict of interest.
Shizra Sultan is a PhD student at the Department of Mathematics and Computer Science, Technical University of Denmark. She completed her master's degree in Computer and Communication Security in 2015 at the National University of Science and Technology, Pakistan. Prior to that, she worked in industry and has experience in application development and system security. Her research concentrates on access control solutions, privacy, and data protection legislation.
References (71)
- A novel privacy-preserving method for data publication. Inf. Sci. (2019)
- A model for attribute-based user-role assignment
- Toward global IoT-enabled smart cities interworking using adaptive semantic adapter. IEEE Internet Things J. (2019)
- Balana, XACML 3.0 implementation,...
- A smart camera for the surveillance of vehicles in intelligent transportation systems. Multimed. Tools Appl. (2016)
- Negotiating privacy preferences in video surveillance systems. Modern Approaches in Applied Intelligence (2011)
- An access control model for video database systems
- A hierarchical access control model for video database systems. ACM Trans. Inf. Syst. (2003)
- Hierarchy-Based Access Control in Distributed Environments. CSE Conference and Workshop Papers (2001)
- Hierarchy-based access control in distributed environments
- Multiple hypothesis tracking for multiple target tracking. IEEE Aerosp. Electron. Syst. Mag.
- The entire system is designed to suppress us. What the Chinese Surveillance State Means for the Rest of the World
- Hierarchical and shared access control. IEEE Trans. Inf. Forensics Secur.
- Smartphone-based public health information systems: anonymity, privacy, and intervention. J. Assoc. Inf. Sci. Technol.
- Semantic access control for privacy management of personal sensing in smart cities. IEEE Transactions on Emerging Topics in Computing
- Privacy in the smart city—applications, technologies, challenges, and solutions. IEEE Commun. Surv. Tutor.
- A role-based access control model and reference implementation within a corporate intranet. ACM Trans. Inf. Syst. Secur.
- Regulating mass surveillance as privacy pollution: learning from environmental impact statements (November 3, 2015). Univ. Ill. Law. Rev.
- California Consumer Privacy Act of 2018: A Study of Compliance and Associated Risk. PhD thesis
- Region-Based Convolutional Networks for Accurate Object Detection and Segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence
- Decoupled deep neural network for semi-supervised semantic segmentation. Advances in Neural Information Processing Systems 28 (NIPS 2015)
- Towards an approach of semantic access control for cloud computing
- Towards privacy-sensitive participatory sensing
- A survey on leveraging deep neural networks for object tracking
- ImageNet classification with deep convolutional neural networks. Commun. ACM
- HMDB: a large video database for human motion recognition
- Microsoft COCO: common objects in context
- SSD: single shot multibox detector. European Conference on Computer Vision
- Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis.
- Attributes enhanced role-based access control model
- Design and implementation of a secure and trustworthy platform for privacy-aware video surveillance. Int. J. Inf. Secur.
- Kalman filter-based algorithms for estimating depth from image sequences. Int. J. Comput. Vis.