Elsevier

Computers & Security

Volume 111, December 2021, 102452

Metadata based need-to-know view in large-scale video surveillance systems

https://doi.org/10.1016/j.cose.2021.102452

Abstract

Large-scale video surveillance systems are increasingly seen as the answer to problems concerning public safety, law enforcement, and situational awareness in public places. However, the unauthorized use of personal information derived from video data can be harmful. To preserve privacy, it is important to understand what type of personal information is contained in video surveillance data, and how much of that information is essential for an observer to achieve her authorized purpose. The purpose of the observer is described in terms similar to the information extracted by video surveillance systems, so they can be compared. This helps identify what type of information is better suited to control the flow of information to multiple observers, without compromising the privacy of the individuals. This paper presents a privacy-aware and need-to-know access control framework built on fine-grained data properties, extracted from surveillance data, which must conform to the explicitly defined purpose of the observers.

Introduction

Large-scale video surveillance systems (VSS) deploy a large number of video cameras in many public places to observe different events and objects of interest. Video cameras are excellent multi-sensors, so different observers (an observer is a monitoring human, a program, or a system) may use the same video feed to accomplish different tasks, such as smart police patrolling, public safety management, traffic operations management, infrastructure maintenance, or, during the recent pandemic, social-distance inspection (Baran et al., 2016). Most of these services require continuous monitoring of public places because the information they are interested in is not pre-defined. Usually no one knows the exact form that an unwanted incident may take or when a particular event will occur. For instance, a traffic-monitoring observer (TM) does not know when and where a vehicle will be speeding, and a patrolling officer (PO) is generally unaware of where and when two individuals may start a fight. To record a particular activity when it happens, and then use that specific piece of data according to the particular requirements of the observer, data needs to be collected and analyzed continuously.

This huge amount of VSS data, gathered at mass scale from various public places, contains a lot of information about individuals, their routine activities, and their associations, which may be considered personal. Therefore, although VSS provides many benefits, it is also very invasive because it continuously records citizens’ activities, which may threaten their freedom and privacy if misused by observers (Van den Hoven et al., 2019).

Countries across the globe have introduced legislation governing the installation and operation of VSS to preserve their citizens’ privacy whenever a VSS collects or processes personal data (Rajpoot and Jensen, 2015). Legislation such as the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (Garlie, 2020) requires VSS owners (either public or private) to have a valid legal basis for deployment. It also requires owners to state an explicit purpose for their data usage and to confirm that video data will not be subject to secondary use, i.e., that it will only be used for the consented primary purpose. Most data-protection legislation allows informed individual consent as a legal basis for recording personal data, which reduces legal uncertainty. However, due to the continuous presence of VSS in public spaces, it is generally not possible to obtain consent from every individual every time a camera records them. Therefore, VSS deployment often relies on “public interest” as a legal basis, as different public administrative services and authorities use the data (broadly) for multiple purposes such as public safety and traffic management. Individuals do not have a right to erasure or data portability under this legal basis, but they do retain a right to object in some cases. Hence, VSS data collected under “public interest” supports various purposes that are beneficial for citizens, but citizens have fewer rights and are often expected to trust observers with their data. Nonetheless, observers holding individuals’ information have at times misused it by exploiting ambiguous purposes and ill-defined legal bases, causing many privacy-invasion incidents involving voyeurism, blackmail, profiling, etc. (Campbell, 2019; Snowden, 2019; Froomkin, 2015).

To preserve citizens’ privacy, it is imperative to limit observers’ views to only the parts of VSS data that are relevant to their job. There are two likely approaches to achieve this: (1) limit the time and location of the cameras to which the observer has complete access, and (2) limit what can be seen in the individual frames (the content) of the video stream. In the first case, observers are restricted based on the VSS physical context of time and space (Rajpoot and Jensen, 2015). For example, a PO can only view cameras (live stream or recording) that are installed within a one-mile radius of her ‘location’, or whose requested recording has a ‘timestamp’ within her ‘duty-hours’. In this case, observers are allowed to view all the cameras within their allowed physical context, irrespective of whether they need more or fewer cameras to complete their tasks. This limits what is exposed to observers, as they have access to a limited number of cameras in terms of both location and time. The second approach is to limit the observer’s view based on the high-level semantic information obtained from the content of the video stream. For example, a TM can be limited to viewing only those frames of video streams in which ‘speeding vehicles’ are detected, or a PO can be restricted to viewing video streams in which a detected ‘face’ is similar to that of a person-of-interest. In this case, observers do not have indefinite access to some cameras’ feeds (as in the first approach); instead, they are restricted to viewing specific portions of recordings (from any location or time) whose semantic content matches what they are ‘allowed’ to view. The second approach, based on semantic content, is more flexible and observer-oriented than one based only on the physical context of time and space. However, observers are still exposed to some irrelevant data, because even the allowed portion of a recording can contain information that is not relevant to the observer. For instance, suppose a TM is allowed to view a recording in which the camera caught a ‘speeding’ (event of interest) ‘vehicle’ (object of interest). Here the TM is only interested in viewing the ‘license plate’ of the ‘vehicle’ to find the owner’s information via automatic number plate recognition (ANPR). Yet the TM can view all the persons (drivers and passengers) in that vehicle, and all the other vehicles and persons on the road, at the ‘time’ or ‘location’ at which only one of those vehicles was caught in a violation. It is important to note that these two approaches are not mutually exclusive and can be combined in different ways, creating a hybrid approach that further limits the exposure of irrelevant information. Ideally, the TM should only be allowed to view the ‘license plate’ of the ‘vehicle’ from the camera installed at the specific ‘location’ and from the ‘time’ at which the ‘traffic violation’ was detected. Similarly, the PO can only view ‘humans’ with specific descriptive or biometric features at the time of an ‘accident’. The hybrid approach, considering both physical and semantic context, can help limit the exposure of information to observers depending on their requirements while preserving privacy.
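As a minimal sketch of how the two frame-selection restrictions and their combination might be evaluated over per-frame metadata, consider the following Python fragment. The `Frame` record, its field names, the flat-grid distance in miles, and the sample values are illustrative assumptions and are not taken from the proposed framework.

```python
from dataclasses import dataclass
from datetime import datetime
from math import hypot
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class Frame:
    """Hypothetical per-frame metadata record (all field names are assumptions)."""
    camera_xy: Tuple[float, float]  # camera position on a flat grid, in miles
    timestamp: datetime             # when the frame was recorded
    events: FrozenSet[str]          # events detected in the frame, e.g. {"speeding"}


def physical_ok(frame, observer_xy, radius_miles, duty_start, duty_end):
    """Approach 1: restrict by where and when the frame was recorded."""
    dx = frame.camera_xy[0] - observer_xy[0]
    dy = frame.camera_xy[1] - observer_xy[1]
    within_radius = hypot(dx, dy) <= radius_miles
    within_duty = duty_start <= frame.timestamp <= duty_end
    return within_radius and within_duty


def semantic_ok(frame, event_of_interest):
    """Approach 2: restrict by what was detected in the frame."""
    return event_of_interest in frame.events


def hybrid_select(frames, observer_xy, radius_miles, duty_start, duty_end, event) -> List[Frame]:
    """Hybrid: a frame is viewable only if both restrictions hold."""
    return [
        f for f in frames
        if physical_ok(f, observer_xy, radius_miles, duty_start, duty_end)
        and semantic_ok(f, event)
    ]


# Illustrative data: four frames, one observer at the grid origin.
frames = [
    Frame((0.2, 0.1), datetime(2021, 6, 1, 10, 0), frozenset({"speeding"})),
    Frame((0.3, 0.4), datetime(2021, 6, 1, 10, 5), frozenset()),
    Frame((5.0, 5.0), datetime(2021, 6, 1, 10, 10), frozenset({"speeding"})),
    Frame((0.1, 0.1), datetime(2021, 6, 1, 23, 0), frozenset({"speeding"})),
]
duty = (datetime(2021, 6, 1, 8, 0), datetime(2021, 6, 1, 16, 0))
selected = hybrid_select(frames, (0.0, 0.0), 1.0, duty[0], duty[1], "speeding")
```

With this sample data, the physical restriction alone admits two frames and the semantic restriction alone admits three, whereas the hybrid selection admits only the single frame that satisfies both. Restricting what is visible inside a selected frame (for example, only the ‘license plate’) is a separate step that this sketch does not cover.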

Thus, for VSS to enforce a dynamic need-to-know view for different observers, it is essential to understand their data requirements, i.e., what they need to view in order to complete a particular task, referred to as the ‘purpose’ of the observer. The purpose should explicitly state two points: first, the contextual (physical or semantic) properties required to limit the sequence or duration (series of frames) of the recording available for viewing; second, the type of information from the selected video sequence that the observer requires to complete her authorized task. Here, we assume that all observers are authorized and have a valid legal basis to support their purposes. It is also imperative that the observer’s purposes are described in terms similar to the type of information that is derived from the VSS data, so that the two can be mapped to or compared against each other. This way, part of the surveillance data (‘events’ or ‘objects’ identified in the content) can be used to enforce restrictions on the other part of the data (objects of interest), and irrelevant data can be hidden (Sultan and Jensen, 2021). In this paper, we propose to extend the Attributes Enhanced Role-Based Access Control (AERBAC) model with content-based restrictions in a way that gives observers explicit access to the content relevant to their purposes without compromising privacy. The proposed model takes into account both the physical and semantic context to control the amount of information released from video recordings based on the ‘purpose’ of the observer. This will limit the possibility of prejudiced interpretation by observers and enforce fine-grained need-to-know permissions while allowing multiple observers to achieve their authorized purposes, without compromising individuals’ privacy (Eckhoff and Wagner, 2018). The rest of this paper is organized as follows. Section 2 presents a smart-city video surveillance example that is used to examine VSS privacy requirements and to illustrate important points throughout the paper. Section 3 investigates VSS data properties and how they can help access control mechanisms limit the exposure of information to the observer. Section 4 presents and analyzes the extended AERBAC model, which enforces a context-based need-to-know view for observers. Finally, the last sections survey related work and conclude the paper.
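The two-part purpose introduced above (contextual properties that select the frames, and the type of information needed from them) can be illustrated with a small Python sketch. It assumes, hypothetically, that both the purpose and the frame metadata are expressed with the same attribute names, so they can be compared directly; the `Purpose` record, the dictionary layout, and the sample values are assumptions made for illustration and do not reproduce the extended AERBAC model.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional, Tuple


@dataclass(frozen=True)
class Purpose:
    """Hypothetical purpose record with the two parts described above."""
    observer: str
    context: Tuple[Tuple[str, str], ...]  # part 1: properties that select frames
    required: FrozenSet[str]              # part 2: object types the observer needs


def need_to_know_view(purpose: Purpose, frame: Dict) -> Optional[Tuple[List[str], List[str]]]:
    """Compare a purpose with one frame's metadata.

    Returns None when the frame lies outside the purpose's context; otherwise
    returns (shown, hidden), the object types released to and withheld from
    the observer.
    """
    for key, value in purpose.context:
        if frame["context"].get(key) != value:
            return None
    shown = [o for o in frame["objects"] if o in purpose.required]
    hidden = [o for o in frame["objects"] if o not in purpose.required]
    return shown, hidden


# Illustrative purpose for a traffic-monitoring observer (TM).
tm = Purpose(
    observer="TM",
    context=(("event", "traffic_violation"),),
    required=frozenset({"license_plate"}),
)
frame = {
    "context": {"event": "traffic_violation", "location": "highway_cam_7"},
    "objects": ["vehicle", "license_plate", "human", "human"],
}
shown, hidden = need_to_know_view(tm, frame)
```

In this example the TM is shown only the ‘license plate’, while the vehicle and the persons in the frame are withheld; a frame whose detected event does not match the purpose is not released at all.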

Section snippets

Smart-city video surveillance systems (SC-VSS)

For our motivating example, we use smart-city video surveillance systems (SC-VSS), a modern case of large-scale VSS in which multiple observers with different needs can access the aggregated VSS data from all over the city. Here we analyse a specific scenario to see what type of information is of interest to different observers, and what is irrelevant given their requirements, under the specific purpose of ‘public safety’. A set of adjacent cameras installed on the highway detects a

Surveillance data

In order to implement an access control mechanism based on the information obtained from the content of the VSS, it is important to know different types and amounts of information that can be extracted from it. VSS was initially designed to allow a human observer to look for any incident/event of interest in the camera recordings in real-time or to store these recordings for later so that they can be used as evidence. With time, VSS has become an extensively used surveillance technology, making

Metadata-based access control framework

An access control mechanism (ACM) has three main components: users (observers), resource-objects (video recordings), and a reference monitor. The observer is an authorized user who has distinct attributes (user properties) by which she identifies herself to the system and then requests access to a certain resource (recording). Role-based access control (RBAC) and attribute-based access control (ABAC) are two of the most commonly used ACMs for large-scale information systems. In RBAC, observers

Related work

Our solution is based on the key idea that information in video content can be used to regulate access control in large-scale dynamic systems, such as SC-VSS. This idea raises two main questions. First, how much information can be extracted from the video content, and whether it is possible to obtain data about activities happening in the video content, which was already discussed in Section 3. Second, how different types of access control mechanisms regulate access based on that

Conclusion

Mass-scale video surveillance has become a universal tool to accomplish several administrative tasks from ensuring public safety to real-time traffic management and many more. The persistent and continuous video recording of data (collecting different types of personal information) from a large number of locations at a city or national level raises serious privacy concerns about data usage. Though different data protection legislations provide guidelines for privacy-aware video surveillance, it

Author statements

All persons who meet authorship criteria are listed as authors, and all authors certify that they have participated sufficiently in the work to take public responsibility for the content, including participation in the concept, design, analysis, writing, or revision of the manuscript. Furthermore, each author certifies that this material or similar material has not been and will not be submitted to or published in any other publication before its appearance in the Hong Kong Journal of

Declaration of Competing Interest

The Author(s) declare(s) that there is no conflict of interest.

Shizra Sultan is a PhD student at the Department of Mathematics and Computer Science, Technical University of Denmark. She completed her master’s degree in Computer and Communication Security in 2015 at the National University of Science and Technology, Pakistan. Prior to that, she worked in industry and has experience in application development and system security. Her research concentrates on access control solutions, privacy, and data protection legislation.

References (71)

  • C. Liu et al.

    A novel privacy-preserving method for data publication

    Inf. Sci.

    (2019)
  • M.A. Al-Kahtani et al.

    A model for attribute-based user-role assignment

  • J. An

    Toward global IoT-enabled smart cities interworking using adaptive semantic adapter

    IEEE Internet Things J.

    (2019)
  • Balana, XACML 3.0 implementation,...
  • R. Baran et al.

    A smart camera for the surveillance of vehicles in intelligent transportation systems

    Multimed. Tools Appl.

    (2016)
  • M.S. Barham et al.

    Negotiating privacy preferences in video surveillance systems

    Modern Approaches in Applied Intelligence

    (2011)
  • E. Bertino et al.

    An access control model for video database systems

  • E. Bertino et al.

    A hierarchical access control model for video database systems

    ACM Trans. Inf. Syst.

    (2003)
  • J.-C. Birget et al.

    Hierarchy-Based Access Control in Distributed Environments

    CSE Conference and Workshop Papers.

    (2001)
  • J.-C. Birget et al.

    Hierarchy-based access control in distributed environments

  • S.S. Blackman

    Multiple hypothesis tracking for multiple target tracking

    IEEE Aerosp. Electron. Syst. Mag.

    (2004)
  • C. Campbell

    The entire system is designed to suppress us

    What the Chinese Surveillance State Means for the Rest of the World

    (2019)
  • A. Castiglione

    Hierarchical and shared access control

    IEEE Trans. Inf. Forensics Secur.

    (2016)
  • A. Clarke et al.

    Smartphone-based public health information systems: anonymity, privacy, and intervention

    J. Assoc. Inf. Sci. Technol.

    (2015)
  • M. Drozdowicz et al.

    Semantic access control for privacy management of personal sensing in smart cities

    IEEE Transactions on Emerging Topics in Computing

    (2020)
  • D. Eckhoff et al.

    Privacy in the smart city—applications, technologies, challenges, and solutions

    IEEE Commun. Surv. Tutor.

    (2018)
  • D.F. Ferraiolo et al.

    A role-based access control model and reference implementation within a corporate intranet

    ACM Trans. Inf. Syst. Secur.

    (1999)
  • A.M. Froomkin

    Regulating mass surveillance as privacy pollution: learning from environmental impact statements

    Univ. Ill. Law. Rev.

    (2015)
  • M. Garlie

    California Consumer Privacy Act of 2018: A Study of Compliance and Associated Risk

    PhD thesis

    (2020)
  • R. Girshick

    Region-Based Convolutional Networks for Accurate Object Detection and Segmentation

    IEEE Transactions on Pattern Analysis and Machine Intelligence

    (2016)
  • Heilbron, Fabian & Escorcia, Victor & Ghanem, Bernard & Niebles, Juan Carlos. (2015). ActivityNet: A Large-Scale Video...
  • S. Hong

    Decoupled deep neural network for semi-supervised semantic segmentation

    Advances in Neural Information Processing Systems 28 (NIPS 2015)

    (2015)
  • L. Hu et al.

    Towards an approach of semantic access control for cloud computing

  • K.L. Huang et al.

    Towards privacy-sensitive participatory sensing

  • S. Krebs et al.

    A survey on leveraging deep neural networks for object tracking

  • A. Krizhevsky et al.

    ImageNet classification with deep convolutional neural networks

    Commun. ACM

    (2017)
  • H. Kuehne et al.

    HMDB: a large video database for human motion recognition

  • T.Y. Lin et al.

    Microsoft COCO: common objects in context

    (2014)
  • W. Liu et al.

    SSD: single shot multibox detector

    European Conference on Computer Vision

    (2016)
  • D. Lowe

    Distinctive image features from scale-invariant keypoints

    Int. J. Comput. Vis.

    (2004)
  • Q. Mahmood Rajpoot et al.

    Attributes enhanced role-based access control model

  • A. Martínez-Ballesté et al.

    Design and implementation of a secure and trustworthy platform for privacy-aware video surveillance

    Int. J. Inf. Secur.

    (2018)
  • L. Matthies et al.

    Kalman filter-based algorithms for estimating depth from image sequences

    Int. J. Comput. Vis.

    (1989)
  • OASIS XACML,...
  • Optical Character Recognition,...