Elsevier

Expert Systems with Applications

Volume 72, 15 April 2017, Pages 327-334
Deep learning for decentralized parking lot occupancy detection

https://doi.org/10.1016/j.eswa.2016.10.055

Highlights

  • We propose an effective CNN architecture for visual parking occupancy detection.

  • The CNN architecture is small enough to run on smart cameras.

  • The proposed solution performs and generalizes better than other SotA approaches.

  • We provide a new training/validation dataset for parking occupancy detection.

Abstract

A smart camera is a vision system capable of extracting application-specific information from the captured images. The paper proposes a decentralized and efficient solution for visual parking lot occupancy detection based on a deep Convolutional Neural Network (CNN) specifically designed for smart cameras. This solution is compared with state-of-the-art approaches using two visual datasets: PKLot and CNRPark-EXT. The former is an existing dataset that allowed us to compare exhaustively with previous works. The latter was created in the context of this research, accumulating data across various seasons of the year, to test our approach in particularly challenging situations exhibiting occlusions and diverse, difficult viewpoints. This dataset is publicly available to the scientific community and is another contribution of our research. Our experiments show that our solution outperforms, and generalizes better than, the best performing approaches on both datasets. The performance of our proposed CNN architecture on the parking lot occupancy detection task is comparable to that of the well-known AlexNet, which is three orders of magnitude larger.

Introduction

Recently there has been a growing interest in developing smart camera solutions able to detect parking lot occupancy. The approach that we propose performs this task in real-time directly on smart cameras, without using a central server. It is a decentralized, effective, efficient, and scalable approach, based on deep learning techniques (Bengio, 2009). It relies on a deep Convolutional Neural Network (CNN) specifically designed to be executed on smart cameras.

The clear advantages of decentralization are the reduction of communication overhead and the elimination of the computing bottleneck at a central server. As a consequence, the system scales better when the number of monitored parking spaces increases.

We believe that the proposed approach is also advantageous with respect to those using ground sensors (e.g. magnetic sensors) placed on every parking space. Indeed, a single smart camera can simultaneously monitor several parking spaces at a cost that is significantly lower than the cost required to install and maintain sensors in every parking space.

The usage of video to monitor occupancy of parking lots is not new, see for instance (de Almeida, Oliveira, Britto, Silva, Koerich, 2015, Dan, del Postigo, Torres, Menéndez, 2015, Wu, Huang, Wang, Chiu, Chen, 2007). However, vacant parking space detection using only visual information is still an open problem. Many techniques using video cameras are tailored and fine-tuned to specific contexts and scenarios. As a result, they cannot be easily generalized, and even adapting one solution to a different parking lot is not easy.

Thanks to the use of a deep CNN, the proposed solution is robust to disturbances created by partial occlusions, by the presence of shadows, and by variations in light conditions. Moreover, it exhibits a good generalization property: the quality of the results is maintained when we consider parking lots and scenarios significantly different from the ones used during the CNN training phase. Furthermore, the classification phase needs fewer computational resources than the training phase, making it possible to run it on distributed, embedded, and low computing-power frameworks.

To validate our approach, we built a dataset, called CNRPark-EXT, collecting images from the parking lots in the experimentation area, which is the campus of the National Research Council (CNR) in Pisa.

The images in the CNRPark-EXT dataset are taken by 9 smart cameras with different points of view and different perspectives, on different days with different weather and light conditions, and include occlusion and shadow situations that make the occupancy detection task more challenging. The dataset has been exhaustively, manually annotated, and is available to the scientific community. More details about the CNRPark-EXT dataset will be given in Section 4.

In addition, we tested our method on PKLot, a dataset for parking lot occupancy detection, so as to be able to compare our method against the state-of-the-art methods discussed in de Almeida et al. (2015).

The usage of datasets coming from different parking lots and scenarios allowed us to test the generalization property of our approach. To this end, we trained the CNN on one scenario and tested it in a completely different one. To the best of our knowledge, there are no other experiments where this type of generalization property has been tested.
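
The cross-scenario protocol described above (train on one parking scenario, test on a completely different one) can be sketched as follows. This is only a minimal illustration: the `train_threshold` stand-in, the scenario names `lot_a` and `lot_b`, and the two-value feature vectors are hypothetical toy placeholders, not the paper's CNN or its datasets.

```python
from statistics import mean
from typing import Callable, Dict, List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (features, label); 1 = occupied, 0 = vacant
Classifier = Callable[[Sequence[float]], int]


def accuracy(classifier: Classifier, samples: List[Sample]) -> float:
    """Fraction of samples whose predicted label matches the annotation."""
    return sum(classifier(x) == y for x, y in samples) / len(samples)


def cross_scenario_accuracy(
    train_fn: Callable[[List[Sample]], Classifier],
    scenarios: Dict[str, List[Sample]],
) -> Dict[Tuple[str, str], float]:
    """Train on each scenario, then test on every *other* scenario."""
    results = {}
    for train_name, train_samples in scenarios.items():
        classifier = train_fn(train_samples)
        for test_name, test_samples in scenarios.items():
            if test_name != train_name:
                results[(train_name, test_name)] = accuracy(classifier, test_samples)
    return results


def train_threshold(samples: List[Sample]) -> Classifier:
    """Toy stand-in for CNN training (assumption): threshold on mean feature value."""
    occupied = mean(mean(x) for x, y in samples if y == 1)
    vacant = mean(mean(x) for x, y in samples if y == 0)
    threshold = (occupied + vacant) / 2
    return lambda x: int(mean(x) > threshold)


# Synthetic toy data, for illustration only (not taken from any real dataset).
TOY_SCENARIOS = {
    "lot_a": [([0.9, 0.8], 1), ([0.1, 0.2], 0)],
    "lot_b": [([0.7, 0.9], 1), ([0.2, 0.3], 0)],
}

RESULTS = cross_scenario_accuracy(train_threshold, TOY_SCENARIOS)
```

Each key of `RESULTS` is a (training scenario, test scenario) pair, so the same-scenario case is never scored; replacing `train_threshold` with a real training routine would leave the protocol unchanged.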

The paper is organized as follows. Section 2 introduces other works related to our proposal. Section 3 describes the convolutional neural network employed in the classification process. Section 4 presents the datasets used to evaluate and compare our approach. Section 5 discusses the experiments and the obtained results. Section 6 discusses how the framework was deployed in a real scenario and gives an overview of the overall system. Finally, Section 7 concludes the paper.

Section snippets

Related work

To deal with the problem of light changes, Tsai, Hsieh, and Fan (2007) trained a Bayesian classifier to verify the detection of vehicles based on corners, edges, and wavelet features. Huang, Tai, and Wang (2013) used a Bayesian hierarchical framework to build a vacant parking space detection system that operates day and night based on a 3D model for parking spaces. Similarly, the method presented in Delibaltov, Wu, Loce, Bernal et al. (2013) models every parking space as a volume in the 3D

Deep embedded convolutional neural networks for occupancy detection

A very popular deep convolutional neural network, used as a reference in many works, is the so-called AlexNet (Krizhevsky et al., 2012). The AlexNet architecture has 60 million parameters and 650,000 neurons. It is organized into five convolutional layers, some followed by max-pooling layers, and three fully connected layers, the last of which feeds a 1000-way softmax (more details can be found in Krizhevsky et al. (2012)). Using such an architecture directly on a low-computing power device poses a very
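
To make the figure of roughly 60 million parameters concrete, the count can be reproduced with a few lines of arithmetic. The layer shapes below are an assumption taken from the two-GPU AlexNet described in Krizhevsky et al. (2012), where the second, fourth and fifth convolutional layers see only half of the previous layer's channels and the 1000-way output is itself a fully connected layer; they are not given in the text above.

```python
def conv_params(filters: int, kernel: int, in_channels: int) -> int:
    """Weights plus one bias per filter of a square convolution."""
    return filters * (kernel * kernel * in_channels + 1)


def fc_params(inputs: int, outputs: int) -> int:
    """Weights plus one bias per output unit of a fully connected layer."""
    return outputs * (inputs + 1)


def alexnet_parameter_count() -> dict:
    """Per-layer parameter counts for the assumed AlexNet layer shapes."""
    return {
        "conv1": conv_params(96, 11, 3),
        "conv2": conv_params(256, 5, 48),    # grouped: 48 of 96 input channels
        "conv3": conv_params(384, 3, 256),
        "conv4": conv_params(384, 3, 192),   # grouped: 192 of 384 input channels
        "conv5": conv_params(256, 3, 192),   # grouped: 192 of 384 input channels
        "fc6": fc_params(6 * 6 * 256, 4096),
        "fc7": fc_params(4096, 4096),
        "fc8": fc_params(4096, 1000),        # feeds the 1000-way softmax
    }


COUNTS = alexnet_parameter_count()
TOTAL = sum(COUNTS.values())
FC_SHARE = (COUNTS["fc6"] + COUNTS["fc7"] + COUNTS["fc8"]) / TOTAL
```

Under these assumptions the fully connected layers account for the large majority of the parameters, which suggests why a network meant for an embedded device would need to shrink them first.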

Datasets

CNRPark-EXT includes and significantly extends CNRPark (Amato, Carrara, Falchi, Gennaro, & Vairo, 2016), a smaller dataset of roughly 12,000 labeled images, which we also used to perform some of the experiments.

The smaller CNRPark dataset contains images of the parking lot collected on different days of July 2015 from 2 distinct cameras, A and B (see Fig. 2, top row), which were placed so as to have different perspectives and angles of view. The CNRPark dataset is also available for

Evaluation

We used two datasets in our experiments: CNRPark-EXT, the dataset generated by us, and PKLot (de Almeida et al., 2015). The two datasets are significantly different. Besides the fact that they contain pictures taken from different parking lots, it is worth highlighting the following differences:

  • (a) in CNRPark-EXT, parking space masks are non-rotated squares, and images often do not cover precisely or entirely the parking space volume, whereas in PKLot images are extracted using rotated rectangular

Deployment of the proposed solution in a real scenario

The entire framework was deployed in the parking lot of the research campus of the CNR in Pisa as a Smart City application. The monitored parking lot consists of 164 parking spaces, organized in five rows, four of which are composed of about 35 parking spaces each, and one row is composed of 18 parking spaces. Although a single Raspberry Pi equipped with the standard camera module is able to monitor more than 50 parking spaces (i.e. with the given height and distance of the cameras from the

Conclusions

A deep CNN architecture designed to run on embedded systems such as smart cameras is used to classify images of parking spaces as occupied or vacant directly on board the smart camera. In this way, the only information that is sent to a central server for visualization is the binary output of the classification.
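
The idea that only the binary classification output leaves the camera can be sketched as below. This is a hedged illustration rather than the deployed system: the JSON message layout, the `occupancy_report` helper, the camera and space identifiers, and the `classify` callable standing in for the on-board CNN are all hypothetical.

```python
import json
from typing import Callable, Dict, Sequence

Patch = Sequence[float]  # placeholder for the pixels of one parking space


def occupancy_report(
    camera_id: str,
    patches: Dict[str, Patch],
    classify: Callable[[Patch], bool],
) -> str:
    """Classify every patch on the camera and serialise only the verdicts.

    No image data is included in the message: each parking space
    contributes a single boolean (True = occupied, False = vacant).
    """
    statuses = {space_id: bool(classify(patch)) for space_id, patch in patches.items()}
    return json.dumps({"camera": camera_id, "occupied": statuses}, sort_keys=True)


def toy_classifier(patch: Patch) -> bool:
    """Hypothetical stand-in for the CNN: a patch counts as occupied if it is mostly bright."""
    return sum(patch) / len(patch) > 0.5


MESSAGE = occupancy_report(
    "camera_1",
    {"space_01": [0.9, 0.8, 0.7], "space_02": [0.1, 0.0, 0.2]},
    toy_classifier,
)
```

The message size grows with the number of monitored spaces rather than with image resolution, which is the property the decentralized design relies on.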

As a further contribution, we collected and made publicly available CNRPark-EXT, a dataset containing images of a real parking lot taken by nine smart cameras, on different days,

Acknowledgments

This work has been partially funded by the DIITET Department of CNR, in the framework of the “Renewable Energy and ICT for Sustainability Energy” project. We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Tesla K40 GPU used for this research.

References (22)

  • D. Delibaltov et al.

    Parking lot occupancy determination from lamp-post camera images

    Intelligent Transportation Systems (ITSC), 2013 16th International IEEE Conference on

    (2013)