Elsevier

Neurocomputing

Volume 100, 16 January 2013, Pages 51-57
Transfer learning for pedestrian detection

https://doi.org/10.1016/j.neucom.2011.12.043

Abstract

Most existing pedestrian detection methods work well only when the following assumption is satisfied: the features extracted from the training dataset and the testing dataset have very similar distributions in the feature space. In practice, however, this assumption does not hold because of scene complexity and variation. In this paper, a new method based on transfer learning is proposed for detecting pedestrians in various scenes. The proposed method employs two strategies to improve pedestrian detection performance. First, a new sample screening method based on manifold learning is proposed. The basic idea is to choose samples from the training set that are likely to be similar to samples from the unseen scene, and then merge the selected samples into the unseen set. Second, a new classification model based on transfer learning is proposed. The advantage of this model is that only a small number of samples from the unseen scenes are needed. Most of the training samples, up to 90% of the total, still come from the training scene. Compared to traditional pedestrian detection methods, the proposed algorithm can adapt to different scenes. Experiments on two pedestrian detection benchmark datasets, DC and NICTA, showed that the method achieves better performance than previous methods.

Introduction

A major assumption in many traditional pedestrian detection methods is that the training dataset and the test dataset are similar, that is, the training images are taken from scenes similar to those of the testing images. However, in real pedestrian detection applications, the scene may change and become more complex because of variations in lighting, weather, background and pedestrian clothing. Therefore, traditional pedestrian detection methods may not work well when the unseen scene differs from the training set. How to find what different scenes have in common, and how to design pedestrian detection algorithms specifically for different scenes, remains an open problem in current research.

Although the training data and the unseen data have different distributions, the two datasets are correlated because they share the same feature space. Specifically, the training set contains some samples that are similar to those in the unseen set. If the highly similar samples in the training set are transferred to the unseen scene, they can expand the unseen set. Traditional image similarity measures typically adopt the Euclidean distance. However, because the feature dimension of a pedestrian image is usually very high, it is difficult to measure the similarity between samples accurately using the Euclidean distance. Manifold learning methods from the machine learning field offer feature dimension reduction and data visualization. They can find a low-dimensional manifold in the high-dimensional space and compute the corresponding embedding mapping. Therefore, manifold learning is used to visualize the samples in the training set and the unseen set, and then to select the samples that are highly similar across the two datasets. This is the most intuitive result obtainable at the feature level.
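To make the idea of manifold-based similarity concrete, the following is a minimal sketch and not the screening procedure of the paper. It assumes samples are plain feature tuples, and it computes only the geodesic (shortest-path) distance over a k-nearest-neighbour graph, which is the quantity Isomap tries to preserve; the low-dimensional embedding step of Isomap is omitted. The function names `knn_graph`, `geodesic_from` and `screen_samples`, and the parameters `k` and `n_select`, are hypothetical.

```python
import heapq
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_graph(points, k):
    """Symmetric k-nearest-neighbour graph: node -> {neighbour: edge length}."""
    graph = {i: {} for i in range(len(points))}
    for i, p in enumerate(points):
        dists = sorted((euclidean(p, q), j) for j, q in enumerate(points) if j != i)
        for d, j in dists[:k]:
            graph[i][j] = d
            graph[j][i] = d
    return graph


def geodesic_from(graph, source):
    """Dijkstra shortest-path lengths from `source` along graph edges."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def screen_samples(train, unseen, k=3, n_select=2):
    """Indices of the n_select training samples that are closest, in
    geodesic distance, to any sample of the unseen set (assumed criterion)."""
    points = list(train) + list(unseen)
    graph = knn_graph(points, k)
    best = [math.inf] * len(train)
    for u in range(len(train), len(points)):
        dist = geodesic_from(graph, u)
        for i in range(len(train)):
            best[i] = min(best[i], dist.get(i, math.inf))
    ranked = sorted(range(len(train)), key=lambda i: best[i])
    return ranked[:n_select]


# Toy 2-D features: three training samples near the unseen ones, one far away.
train = [(0, 0), (1, 0), (2, 0), (10, 10)]
unseen = [(2.5, 0), (3, 0)]
selected = screen_samples(train, unseen, k=2, n_select=2)
print(selected)  # [2, 1]
```

In this toy run the two training samples nearest to the unseen samples along the graph are selected, and the distant sample at (10, 10) is ranked last. A full Isomap implementation would additionally embed the geodesic distance matrix with multidimensional scaling.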

To further find the knowledge shared among the samples from the two datasets, transfer learning is used. According to transfer learning, even though data from the source domain cannot be applied directly, a certain portion of the data is still useful for learning in the target domain. Related work trains a transferred classifier using a small number of samples from the target domain. However, transfer learning cannot be used directly for detecting pedestrians because of the complexity of the scenes and the large variation between them.
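As an illustration of how a boosting-style transfer classifier can reuse source-domain data, the sketch below shows one reweighting step in the style of TrAdaBoost (Dai et al.), a well-known instance-transfer algorithm. It is a stand-in chosen for illustration and is not the classifier proposed in this paper, whose details are not given here. The function name `tradaboost_update` and its parameters are hypothetical, and the clamping of the error is an added safeguard.

```python
import math


def tradaboost_update(weights, errors, n_source, n_rounds):
    """One TrAdaBoost-style reweighting step.

    weights:  current sample weights, source samples first, then target.
    errors:   per-sample 0/1 loss |h(x) - y| of the current weak classifier.
    n_source: number of source-domain samples at the front of the lists.
    n_rounds: total number of boosting rounds planned.
    """
    target_w = weights[n_source:]
    target_e = errors[n_source:]
    # Weighted error of the weak classifier, measured on target samples only.
    eps = sum(w * e for w, e in zip(target_w, target_e)) / sum(target_w)
    # Clamp (added safeguard) so that beta_t stays positive and below 1.
    eps = min(max(eps, 1e-10), 0.499)
    beta_t = eps / (1.0 - eps)
    beta = 1.0 / (1.0 + math.sqrt(2.0 * math.log(n_source) / n_rounds))

    new = []
    for i, (w, e) in enumerate(zip(weights, errors)):
        if i < n_source:
            new.append(w * beta ** e)        # misclassified source sample: shrink
        else:
            new.append(w * beta_t ** (-e))   # misclassified target sample: grow
    total = sum(new)
    return [w / total for w in new]


# Two source samples followed by three target samples, uniform start.
weights = [0.2] * 5
errors = [1, 0, 1, 0, 0]   # first source and first target sample misclassified
updated = tradaboost_update(weights, errors, n_source=2, n_rounds=10)
```

The design point is the asymmetry: a misclassified source sample is treated as dissimilar to the target domain and loses influence, while a misclassified target sample gains weight as in ordinary AdaBoost.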

In this paper, a new method is proposed for pedestrian detection using modified transfer learning. When only a small number of samples are available from the testing dataset, it is difficult to train a reliable classification model. Although the two datasets have different distributions, they lie in the same feature space, so there is a certain similarity between the training dataset and the testing dataset. The motivation of our work is as follows. We select samples in the training set that are similar to the samples in the unseen scene through manifold learning, and then use as many training-set samples as possible to assist pedestrian detection in the unseen scene through transfer learning.

There are two main contributions of this paper. First, a new sample screening method based on the Isomap algorithm is proposed. The method visualizes the samples in the training scene and the unseen scene, and then selects several samples in the training set that are very similar to the samples in the unseen set. The selected samples are merged into the unseen set, which expands the unseen set and provides a strong basis for the subsequent classifier. Second, a new classification model based on transfer learning is proposed. This algorithm uses only a small number of samples from the testing scene to assist the construction of an effective classification model. Its advantage is that it transfers as much knowledge as possible from the training scene to the testing scene for pedestrian detection. Compared to traditional methods, the method is suited to pedestrian detection in unseen scenes.

Two significantly different datasets, DC and NICTA, are used as the training set and the testing set, respectively. Our experiments show that, using only a small number of samples from the testing dataset, the proposed method obtains better performance in detecting pedestrians in the NICTA dataset. The rest of the paper is organized as follows. In Section 2, related work is discussed. In Section 3, we describe the framework of the proposed pedestrian detection method, including sample screening and classification. In Section 4, the experimental results are presented. Finally, we give concluding remarks in Section 5.

Section snippets

Related works

Classification-based pedestrian detection is the mainstream approach in this field. In practice, a classifier trained on the training set cannot achieve satisfactory results because of scene complexity and variation. Previous methods can be divided into three categories: (1) The first kind of method trains a reliable classifier using the training dataset and detects pedestrians in new scenes directly [1], [2], [3], [4], [5], [6], [7], [8], [9]. This kind of method can achieve

Problem definition

Let $D_a = \{(x_i^a, y_i^a)\}_{i=1}^{n}$ be the training dataset from the training scenes; it also serves as the auxiliary dataset. $D_s = \{(x_i^s, y_i^s)\}_{i=1}^{m}$ is the training dataset from the unseen scenes; it contains only a small number of labeled samples, which are not sufficient to train a classifier alone. $D_s$ and $D_a$ may come from different distributions. $D_t = \{x_i^t\}_{i=1}^{k}$ is the test dataset from the unseen scenes. $n$, $m$ and $k$ are the sizes of the respective datasets. $D_s$ and $D_t$ come from the same scene and are in the same feature space, while $D_a$

Datasets

The DC and NICTA datasets were used to validate the proposed method. Since these two datasets are taken from different scenes, the former was used as the training set and the latter as the testing scenes. The datasets were manually labeled, with pedestrians marked by rectangles in the videos. The dataset was equally split into five subsets. Three of them were used for training and the remaining two were used for testing. Each subset consists of 4800 pedestrian samples, which were

Conclusions

This paper presented a transfer learning based method for detecting pedestrians in changing scenes. The contributions of this paper include: (1) A new framework for detecting pedestrians that combines a sample screening algorithm based on manifold learning with classification based on transfer learning. (2) The sample screening algorithm uses the Isomap algorithm to select useful samples from the training scenes to extend the training set. (3) ITLAdaBoost was employed to solve the problem of training set and testing

Acknowledgment

The presented research work is in part supported by the National Basic Research Program of China (973 Program) (Grant no. 2011CB707000), and the National Natural Science Foundation of China (Grant nos. 61125106, 60972103).

References (31)

  • M. Enzweiler et al., Monocular Pedestrian Detection: Survey and Experiments, IEEE Trans. Pattern Anal. Mach. Intell. (2009)
  • G. Overett, L. Petersson, N. Brewer, L. Andersson, N. Pettersson, A New Pedestrian Dataset for Supervised Learning, in:...
  • X.B. Cao et al., A low-cost pedestrian detection system with a single optical camera, IEEE Trans. Intell. Transp. Syst. (2008)
  • Y.P. Huang et al., Stereovision-based object segmentation for automotive applications, EURASIP J. Appl. Signal Processing (2005)
  • M. Bertozzi, A. Broggi, A. Lasagni, M. Del Rose, Infrared stereo vision-based pedestrian detection, in: Proceedings of...
  • D.M. Gavrila, J. Geibel, Shape-based pedestrian detection and tracking, in: Proceedings of the IEEE Intelligent...
  • G.M.A. Sessler, T. Martoyo, F.K. Jondral, RBF based multiuser detectors for UTRA-TDD, in: Proceedings of the IEEE...
  • F.L. Xu et al., Pedestrian Detection and Tracking with Night Vision, IEEE Transactions on Intelligent Transportation Systems (2005)
  • S. Maji, A. Berg, J. Malik, Classification using intersection kernels Support Vector Machines is efficient, in:...
  • Z. Wang, X.B. Cao, Rapid classification based pedestrian detection in changing scenes, in: Proceedings of the IEEE...
  • D.M. Gavrila et al., Multi-Cue Pedestrian Detection and Tracking from a Moving Vehicle, Int. J. Comput. Vision (2007)
  • S. Munder et al., An Experimental Study on Pedestrian Classification, IEEE Trans. Pattern Anal. Mach. Intell. (2006)
  • V.D. Shet, J. Neumann, V. Ramesh, L.S. Davis, Bilattice-Based Logical Reasoning for Human Detection, in: Proceedings of...
  • L. Zhang, B. Wu, R. Nevatia, Detection and Tracking of Multiple Humans with Extensive Pose Articulation, in:...
  • C. Papageorgiou et al., A Trainable System for Object Detection, Int. J. Comput. Vision (2000)

Xianbin Cao received the B.S. degree in computer science and the M.S. degree in information and system from Anhui University, Hefei, China, in 1990 and 1993, respectively, and the Ph.D. degree in intelligent information processing from the University of Science and Technology of China (USTC), Hefei, in 1996. He has been with the USTC since 1996, where he was an Associate Professor with the Department of Computer Science and Technology from 1999 to 2005 and, since 2005, has been a Professor and the Administrative Director of the Anhui Province Key Laboratory of Software in Computing and Communication, Hefei. Since 2009 he has also been a professor in the School of Electronic and Information Engineering, Beihang University, PR China, where he is the director of the lab of intelligent transportation system. His current research interests include intelligent transportation systems, airspace transportation management and intelligent computation. He has published more than 100 books, book chapters, and papers in these areas since 1993.

Zhong Wang is a Ph.D. candidate in the Department of Computer Science and Technology, the University of Science and Technology. He received the B.S. degree in computer science from AnQing Teachers College, China, in 2006. His research interests include machine learning, computer vision, and pedestrian detection. He is a student member of the IEEE.

Pingkun Yan received the B.Eng. degree in electronics engineering and information science from the University of Science and Technology of China, Hefei, China and the Ph.D. degree in electrical and computer engineering from the National University of Singapore, Singapore. He is a full professor with the Center for OPTical IMagery Analysis and Learning (OPTIMAL), State Key Laboratory of Transient Optics and Photonics, Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xi'an 710119, Shaanxi, PR China. He was a Senior Member of the Research Staff of Philips Research North America, Briarcliff Manor, NY. Before that, he worked as a Research Associate with the Computer Vision Laboratory, University of Central Florida, Orlando, FL. His research interests include computer vision, pattern recognition, machine learning, and their applications. He is a senior member of the IEEE.

Xuelong Li is a full professor with the Center for OPTical IMagery Analysis and Learning (OPTIMAL), State Key Laboratory of Transient Optics and Photonics, Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xi'an 710119, Shaanxi, PR China.
