DOI: 10.1145/3290589.3290601
research-article

Pedestrian Detection based on Reduced High-Dimensional Distinctive Feature using Deep Neural Network

Published: 12 October 2018

Abstract

Pedestrian detection is an important research topic because of its diverse applications in safety systems. The distinctiveness of visual words is determined in three steps. First, images are represented by their co-occurrence matrices, which are vectorized using the bag of visual words (BoVW) framework. Second, weights are calculated from the histograms of visual words of each class. Finally, the computed weights are applied to the testing image set as the distinctiveness of the visual words. A fully connected Multi-Layer Perceptron (MLP) is used as the classification method in our system, but its performance is limited. This paper introduces a combination of principal component analysis (PCA), guided filtering, and a deep learning architecture for visual data classification. As a mature dimension reduction technique, PCA reduces the redundancy of multi-dimensional information. The proposed method is compared with the MLP on the Caltech-256 dataset, using the following classes: pedestrians, cars, motorbikes, and airplanes. The experimental results show that the proposed method outperforms current methods in predicting pedestrians and transportation objects. It yields an average accuracy of 96.50%, approximately 8.5% higher than the compared state-of-the-art method.
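
The three weighting steps in the abstract can be illustrated with a minimal sketch. The abstract does not state the weighting formula, so the entropy-based weight below is an assumption chosen only for illustration, not the authors' scheme. The toy 2-D descriptors and the three-word codebook are likewise hypothetical stand-ins for real local features and a learned vocabulary.

```python
import math
from collections import Counter


def nearest_word(descriptor, codebook):
    """Index of the codebook entry closest to the descriptor (squared Euclidean)."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((d - c) ** 2 for d, c in zip(descriptor, codebook[k])),
    )


def bovw_histogram(descriptors, codebook):
    """Step 1: L1-normalised histogram of visual-word occurrences for one image."""
    counts = Counter(nearest_word(d, codebook) for d in descriptors)
    total = sum(counts.values()) or 1
    return [counts.get(k, 0) / total for k in range(len(codebook))]


def class_weights(histograms, labels):
    """Step 2: one weight per visual word, computed from per-class histograms.

    ASSUMPTION: weight = 1 - normalised entropy of the word's distribution
    over classes. A word used by a single class gets weight 1; a word spread
    evenly over all classes gets weight 0.
    """
    classes = sorted(set(labels))
    n_words = len(histograms[0])
    mean = {}
    for c in classes:
        rows = [h for h, y in zip(histograms, labels) if y == c]
        mean[c] = [sum(col) / len(rows) for col in zip(*rows)]
    weights = []
    for k in range(n_words):
        mass = [mean[c][k] for c in classes]
        total = sum(mass)
        if total == 0 or len(classes) < 2:
            weights.append(0.0)
            continue
        probs = [m / total for m in mass if m > 0]
        entropy = -sum(p * math.log(p) for p in probs)
        weights.append(1.0 - entropy / math.log(len(classes)))
    return weights


def apply_weights(histogram, weights):
    """Step 3: scale a (test) histogram by the learned word weights."""
    return [h * w for h, w in zip(histogram, weights)]


# Hypothetical toy data: word 2 is shared by both classes, words 0 and 1 are not.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
pedestrian = bovw_histogram([(0.1, 0.0), (0.0, 0.1), (0.1, 0.9), (0.0, 1.0)], codebook)
car = bovw_histogram([(0.9, 0.0), (1.0, 0.1), (0.0, 0.9), (0.1, 1.0)], codebook)
weights = class_weights([pedestrian, car], ["pedestrian", "car"])
print(weights)
print(apply_weights(pedestrian, weights))
```

In this toy setup the shared word is suppressed and the class-specific words are kept, which is the effect a distinctiveness weight is meant to have. The weighted vectors would then be passed on to dimension reduction and a classifier, which this sketch leaves out.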

    Published In

    SSIP '18: Proceedings of the 2018 International Conference on Sensors, Signal and Image Processing
    October 2018
    88 pages
    ISBN:9781450366205
    DOI:10.1145/3290589
    Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

    In-Cooperation

    • CTU: Czech Technical University in Prague

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Bag of visual words
    2. Multi-Layer Perceptron
    3. Pedestrian detection
    4. Principal Component Analysis
    5. Scale-invariant feature transform
    6. Weighting scheme

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    SSIP 2018
