DOI: 10.1145/3301506.3301518

Object Recognition Using Deep Neural Network with Distinctive Features

Published: 29 December 2018 Publication History

Abstract

In this paper, a new object recognition method using a statistically weighted Multi-Layer Perceptron (MLP) is proposed. It uses distinctive visual features computed within the Bag of Visual Words (BoVW) framework. The proposed method has three main steps. First, it represents each image by its co-occurrence matrix, which is vectorized with BoVW to yield distinctive features. Second, it computes weights from the histograms of visual words for each class. Finally, the statistically weighted distinctive features are applied to the test images to determine the object class. The proposed method improves the MLP by introducing weighted visual words, which are extracted by sampling patches from the current image. Four classes from the Caltech 256 dataset, namely pedestrians, cars, motorbikes, and airplanes, are used to compare classification accuracy between an MLP-based artificial neural network (ANN) and the proposed method. The experimental results show that our method outperforms the traditional MLP, yielding an average classification accuracy of 89.60%, approximately 6.3% higher than the compared MLP.
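The abstract describes the pipeline only at a high level, and the exact weighting scheme is not spelled out. The sketch below illustrates the general BoVW-with-class-weights idea under stated assumptions: the codebook is a fixed toy array (the paper would build one by clustering patch descriptors), the weights are a class-frequency-ratio heuristic standing in for the paper's statistical weighting, and a weighted-score classifier stands in for the trained MLP. All names here (`bovw_histogram`, `make_image`, `classify`) are illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 4-word visual codebook (an assumption: the real pipeline would
# cluster SIFT-like patch descriptors to build this vocabulary).
codebook = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])

def bovw_histogram(descriptors):
    """Quantize each descriptor to its nearest codeword; return a normalized histogram."""
    dists = np.linalg.norm(descriptors[:, None, :] - codebook[None, :, :], axis=2)
    words = dists.argmin(axis=1)
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()

def make_image(cls):
    """Synthetic 'image' as 50 noisy patch descriptors; class 0 uses words {0,1}, class 1 uses {2,3}."""
    centers = codebook[:2] if cls == 0 else codebook[2:]
    idx = rng.integers(0, 2, size=50)
    return centers[idx] + rng.normal(0.0, 0.05, size=(50, 2))

# Training set: three synthetic images per class.
train = [(make_image(c), c) for c in (0, 0, 0, 1, 1, 1)]
hists = np.array([bovw_histogram(img) for img, _ in train])
labels = np.array([c for _, c in train])

# Per-class weights from the word histograms: words over-represented in a
# class relative to the overall corpus get a large weight (a heuristic
# stand-in for the paper's statistical weighting scheme).
overall = hists.mean(axis=0)
class_means = np.array([hists[labels == c].mean(axis=0) for c in (0, 1)])
weights = class_means / (overall + 1e-9)

def classify(img):
    """Score the weighted histogram against each class; pick the best class."""
    h = bovw_histogram(img)
    return int((weights @ h).argmax())
```

In the paper these weighted word features feed an MLP rather than the direct weighted-score rule used here; the sketch only shows how class-specific weights derived from visual-word histograms can make the features discriminative.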

    Published In

    ICVIP '18: Proceedings of the 2018 2nd International Conference on Video and Image Processing
    December 2018
    252 pages
    ISBN:9781450366137
    DOI:10.1145/3301506

    In-Cooperation

    • Kyoto University: Kyoto University
    • TU: Tianjin University

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. Object recognition
    2. bag of visual words
    3. multi-layer perceptron
    4. scale-invariant feature transform
    5. weighting scheme

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    ICVIP 2018

