Affine Stable Characteristic based sample expansion for object detection

Authors:

Yongdong Zhang,

Shouxun Lin

CIVR '10: Proceedings of the ACM International Conference on Image and Video Retrieval

Pages 422 - 429

https://doi.org/10.1145/1816041.1816103

Published: 05 July 2010

Abstract

Generating a better object model from automatically expanded samples is an effective way to improve object detection performance. However, most existing methods either work poorly when the corpus contains few relevant images, or produce redundant features and reduce detection speed. In this paper, we propose a novel method called Affine Stable Characteristic that generates an object feature model from only one object sample. By integrating affine simulation with stable characteristic mining, it generates a compact and informative object model that is highly robust to viewpoint and scale transformations. For characteristic mining, two new notions, Global Stability and Local Stability, are introduced to measure the robustness of each object feature from complementary hierarchies. The two are then combined to generate the final object feature model. Experiments show that our method can detect objects under various geometric and photometric transformations while requiring only one sample image. On a compiled dataset composed of many well-known test sets, detection accuracy improves by 35.8% over traditional methods at rapid online speed. The proposed approach also generalizes well to other content analysis tasks.
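The idea of pairing affine simulation with stability mining can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define Global Stability or Local Stability, so the sketch substitutes a plain repeat rate (the fraction of simulated views in which a feature is found again). The tilt/rotation parameterization, the function names, and the `min_score` threshold are all assumptions made for illustration, and feature detection and matching are assumed to have been done elsewhere.

```python
import math


def affine_matrix(tilt, phi_deg):
    """Return a 2x2 matrix that rotates by phi_deg, then compresses x by 1/tilt.

    This tilt/rotation form is an assumed way to simulate a viewpoint change.
    """
    phi = math.radians(phi_deg)
    c, s = math.cos(phi), math.sin(phi)
    return ((c / tilt, -s / tilt), (s, c))


def warp_point(matrix, point):
    """Apply a 2x2 matrix to an (x, y) point."""
    (a, b), (c, d) = matrix
    x, y = point
    return (a * x + b * y, c * x + d * y)


def repeat_rate(redetected_per_view, feature_ids):
    """Fraction of simulated views in which each feature was found again.

    redetected_per_view holds one set of feature ids per simulated view.
    This is a stand-in score, not the paper's stability measures.
    """
    n_views = len(redetected_per_view)
    return {
        f: sum(f in view for view in redetected_per_view) / n_views
        for f in feature_ids
    }


def select_stable(scores, min_score):
    """Keep only features whose score reaches the (hypothetical) threshold."""
    return sorted(f for f, s in scores.items() if s >= min_score)


# Toy data: three features from the single sample image, four simulated views.
redetected = [{0, 1, 2}, {0, 2}, {0}, {0, 2}]
scores = repeat_rate(redetected, feature_ids=[0, 1, 2])
compact_model = select_stable(scores, min_score=0.5)  # keeps features 0 and 2
```

Filtering by such a score is one way a model built from a single sample could stay compact: features that rarely survive the simulated viewpoint changes are dropped before online detection.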

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Object recognition
2. Information systems
  1. Information retrieval
    1. Retrieval models and ranking

Published In

CIVR '10: Proceedings of the ACM International Conference on Image and Video Retrieval

July 2010

492 pages

ISBN:9781450301176

DOI:10.1145/1816041

Conference Chairs:
Shipeng Li (Microsoft Research Asia, China)
Xinbo Gao (Xidian University, China)
Nicu Sebe (University of Trento, Italy)

Copyright © 2010 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

In-Cooperation

SIGIR: ACM Special Interest Group on Information Retrieval

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

National Natural Science Foundation of China
Beijing New Star Project on Science & Technology
Ministry of Science and Technology of the People's Republic of China

Conference

CIVR '10: International Conference on Image and Video Retrieval

Sponsor: SIGMM

July 5 - 7, 2010

Xi'an, China

Cited By

Tang S, Gao K, Gu X, Yan C, Zhang Y (2022). High-Throughput Content-Based Video Analysis Technologies. Journal of Engineering Studies 06:03 (294-306). Online publication date: 13-Oct-2022.
https://doi.org/10.3724/SP.J.1224.2014.00294
Dai J, Li C, Zuo Y, Ai H (2020). An OSM Data-Driven Method for Road-Positive Sample Creation. Remote Sensing 12:21 (3612). Online publication date: 3-Nov-2020.
https://doi.org/10.3390/rs12213612
Huang J, Li B, Zhu J, Chen J (2017). Age classification with deep learning face representation. Multimedia Tools and Applications 76:19 (20231-20247). Online publication date: 1-Oct-2017.
https://dl.acm.org/doi/10.1007/s11042-017-4646-5
Zhang L, Gao K (2016). Visual homograph. Neurocomputing 208:C (342-349). Online publication date: 5-Oct-2016.
https://dl.acm.org/doi/10.1016/j.neucom.2016.04.057
Garg R, Aulakh I, Kumari N (2014). A mathematical model to detect hand object from the scene. 2014 IEEE International Advance Computing Conference (IACC) (1133-1136). Online publication date: Feb-2014.
https://doi.org/10.1109/IAdCC.2014.6779485
Gao K, Zhang Y, Zhang D, Lin S (2013). Accurate off-line query expansion for large-scale mobile visual search. Signal Processing 93:8 (2305-2315). Online publication date: 1-Aug-2013.
https://dl.acm.org/doi/10.1016/j.sigpro.2012.10.011
Ke Gao, Yongdong Zhang, Ping Luo, Wei Zhang, Junhai Xia, Shouxun Lin (2012). Visual stem mapping and Geometric Tense coding for Augmented Visual Vocabulary. 2012 IEEE Conference on Computer Vision and Pattern Recognition (3234-3241). Online publication date: Jun-2012.
https://doi.org/10.1109/CVPR.2012.6248059
