ABSTRACT
In recent years, computer vision research has attracted wide interest. However, the image data available for training is often limited, so effective methods are needed to expand datasets from existing images. In this paper, we study methods for generating additional training data from existing datasets and compare the performance of detectors trained on datasets produced by each method. The first method performs sampling based on the statistical properties of feature descriptors: for every feature, we assume an underlying probability density function (PDF) exists, approximate that PDF from the existing training examples, and sample new training examples from the approximation. The second method simply expands the existing dataset by flipping each training example along its axis of symmetry. We use the Locally Adaptive Regression Kernel (LARK) feature because it is robust to illumination changes and noise. Our experimental results demonstrate that an expanded training dataset is not always preferable, even when the expanded dataset includes all of the original training data.
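The two expansion strategies described above can be sketched in a few lines. This is a minimal illustration, not the paper's actual implementation: it assumes each feature dimension is modeled as an independent Gaussian fitted to the existing descriptors (one simple choice of approximated PDF), and that training examples are images whose axis of symmetry is vertical. The function names `sample_new_examples` and `flip_examples` are hypothetical.

```python
import numpy as np

def sample_new_examples(features, n_new, rng=None):
    """Approximate a PDF for the feature descriptors and draw new examples.

    Assumption: each dimension is an independent Gaussian, so the
    approximated PDF is fully described by per-dimension mean and std.
    `features` is an (n_examples, n_dims) array of descriptors.
    """
    rng = np.random.default_rng(rng)
    mu = features.mean(axis=0)                 # per-dimension mean
    sigma = features.std(axis=0, ddof=1)       # per-dimension sample std
    return rng.normal(mu, sigma, size=(n_new, features.shape[1]))

def flip_examples(images):
    """Mirror each training image about its vertical symmetry axis.

    `images` is an (n, height, width) array; the flip reverses columns.
    """
    return images[:, :, ::-1]
```

A richer PDF estimate (e.g. kernel density estimation) could replace the per-dimension Gaussian without changing the overall pipeline: fit to the existing examples, then draw as many synthetic descriptors as needed.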
Sampling May Not Always Increase Detector Performance: A Study on Collecting Training Examples