research-article

Active Learning for Large-scale Object Classification: from Exploration to Exploitation

Authors:
Ho-Gyeong Kim (KAIST, Daejeon, South Korea)
Jihyeon Roh (KAIST, Daejeon, South Korea)
Hwaran Lee (KAIST, Daejeon, South Korea)
Geonmin Kim (KAIST, Daejeon, South Korea)
Soo-Young Lee (KAIST, Daejeon, South Korea)

HAI '15: Proceedings of the 3rd International Conference on Human-Agent Interaction, October 2015, Pages 251–254, https://doi.org/10.1145/2814940.2814989

Published: 21 October 2015

ABSTRACT

Information and communication technologies generate data every day at a rapidly increasing rate. However, almost all of the accumulated data are unlabeled, and obtaining labels is expensive and time-consuming. Selecting and labeling only those raw samples expected to be more informative than the rest can improve a model at low cost. This process is called selective sampling, an essential part of active learning. So far, most research has concentrated on classical uncertainty measures for acquiring informative data, which corresponds to the "exploitation" process of learning. However, when the initial labeled dataset is too small or biased, the early-stage model can be unreliable, with a decision boundary over-fitted to the initial data. Moreover, the data obtained by the exploitation strategy may degrade the model further. We therefore introduce an "exploration" strategy alongside the "exploitation" strategy. In this paper, we employ Self-Organizing Maps (SOM), a type of neural network, to estimate and explore the data distribution. For exploitation, margin sampling is applied to the classifier, a neural network with a softmax output layer. The effectiveness of the proposed methods is demonstrated on the ILSVRC-2011 image classification task, using features extracted from well-trained Convolutional Neural Networks (CNN). Active learning with the exploration strategy shows its potential by stabilizing the early-stage model and reducing the classification error rate, ultimately yielding a high-quality model.
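
The two selection strategies named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify how the SOM drives exploration, so the rule used here (prefer unlabeled samples whose best-matching SOM node holds the fewest labeled samples) is an assumption, as are all function names, the grid size, and the learning-rate and neighborhood schedules. Margin sampling follows its standard definition: query the samples with the smallest gap between the two highest softmax probabilities.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def margin_sampling(probs, k):
    """Exploitation: indices of the k samples with the smallest margin
    between the highest and second-highest class probabilities."""
    top2 = np.sort(probs, axis=1)[:, -2:]  # columns: [second best, best]
    margins = top2[:, 1] - top2[:, 0]
    return np.argsort(margins, kind="stable")[:k]


def train_som(x, grid=(4, 4), epochs=10, lr0=0.5, sigma0=1.5, seed=0):
    """Tiny online SOM; all hyperparameters are illustrative assumptions.
    Requires at least rows * cols samples for the initialization."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)],
                      dtype=float)
    # Initialize node weights from randomly chosen data points.
    weights = x[rng.choice(len(x), rows * cols, replace=False)].astype(float)
    steps, t = epochs * len(x), 0
    for _ in range(epochs):
        for i in rng.permutation(len(x)):
            frac = t / steps
            lr = lr0 * (1.0 - frac)                  # linearly decaying rate
            sigma = sigma0 * (1.0 - frac) + 1e-3     # shrinking neighborhood
            bmu = np.argmin(((weights - x[i]) ** 2).sum(axis=1))
            d2 = ((coords - coords[bmu]) ** 2).sum(axis=1)
            h = np.exp(-d2 / (2.0 * sigma ** 2))     # Gaussian neighborhood
            weights += lr * h[:, None] * (x[i] - weights)
            t += 1
    return weights


def best_matching_units(x, weights):
    """Index of the closest SOM node for every row of x."""
    d = ((x[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)


def exploration_sampling(x_unlabeled, x_labeled, weights, k):
    """Exploration (assumed rule): prefer unlabeled samples whose SOM node
    currently contains the fewest labeled samples."""
    counts = np.bincount(best_matching_units(x_labeled, weights),
                         minlength=len(weights))
    node_of = best_matching_units(x_unlabeled, weights)
    return np.argsort(counts[node_of], kind="stable")[:k]
```

Under these assumptions, an active-learning loop might call `exploration_sampling` in the early rounds, while the classifier is still unreliable, and switch to `margin_sampling` on the classifier's softmax outputs once the labeled set covers the SOM more evenly. The switching schedule is likewise not given in the abstract.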

Index Terms

Information systems → Information systems applications → Multimedia information systems

Published in
HAI '15: Proceedings of the 3rd International Conference on Human-Agent Interaction
October 2015
254 pages
ISBN: 9781450335270
DOI: 10.1145/2814940
General Chairs: Minho Lee (Kyungpook National University, Korea), Takashi Omori (Tamagawa University, Japan)
Program Chairs: Hirotaka Osawa (University of Tsukuba, Japan), Hyeyoung Park (Kyungpook National University, Korea), James Young (University of Manitoba, Canada)
Copyright © 2015 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
active learning
exploration-exploitation
large-scale image classification
neural network
self-organizing maps
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 121 of 404 submissions, 30%