Practical learning from one-sided feedback

Author:
D. Sculley

Tufts University

KDD '07: Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
August 2007, Pages 609–618
https://doi.org/10.1145/1281192.1281258

Published: 12 August 2007

ABSTRACT

In many data mining applications, online labeling feedback is available only for examples that were predicted to belong to the positive class. Such applications include spam filtering in the case where users never check emails marked "spam", document retrieval where users cannot give relevance feedback on unretrieved documents, and online advertising where user behavior cannot be observed for unshown advertisements. One-sided feedback can cripple the performance of classical mistake-driven online learners such as Perceptron. Previous work under the Apple Tasting framework showed how to transform standard online learners into successful learners from one-sided feedback. However, we find in practice that this transformation may request more labels than necessary to achieve strong performance. In this paper, we employ two active learning methods which reduce the number of labels requested in practice. One method is the use of Label Efficient active learning. The other method, somewhat surprisingly, is the use of margin-based learners without modification, which we show combines implicit active learning and a greedy strategy for managing the exploration-exploitation tradeoff. Experimental results show that these methods can be significantly more effective in practice than those using the Apple Tasting transformation, even on minority class problems.
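
The one-sided feedback setting described above can be illustrated with a small simulation. The following is a minimal sketch, not the paper's implementation: the function name `run_one_sided` and the parameters `margin` and `explore_prob` are hypothetical, and the random exploration step is a simple stand-in chosen for illustration, not the Apple Tasting transformation or the Label Efficient method. It assumes labels in {-1, +1} and a linear learner whose label is revealed only when it predicts positive.

```python
import random


def dot(w, x):
    """Inner product of two equal-length lists."""
    return sum(wi * xi for wi, xi in zip(w, x))


def run_one_sided(stream, dim, margin=0.0, explore_prob=0.0, seed=0):
    """Online linear learner that sees a label only when it predicts positive.

    margin=0 gives a plain mistake-driven Perceptron-style update;
    margin>0 also updates on observed examples that fall inside the margin.
    explore_prob is a hypothetical knob: with this probability a predicted
    negative is emitted as positive so that its label is revealed.
    Returns (mistakes, labels_seen, weights).
    """
    rng = random.Random(seed)
    w = [0.0] * dim
    mistakes = 0
    labels_seen = 0
    for x, y in stream:  # y is +1 or -1
        score = dot(w, x)
        predict_pos = score > 0
        # Illustrative exploration: occasionally emit "positive" anyway.
        if not predict_pos and rng.random() < explore_prob:
            predict_pos = True
        if predict_pos != (y > 0):
            mistakes += 1
        if predict_pos:
            # Feedback is revealed only for examples predicted positive.
            labels_seen += 1
            if y * score <= margin:
                w = [wi + y * xi for wi, xi in zip(w, x)]
    return mistakes, labels_seen, w


# Toy stream: alternating positive and negative examples.
stream = [([1.0, 0.0], 1), ([0.0, 1.0], -1)] * 5
print(run_one_sided(stream, dim=2))                    # no exploration
print(run_one_sided(stream, dim=2, explore_prob=1.0))  # always request a label
```

Starting from zero weights with `explore_prob=0`, this learner never predicts positive, so it never receives a label and never updates. That is the failure mode the abstract attributes to mistake-driven learners under one-sided feedback. Setting `explore_prob=1.0` reveals every label, at the cost of flagging every example as positive.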

Index Terms

Practical learning from one-sided feedback
1. Computing methodologies
  1. Machine learning
2. Information systems
  1. Information systems applications
    1. Data mining

Published in
KDD '07: Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
August 2007
1080 pages
ISBN: 9781595936097
DOI: 10.1145/1281192
General Chair: Pavel Berkhin (Yahoo!, USA)
Program Chairs: Rich Caruana (Cornell University, USA), Xindong Wu (University of Vermont, USA)
Copyright © 2007 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
active learning
apple tasting
data mining
online learning
streaming data
Qualifiers
- Article
Conference

Acceptance Rates
KDD '07 Paper Acceptance Rate: 111 of 573 submissions, 19%
Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%