ABSTRACT
Label propagation is a well-explored family of methods for training a semi-supervised classifier in which the input data points (both labeled and unlabeled) are connected in the form of a weighted graph. For binary classification, the performance of these methods degrades considerably whenever the input dataset exhibits the following characteristics: (i) one of the class labels is rare, i.e., the class imbalance (CI) is very high, and (ii) the degree of supervision (DoS), defined as the fraction of labeled points, is very low. These characteristics are common in many real-world datasets relating to network fraud detection. Moreover, in such applications, the amount of class imbalance is not known a priori. In this paper, we propose and justify the use of an alternative formulation for graph label propagation under such extreme behavior of the datasets. In our formulation, the objective function is the difference of two convex quadratic functions and the constraints are box constraints. We solve this program using the Concave-Convex Procedure (CCCP). Whenever the problem size becomes too large, we suggest working with a k-NN subgraph of the given graph, which can be sampled using the Locality Sensitive Hashing (LSH) technique. We also discuss various issues that one typically faces while sampling such a k-NN subgraph in practice. Further, we propose a novel label flipping method on top of the CCCP solution, which improves the CCCP result further whenever class imbalance information is available a priori. Our method can easily be adapted to a MapReduce platform such as Hadoop. We have conducted experiments on 11 datasets, comprising graph sizes of up to 20K nodes, CI as high as 99.6%, and DoS as low as 0.5%. Our method has yielded up to 19.5-times improvement in F-measure and up to 17.5-times improvement in AUC-PR against baseline methods.
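The CCCP iteration for such a difference-of-convex program with box constraints can be sketched as follows. This is a minimal illustration assuming generic quadratics g(x) = ½xᵀAx + bᵀx and h(x) = ½xᵀCx + dᵀx with A, C symmetric PSD; the function name and the projected-gradient inner solver are our assumptions, not the paper's exact implementation:

```python
import numpy as np

def cccp_box(A, b, C, d, lo, hi, n_outer=50, n_inner=200):
    """Minimize g(x) - h(x) over the box lo <= x <= hi, where
    g(x) = 0.5 x^T A x + b^T x and h(x) = 0.5 x^T C x + d^T x
    (A, C symmetric PSD). Each CCCP step linearizes the concave
    part -h at the current iterate, yielding a convex QP that is
    solved here by projected gradient descent."""
    n = len(b)
    x = np.clip(np.zeros(n), lo, hi)
    lr = 1.0 / (np.linalg.norm(A, 2) + 1e-12)  # 1/L step for the convex part
    for _ in range(n_outer):
        grad_h = C @ x + d  # gradient of h at x_k, held fixed in the inner loop
        for _ in range(n_inner):
            grad = A @ x + b - grad_h  # gradient of the convex surrogate
            x = np.clip(x - lr * grad, lo, hi)  # projection onto the box
    return x
```

Because the surrogate upper-bounds the objective and is tight at the linearization point, each outer iteration is guaranteed not to increase the objective, which is what makes CCCP attractive for this non-convex program.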
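For the LSH-based k-NN subgraph sampling, a minimal sketch using random-hyperplane (SimHash) bucketing could look like the following. The bucketing scheme, names, and parameters here are illustrative assumptions only; the paper's construction may use multiple hash tables and addresses practical issues that this sketch ignores:

```python
import numpy as np
from collections import defaultdict

def lsh_knn_graph(X, k=3, n_bits=8, seed=0):
    """Approximate k-NN graph via random-hyperplane (SimHash) LSH.
    Points whose signs against n_bits random hyperplanes agree fall
    into the same bucket; only same-bucket points are candidate
    neighbours, and each point keeps edges to its k closest ones."""
    rng = np.random.default_rng(seed)
    H = rng.standard_normal((X.shape[1], n_bits))  # random hyperplane normals
    codes = (X @ H > 0).astype(int)                # one n_bits-bit code per point
    buckets = defaultdict(list)
    for i, c in enumerate(codes):
        buckets[tuple(c)].append(i)
    edges = {}
    for idxs in buckets.values():
        for i in idxs:
            cands = [j for j in idxs if j != i]
            cands.sort(key=lambda j: np.linalg.norm(X[i] - X[j]))
            edges[i] = cands[:k]                   # k nearest same-bucket points
    return edges
```

Note the failure modes this sketch makes visible: a point alone in its bucket gets no neighbours, and near neighbours split across buckets are missed, which is why practical systems repeat the hashing with several independent tables and merge the candidate sets.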
Index Terms: Learning to Propagate Rare Labels