DOI: 10.1145/2661829.2661982
research-article

Learning to Propagate Rare Labels

Published: 03 November 2014

ABSTRACT

Label propagation is a well-explored family of methods for training a semi-supervised classifier in which the input data points (both labeled and unlabeled) are connected in a weighted graph. For binary classification, the performance of these methods degrades considerably when the input dataset exhibits two characteristics: (i) one of the class labels is rare, or equivalently, class imbalance (CI) is very high, and (ii) the degree of supervision (DoS), defined as the fraction of labeled points, is very low. These characteristics are common in many real-world datasets relating to network fraud detection. Moreover, in such applications, the amount of class imbalance is not known a priori. In this paper, we propose and justify an alternative formulation for graph label propagation under such extreme conditions. In our formulation, the objective function is the difference of two convex quadratic functions and the constraints are box constraints. We solve this program using the Concave-Convex Procedure (CCCP). When the problem size becomes too large, we suggest working with a k-NN subgraph of the given graph, which can be sampled using Locality Sensitive Hashing (LSH). We also discuss various issues that one typically faces when sampling such a k-NN subgraph in practice. Further, we propose a novel label flipping method on top of the CCCP solution, which further improves the result whenever class imbalance information is available a priori. Our method can be easily adapted to a MapReduce platform, such as Hadoop. We conducted experiments on 11 datasets with graph sizes of up to 20K nodes, CI as high as 99.6%, and DoS as low as 0.5%. Our method achieves up to a 19.5-fold improvement in F-measure and up to a 17.5-fold improvement in AUC-PR against baseline methods.
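
The abstract does not state the exact objective, so the following is only a minimal sketch of how CCCP can be applied to a program of the kind it describes: a difference of two convex quadratics under box constraints. It assumes the generic form f(x) = 0.5 x'Ax + c'x - 0.5 x'Bx with A and B positive semidefinite. The matrices, the vector c, the bounds, and the function names (`objective`, `solve_box_qp`, `cccp`) are all hypothetical and are not taken from the paper. Each outer step replaces the concave part with its linearization at the current iterate and solves the resulting convex box-constrained problem, here by projected gradient descent, which is an assumed choice of inner solver.

```python
import numpy as np

def objective(x, A, B, c):
    # f(x) = (0.5 x'Ax + c'x) - (0.5 x'Bx): convex part minus convex part.
    return 0.5 * x @ A @ x + c @ x - 0.5 * x @ B @ x

def solve_box_qp(A, q, lo, hi, x0, n_steps=200):
    # Projected gradient descent on the convex surrogate 0.5 x'Ax + q'x
    # over the box [lo, hi]; the step is 1 / (largest eigenvalue of A).
    step = 1.0 / max(np.linalg.eigvalsh(A)[-1], 1e-12)
    x = x0.copy()
    for _ in range(n_steps):
        x = np.clip(x - step * (A @ x + q), lo, hi)
    return x

def cccp(A, B, c, lo, hi, x0, n_outer=50, tol=1e-9):
    x = np.clip(x0, lo, hi)
    history = [objective(x, A, B, c)]
    for _ in range(n_outer):
        # Linearize the subtracted term 0.5 x'Bx at the current iterate:
        # its gradient is Bx, so the surrogate's linear term is c - Bx.
        q = c - B @ x
        x = solve_box_qp(A, q, lo, hi, x)
        history.append(objective(x, A, B, c))
        if history[-2] - history[-1] < tol:
            break
    return x, history

# Hypothetical toy instance (random PSD matrices, not data from the paper).
rng = np.random.default_rng(0)
n = 6
M = rng.standard_normal((n, n))
N = rng.standard_normal((n, n))
A = M @ M.T + np.eye(n)      # positive definite
B = 0.5 * (N @ N.T)          # positive semidefinite
c = rng.standard_normal(n)
x, history = cccp(A, B, c, lo=-1.0, hi=1.0, x0=np.zeros(n))
```

Because the linearized surrogate upper-bounds the true objective and touches it at the current iterate, the values in `history` should never increase, which is the property that makes CCCP attractive for this kind of nonconvex program. It only guarantees convergence to a stationary point, not a global minimum.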

        • Published in

          CIKM '14: Proceedings of the 23rd ACM International Conference on Information and Knowledge Management
          November 2014
          2152 pages
          ISBN:9781450325981
          DOI:10.1145/2661829

          Copyright © 2014 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Acceptance Rates

          CIKM '14 Paper Acceptance Rate: 175 of 838 submissions, 21%
          Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%
