Abstract
The performance of many machine learning algorithms relies heavily on the underlying distance metric. Usually a distance metric is learned from the training set, while other valuable information, such as group structure, is not exploited. Samples within a short distance of one another form a group, which may contain several classes, and each sample may have partial membership in multiple groups. This group structure exists in both the training and test sets. In addition, outliers degrade a learned distance metric, and their negative effect grows as the number of noisy samples in the learning phase increases. Weighting is one way to alleviate this problem: more similar samples are given larger weights. This paper introduces a learning technique for a weighted distance metric. This semi-supervised method learns label information from the training set and identifies groups among the samples of the test set to form a metric space. In the experiments, the nearest neighbors algorithm is used as the classifier, and the proposed weighted distance metric improves classification accuracy by more than 10%. Furthermore, parallel computing with optimized CPU and GPU code is developed to reduce computing time; two parallel implementations, in Matlab and CUDA, are compared in this research. In the experiments, parallel code that uses both the CPU and the GPU achieves more than a 3.7-fold speedup over traditional CPU code.
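To make the weighting idea concrete, the sketch below shows one common way a sample-weighted distance can enter nearest-neighbor classification: distances to low-weight (outlier-like) training samples are inflated so those samples are less likely to be selected as neighbors. This is a minimal illustration of the general technique only, not the authors' formulation; the function name `weighted_knn_predict` and the inverse-weight scaling are assumptions made for this example.

```python
import numpy as np

def weighted_knn_predict(X_train, y_train, x, weights, k=3):
    """Classify x by k-nearest neighbors under a sample-weighted distance.

    Each training sample i carries a weight w_i in (0, 1]. Dividing the
    Euclidean distance by w_i inflates distances to down-weighted samples,
    so outlier-like samples are less likely to appear among the neighbors.
    (Illustrative scheme only; not the paper's exact metric.)
    """
    # Plain Euclidean distances from x to every training sample.
    d = np.linalg.norm(X_train - x, axis=1)
    # Inflate distances to samples with small weights.
    d_weighted = d / weights
    # Indices of the k smallest weighted distances.
    nn = np.argsort(d_weighted)[:k]
    # Majority vote among the neighbors' labels.
    labels, counts = np.unique(y_train[nn], return_counts=True)
    return labels[np.argmax(counts)]
```

With all weights equal the scheme reduces to ordinary k-nearest neighbors; shrinking the weight of a suspected outlier effectively pushes it away from every query point, which is the intuition behind weighting noisy samples during metric learning.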
Acknowledgments
This study could not have been completed without the effort and cooperation of Professor Ming Ouyang of the Computer Science Department at the University of Massachusetts Boston. His comments greatly improved the manuscript.
Mohebbi, H., Mu, Y. & Ding, W. Learning weighted distance metric from group level information and its parallel implementation. Appl Intell 46, 180–196 (2017). https://doi.org/10.1007/s10489-016-0826-7