Enhancement of kernel dependency estimation with information generalization and a case study on skewed data

Chen, Qingzhi; Chang, Chia-Hui

doi:10.1007/s10489-014-0539-8

Published: 29 May 2014

Volume 41, pages 582–593, (2014)

Applied Intelligence

Abstract

Kernel dependency estimation (KDE) is a learning framework for finding the dependencies between two general classes of objects. Although it has been used successfully in many types of applications, its properties have not been fully studied. In this paper, we discuss two practical issues with KDE. The first is its real-valued output for each label, which differs from the binary value desired under the 1-of-k coding scheme. Thus, a gap usually exists between the real value predicted by KDE and the ground-truth binary value. One common practice for reducing the gap is to apply thresholding strategies. In this paper, we provide an alternative approach that adds a second-level classifier using a special degenerate form of stacked generalization. The second issue is the decrease in performance when KDE is applied to classification with skewed data. Our experiments show that standard KDE is not an appropriate approach for skewed data; we then provide a solution for handling skewed data.
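The gap between real-valued scores and binary 1-of-k targets can be illustrated with a minimal Python sketch. It is not the authors' method: the score vectors are made-up values standing in for first-level KDE outputs, the function names are hypothetical, and a nearest-centroid rule is swapped in as a simple stand-in for the second-level classifier. In a full stacked generalization setup the second level would normally be trained on held-out first-level predictions.

```python
# Hypothetical first-level outputs: one real-valued score per label, as a
# KDE-style predictor might produce for a 3-class, 1-of-k coded problem.
# These numbers are invented for illustration only.
train_scores = [
    [0.62, 0.30, 0.08],
    [0.55, 0.41, 0.04],
    [0.35, 0.45, 0.20],
    [0.30, 0.48, 0.22],
    [0.20, 0.42, 0.38],
    [0.15, 0.44, 0.41],
]
train_labels = [0, 0, 1, 1, 2, 2]


def threshold_decode(scores, t=0.5):
    """Binarize each label score independently with a fixed threshold."""
    return [1 if s >= t else 0 for s in scores]


def fit_centroids(score_rows, labels):
    """Second-level learner (assumed): mean score vector per class."""
    sums, counts = {}, {}
    for row, y in zip(score_rows, labels):
        acc = sums.setdefault(y, [0.0] * len(row))
        for j, s in enumerate(row):
            acc[j] += s
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict_centroid(centroids, scores):
    """Assign the class whose centroid is closest in squared distance."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(c, scores))
    return min(centroids, key=lambda y: dist(centroids[y]))


centroids = fit_centroids(train_scores, train_labels)

# A new score vector on which no label clears the 0.5 threshold.
new_scores = [0.18, 0.43, 0.39]
print(threshold_decode(new_scores))
print(predict_centroid(centroids, new_scores))
```

With these invented scores, the fixed threshold assigns no label at all to the new vector, while the second-level rule still produces a class by comparing the whole score pattern against those seen in training. That is the kind of gap a learned second level is meant to close.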

Notes

Note that the $v_t$ are not used by SG.
Note that having sufficient training instances to tell $\hat{y}_{j}$ from $\hat{y}_{i}$ does not mean having sufficient training instances to know $R\left(\hat{y}_{j} \mid \hat{y}_{i}\right)$.
http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/
http://sci2s.ugr.es/keel/multilabel.php#sub10

Author information

Authors and Affiliations

National Central University, Taoyuan County, Taiwan
Qingzhi Chen & Chia-Hui Chang

Corresponding author

Correspondence to Chia-Hui Chang.

About this article

Cite this article

Chen, Q., Chang, CH. Enhancement of kernel dependency estimation with information generalization and a case study on skewed data. Appl Intell 41, 582–593 (2014). https://doi.org/10.1007/s10489-014-0539-8

Published: 29 May 2014
Issue Date: September 2014
DOI: https://doi.org/10.1007/s10489-014-0539-8
