Tri-partition cost-sensitive active learning through kNN

Min, Fan; Liu, Fu-Lun; Wen, Liu-Ying; Zhang, Zhi-Heng

doi:10.1007/s00500-017-2879-x

Methodologies and Application
Published: 11 October 2017

Soft Computing, volume 23, pages 1557–1572 (2019)

Fan Min¹ (ORCID: orcid.org/0000-0002-3290-1036), Fu-Lun Liu¹, Liu-Ying Wen¹ & Zhi-Heng Zhang¹

Abstract

Active learning differs from the training–testing scenario in that class labels can be obtained upon request. It is widely employed in applications where the labeling of instances incurs a heavy manual cost. In this paper, we propose a new algorithm called tri-partition active learning through k-nearest neighbors (TALK). The optimization objective is to minimize the total teacher and misclassification costs. First, a k-nearest neighbors classifier is employed to divide unlabeled instances into three disjoint regions. Region I contains instances for which the expected misclassification cost is lower than the teacher cost, Region II contains instances to be labeled by human experts, and Region III contains the remaining instances. Various strategies are designed to determine which instances are in Region II. Second, instances in Regions I and II are labeled and added to the training set, and the tri-partition process is repeated until all instances have been labeled. Experiments are undertaken on eight University of California, Irvine, datasets using different cost settings. Compared with the state-of-the-art cost-sensitive classification and active learning algorithms, our new algorithm generally exhibits a lower total cost.
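
The loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' TALK implementation: the expected misclassification cost is approximated as (1 − kNN vote share) × a single uniform misclassification cost, Region II is filled with the highest-risk remaining instances first (only one of many possible strategies), and the names `tri_partition_sketch`, `knn_vote`, `oracle`, `seeds`, and `batch` are hypothetical.

```python
import math
from collections import Counter


def knn_vote(x, labeled, k):
    """Majority label and its vote share among the k nearest labeled points."""
    nearest = sorted(labeled, key=lambda item: math.dist(x, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    label, count = votes.most_common(1)[0]
    return label, count / len(nearest)


def tri_partition_sketch(points, oracle, seeds, k=3,
                         teacher_cost=1.0, mis_cost=4.0, batch=1):
    """Label every point; return (labels by index, number of oracle queries)."""
    labels = {i: oracle(i) for i in seeds}  # seed labels are paid for as well
    queries = len(seeds)
    unlabeled = set(range(len(points))) - set(seeds)

    while unlabeled:
        labeled = [(points[i], labels[i]) for i in labels]
        guess, risk = {}, {}
        for i in unlabeled:
            guess[i], share = knn_vote(points[i], labeled, k)
            # Assumed estimate of the expected misclassification cost.
            risk[i] = (1.0 - share) * mis_cost

        # Region I: cheaper to trust kNN than to pay the teacher.
        region1 = [i for i in unlabeled if risk[i] < teacher_cost]
        # Region II: assumed strategy, query the riskiest remaining instances.
        rest = sorted((i for i in unlabeled if risk[i] >= teacher_cost),
                      key=lambda i: (-risk[i], i))
        region2 = rest[:batch]
        # Region III is rest[batch:], deferred to the next round.

        for i in region1:
            labels[i] = guess[i]
        for i in region2:
            labels[i] = oracle(i)
        queries += len(region2)
        unlabeled -= set(region1) | set(region2)

    return labels, queries


# Toy data: two well-separated clusters, one seed label per cluster.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
truth = [0, 0, 0, 0, 1, 1, 1, 1]
labels, queries = tri_partition_sketch(points, lambda i: truth[i], seeds=[0, 4])
errors = sum(labels[i] != truth[i] for i in range(len(points)))
total_cost = queries * 1.0 + errors * 4.0
print(queries, errors, total_cost)
```

The total cost here is the number of queries times the teacher cost plus the number of errors times the misclassification cost, which is the quantity the abstract names as the optimization objective. Each round labels at least one instance, so the loop always terminates.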

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (Grant No. 61379089) and the Natural Science Foundation of the Department of Education of Sichuan Province (Grant No. 16ZA0060).

Author information

Authors and Affiliations

School of Computer Science, Southwest Petroleum University, Chengdu, 610500, China
Fan Min, Fu-Lun Liu, Liu-Ying Wen & Zhi-Heng Zhang

Corresponding author

Correspondence to Fan Min.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest regarding the publication of this paper.

Additional information

Communicated by V. Loia.

About this article

Cite this article

Min, F., Liu, FL., Wen, LY. et al. Tri-partition cost-sensitive active learning through kNN. Soft Comput 23, 1557–1572 (2019). https://doi.org/10.1007/s00500-017-2879-x

Issue Date: 15 March 2019
DOI: https://doi.org/10.1007/s00500-017-2879-x
