“Real-time” Instance Selection for Biomedical Data Classification

Zhang, Chongsheng; D’Ambrosio, Roberto; Soda, Paolo

doi:10.1007/978-3-319-10160-6_35

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8646)

Included in the following conference series:

International Conference on Data Warehousing and Knowledge Discovery

Abstract

Computer-based medical systems play an important role in medical applications because they can strongly support physicians in the decision-making process. Several existing methods infer a classification function from labeled training data. The large amount of data available nowadays, although collected from high-quality sources, usually contains irrelevant, redundant, or noisy information, suggesting that not all training instances are useful for the classification task. To address this issue, we present an instance selection method that, unlike existing approaches, selects in “real-time” a subset of instances from the original training set on the basis of information derived from each test instance to be classified. We apply our method to seven public benchmark datasets, showing that recognition performance improves. We also discuss how the method parameters affect the experimental results.
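
As a rough illustration of the general idea of selecting training instances per test instance, the sketch below may help. The abstract does not state the selection criterion, the classifier, or the parameters used in the paper, so everything here is an assumption made for illustration: the selection rule (keep the k training instances closest to the test instance in Euclidean distance), the nearest-centroid classifier trained on the selected subset, and the names `select_instances`, `fit_centroids`, `classify`, and `k`. It should not be read as the authors' method.

```python
import math
from collections import defaultdict


def select_instances(train_X, train_y, query, k):
    """Keep the k training instances closest to the query.

    ASSUMPTION: Euclidean nearest-neighbour selection is a stand-in;
    the paper's actual selection criterion is not given in the abstract.
    """
    order = sorted(range(len(train_X)),
                   key=lambda i: math.dist(train_X[i], query))
    chosen = order[:k]
    return [train_X[i] for i in chosen], [train_y[i] for i in chosen]


def fit_centroids(X, y):
    """Compute one mean vector per class label (nearest-centroid model)."""
    sums = {}
    counts = defaultdict(int)
    for x, label in zip(X, y):
        if label not in sums:
            sums[label] = [0.0] * len(x)
        sums[label] = [s + v for s, v in zip(sums[label], x)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


def classify(train_X, train_y, query, k=5):
    """Select a subset for this query, fit on it, then predict the query."""
    sub_X, sub_y = select_instances(train_X, train_y, query, k)
    centroids = fit_centroids(sub_X, sub_y)
    return min(centroids, key=lambda label: math.dist(centroids[label], query))


# Toy data (hypothetical): the last instance is a far-away, noisy "a".
train_X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5), (100, 100)]
train_y = ["a", "a", "a", "b", "b", "b", "a"]

if __name__ == "__main__":
    print(classify(train_X, train_y, (0.2, 0.2), k=3))
```

In this toy setup the outlier at (100, 100) drags the global centroid of class "a" far from the origin, so a centroid model fitted on the whole training set would misplace a query near the origin; fitting only on the instances selected for that query leaves the outlier out. Here `k` plays the role of a method parameter whose value would need tuning per dataset.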

Author information

Authors and Affiliations

Henan University, China
Chongsheng Zhang
Università Campus Bio-Medico di Roma, Italy
Roberto D’Ambrosio & Paolo Soda

Editor information

Editors and Affiliations

LIAS/ISAE-ENSMA, Téléport 2, 1 avenue Clément Ader, BP 40109, 86961, Futuroscope Chasseneuil Cedex, France
Ladjel Bellatreche
IBM Research - India, 4, Block-C, Institutional Area, 110070, Vasant Kunj, New Delhi, India
Mukesh K. Mohania

Cite this paper

Zhang, C., D’Ambrosio, R., Soda, P. (2014). “Real-time” Instance Selection for Biomedical Data Classification. In: Bellatreche, L., Mohania, M.K. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2014. Lecture Notes in Computer Science, vol 8646. Springer, Cham. https://doi.org/10.1007/978-3-319-10160-6_35

DOI: https://doi.org/10.1007/978-3-319-10160-6_35
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10159-0
Online ISBN: 978-3-319-10160-6
eBook Packages: Computer Science, Computer Science (R0)
