Abstract
Computer-based medical systems play an important role in medical applications because they can support physicians in the decision-making process. Several existing methods infer a classification function from labeled training data. The large amount of data available today, although collected from high-quality sources, usually contains irrelevant, redundant, or noisy information, which suggests that not all training instances are useful for the classification task. To address this issue, we present an instance selection method that, unlike existing approaches, selects in “real-time” a subset of instances from the original training set on the basis of information derived from each test instance to be classified. We apply our method to seven public benchmark datasets and show that recognition performance improves. We also discuss how the method's parameters affect the experimental results.
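The abstract does not state the selection criterion, so the following is only an illustrative sketch of the general idea of choosing training instances per test instance. It assumes Euclidean nearness as the selection rule and a nearest-centroid classifier fitted on the selected subset; both choices, the function names `select_instances` and `classify_with_local_subset`, and the parameter `k` are hypothetical and are not taken from the paper.

```python
import numpy as np


def select_instances(X_train, x_test, k):
    """Return indices of the k training instances closest to x_test.

    Assumption: Euclidean distance stands in for the paper's (unstated)
    selection criterion.
    """
    dists = np.linalg.norm(X_train - x_test, axis=1)
    return np.argsort(dists, kind="stable")[:k]


def classify_with_local_subset(X_train, y_train, x_test, k=5):
    """Classify x_test using only the subset selected for this test instance.

    Assumption: a nearest-centroid rule is used as a simple stand-in
    for whatever base classifier is trained on the selected instances.
    """
    idx = select_instances(X_train, x_test, k)
    X_sub, y_sub = X_train[idx], y_train[idx]
    labels = np.unique(y_sub)
    # One centroid per class present in the selected subset.
    centroids = np.array([X_sub[y_sub == c].mean(axis=0) for c in labels])
    nearest = np.argmin(np.linalg.norm(centroids - x_test, axis=1))
    return labels[nearest]


# Toy data: two well-separated clusters with labels 0 and 1.
X_train = np.array([
    [0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
    [5.0, 5.0], [5.1, 4.9], [4.9, 5.2],
])
y_train = np.array([0, 0, 0, 1, 1, 1])
x_test = np.array([4.8, 5.0])

selected = select_instances(X_train, x_test, k=3)
prediction = classify_with_local_subset(X_train, y_train, x_test, k=3)
```

In this sketch the subset is recomputed for every test instance, which is what distinguishes the per-test-instance setting from selection methods that prune the training set once before classification. The value of `k` plays the role of a method parameter whose effect on accuracy would need to be studied experimentally.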
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Zhang, C., D’Ambrosio, R., Soda, P. (2014). “Real-time” Instance Selection for Biomedical Data Classification. In: Bellatreche, L., Mohania, M.K. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2014. Lecture Notes in Computer Science, vol 8646. Springer, Cham. https://doi.org/10.1007/978-3-319-10160-6_35
DOI: https://doi.org/10.1007/978-3-319-10160-6_35
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10159-0
Online ISBN: 978-3-319-10160-6
eBook Packages: Computer Science, Computer Science (R0)