Abstract
In this paper we address a problem arising in the classification of breast cancer malignancy data. Because far fewer patients are diagnosed with high malignancy than with lower grades, such data sets tend to be highly imbalanced between malignancy classes. To overcome this problem, we apply state-of-the-art methods for imbalanced classification to our data set and demonstrate an improvement in classification sensitivity. The sensitivity achieved on our data set was 92.34%.
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Krawczyk, B., Jeleń, Ł., Krzyżak, A., Fevens, T. (2014). One-Class Classification Decomposition for Imbalanced Classification of Breast Cancer Malignancy Data. In: Rutkowski, L., Korytkowski, M., Scherer, R., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds) Artificial Intelligence and Soft Computing. ICAISC 2014. Lecture Notes in Computer Science, vol 8467. Springer, Cham. https://doi.org/10.1007/978-3-319-07173-2_46
DOI: https://doi.org/10.1007/978-3-319-07173-2_46
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-07172-5
Online ISBN: 978-3-319-07173-2
eBook Packages: Computer Science (R0)