Transductive cost-sensitive lung cancer image classification

Shi, Yinghuan; Gao, Yang; Wang, Ruili; Zhang, Ying; Wang, Dong

doi:10.1007/s10489-012-0354-z

Published: 17 May 2012

Volume 38, pages 16–28, (2013)

Applied Intelligence

Yinghuan Shi¹, Yang Gao¹, Ruili Wang², Ying Zhang³ & Dong Wang³

Abstract

Previous computer-aided lung cancer image classification methods are all cost-blind: they assume that the two kinds of misdiagnosis (categorizing a cancerous image as normal, or a normal image as cancerous) incur equal costs. In addition, previous methods usually require experienced pathologists to label a large number of images as training samples. To address both limitations, a novel transductive cost-sensitive method is proposed for lung cancer image classification on needle biopsy specimens, which requires the pathologist to label only a small number of images. The proposed method analyzes lung cancer images in five procedures: (i) an image capturing procedure to capture images from the needle biopsy specimens; (ii) a preprocessing procedure to segment the individual cells from the captured images; (iii) a feature extraction procedure to extract features (i.e. shape, color, texture and statistical information) from the segmented cells; (iv) a codebook learning procedure to learn a codebook on the extracted features by k-means clustering, so that each image can be represented as a histogram over the codewords; (v) an image classification procedure to predict labels for testing images using the proposed multi-class cost-sensitive Laplacian regularized least squares (mCLRLS). We evaluate the proposed method on a real image set provided by Bayi Hospital, which contains 271 images covering normal cases and four types of cancerous ones (squamous carcinoma, adenocarcinoma, small cell cancer and nuclear atypia). The experimental results demonstrate that the proposed method achieves a lower cancer-misdiagnosis rate and lower total misdiagnosis costs than previous methods, including supervised learning approaches (kNN, mcSVM and MCMI-AdaBoost), a semi-supervised learning approach (LapRLS) and a cost-sensitive approach (CS-SVM). The experiments also show that both the transductive and the cost-sensitive settings are useful when only a small number of training images are available.
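
Procedure (iv) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each cell is already described by a numeric feature vector, uses a plain Lloyd-style k-means written with the standard library only, and runs on invented toy data. The function names (`kmeans`, `image_histogram`) and the two toy images are hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centers):
    """Index of the center closest to `point` (ties go to the lower index)."""
    return min(range(len(centers)), key=lambda j: sq_dist(point, centers[j]))

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd iterations; returns k cluster centers (the codewords)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        new_centers = []
        for j, g in enumerate(groups):
            if g:
                # New center is the per-dimension mean of the assigned points.
                new_centers.append([sum(col) / len(g) for col in zip(*g)])
            else:
                # An empty cluster keeps its previous center.
                new_centers.append(centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return centers

def image_histogram(cell_features, codebook):
    """Represent one image as a normalized histogram over codewords."""
    counts = [0] * len(codebook)
    for f in cell_features:
        counts[nearest(f, codebook)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts

# Toy data (assumption): two images, each a list of 2-D cell feature vectors.
images = {
    "img_a": [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1]],
    "img_b": [[5.1, 4.9], [4.9, 5.0], [5.0, 5.0], [0.0, 0.0]],
}
all_cells = [f for cells in images.values() for f in cells]
codebook = kmeans(all_cells, k=2)
hists = {name: image_histogram(cells, codebook) for name, cells in images.items()}
```

Every image ends up as a fixed-length vector whose length equals the codebook size, however many cells were segmented from it. This is what allows a single classifier to be applied to images with differing cell counts.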

Notes

The number of clusters is selected by searching from 1 to 20 using 10-fold cross-validation; we found that with 7 clusters, the approaches that rely on codebook learning obtain the best results.
The number of neighbors in kNN is selected by searching from 1 to 10 using 10-fold cross-validation; the value giving the best results is chosen.
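
The search described in these notes can be sketched for the kNN case as follows. This is an illustration under stated assumptions, not the authors' code: the selection criterion here is plain accuracy (the notes only say "best results"), the toy samples and their labels are invented, and `knn_predict`, `cv_accuracy` and `select_k` are hypothetical helper names.

```python
import random
from collections import Counter

def knn_predict(train, query, k):
    """Majority vote among the k training samples nearest to `query`.
    `train` is a list of (features, label) pairs."""
    ranked = sorted(
        train, key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], query))
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def cv_accuracy(samples, k, n_folds=10, seed=0):
    """Fraction of samples predicted correctly when each fold is held out once."""
    idx = list(range(len(samples)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    correct = 0
    for held_out in folds:
        held = set(held_out)
        train = [samples[i] for i in idx if i not in held]
        for i in held_out:
            x, y = samples[i]
            correct += knn_predict(train, x, k) == y
    return correct / len(samples)

def select_k(samples, candidates=range(1, 11), n_folds=10):
    """Return the candidate k with the best cross-validated accuracy."""
    scores = {k: cv_accuracy(samples, k, n_folds) for k in candidates}
    # Assumption: ties are broken in favour of the smaller k.
    best = max(scores, key=lambda k: (scores[k], -k))
    return best, scores

# Toy data (assumption): two well-separated 2-D classes of 20 samples each.
rng = random.Random(1)
samples = [([rng.gauss(0, 1), rng.gauss(0, 1)], "normal") for _ in range(20)]
samples += [([rng.gauss(6, 1), rng.gauss(6, 1)], "cancer") for _ in range(20)]
best_k, scores = select_k(samples)
```

The same loop could be used to choose the number of clusters by letting the candidate range run from 1 to 20 and scoring each codebook size through the downstream classifier. In a cost-sensitive setting, total misdiagnosis cost may be a more suitable criterion than accuracy.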

Acknowledgements

The authors would like to acknowledge the support for this work from the National Science Foundation of China (Grant Nos. 61035003, 61175042, 61021062), the National 973 Program of China (Grant No. 2009CB320702), the 973 Program of Jiangsu, China (Grant No. BK2011005) and Program for New Century Excellent Talents in University (Grant No. NCET-10-0476). The authors wish to thank the anonymous reviewers for their valuable suggestions.

Author information

Authors and Affiliations

State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China
Yinghuan Shi & Yang Gao
School of Engineering and Advanced Technology, Massey University, Massey, New Zealand
Ruili Wang
Bayi Hospital, Nanjing, China
Ying Zhang & Dong Wang

Corresponding author

Correspondence to Yang Gao.

About this article

Cite this article

Shi, Y., Gao, Y., Wang, R. et al. Transductive cost-sensitive lung cancer image classification. Appl Intell 38, 16–28 (2013). https://doi.org/10.1007/s10489-012-0354-z

Published: 17 May 2012
Issue Date: January 2013
DOI: https://doi.org/10.1007/s10489-012-0354-z
