Automated morphological classification of lung cancer subtypes using H&E tissue images

Wang, Ching-Wei; Yu, Cheng-Ping

doi:10.1007/s00138-012-0457-x

Special Issue Paper
Published: 14 October 2012

Machine Vision and Applications, Volume 24, pages 1383–1391 (2013)

Ching-Wei Wang¹ & Cheng-Ping Yu²,³

Abstract

Patient-targeted therapy has become increasingly important. A key development in the treatment of metastatic non-small cell lung cancer (NSCLC) has been the tailoring of therapy on the basis of histology. A pathology diagnosis of “non-specified NSCLC” is no longer routinely acceptable; an effective approach for classifying adenocarcinoma (AC) and squamous carcinoma (SC) histotypes is needed to optimize therapy. In this study, we present a robust and objective automatic computer vision system for real-time classification of AC and SC, based on the morphological tissue patterns of hematoxylin and eosin (H&E) stained images, to assist medical experts in the diagnosis of lung cancer. Various original and extended densitometric and Haralick texture features are extracted from the images, and a boosting algorithm with an alternating decision tree as the base learner is used to train the classifier. For evaluation, 653 tissue samples of two types were tested: 369 samples from a tissue microarray data set and 284 samples from full-face tissue sections. Regarding the class distribution, about 45 % are AC samples (288) and 55 % are SC samples (365), so the two classes are reasonably well balanced. Using tenfold cross-validation, the technique achieved accuracies of $92.41~\%$ on tissue microarray cores and $95.42~\%$ on full tissue sections. We also found that the two boosting algorithms (cw-Boost and AdaBoost.M1) perform consistently well in comparison with other widely used machine learning methods, including support vector machines, neural networks and decision trees. This approach offers a robust, objective and rapid procedure for optimized patient-targeted therapy.

Acknowledgments

This project was partially funded by Tri-Service General Hospital, Taiwan (grant TSGH-C101-011). The work of C.W.W. was partially funded by NSC (grant 101-2628-E-011-006-MY3).

Author information

Authors and Affiliations

Graduate Institute of Biomedical Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
Ching-Wei Wang
Department of Pathology, Tri-Service General Hospital, Taipei, Taiwan
Cheng-Ping Yu
Institute of Pathology and Parasitology, National Defense Medical Center, Taipei, Taiwan
Cheng-Ping Yu

Corresponding author

Correspondence to Ching-Wei Wang.

Appendix

Computing the Haralick features requires four co-occurrence matrices, one for each orientation ($0^\circ , 45^\circ , 90^\circ , 135^\circ $). Each is a symmetric matrix of size $N_\mathrm{g} \times N_\mathrm{g}$, where $N_\mathrm{g}$ is the number of possible gray levels in the image. The co-occurrence matrices can be represented by a three-dimensional data structure; the third dimension is $d$, which varies depending on the dimensions of the original input image. The statistical properties of a co-occurrence matrix are formulated as follows. With $P(i,j)$ denoting the $(i,j)$th entry of the unnormalized matrix, $p(i,j)$ is defined as the $(i,j)$th entry of the normalized single-channel (gray-tone) spatial-dependence matrix:

$$\begin{aligned} p(i,j)=P(i,j)/R \end{aligned} \tag{3}$$

where $R=\sum _{i=1}^{N_\mathrm{g}}\sum _{j=1}^{N_\mathrm{g}}P(i,j)$.
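As a rough illustration of the construction and of the normalization in Eq. (3), the following Python sketch builds a symmetric co-occurrence matrix for one orientation and normalizes it. The function names, the angle-to-offset convention, the pixel distance of 1 and the toy 4 × 4 image are assumptions made for this example and are not taken from the article; gray levels are indexed from 0 here rather than from 1.

```python
# Offsets (row step, column step) for the four orientations at distance 1.
# This angle-to-offset convention is an assumption made for the sketch.
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}


def cooccurrence(image, n_gray, angle, distance=1):
    """Return the symmetric co-occurrence count matrix P for one orientation."""
    dr, dc = OFFSETS[angle]
    dr, dc = dr * distance, dc * distance
    rows, cols = len(image), len(image[0])
    P = [[0] * n_gray for _ in range(n_gray)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                P[i][j] += 1  # count each neighbouring pair in both
                P[j][i] += 1  # directions so that P stays symmetric
    return P


def normalize(P):
    """Eq. (3): p(i,j) = P(i,j) / R, where R is the sum of all entries of P."""
    R = sum(sum(row) for row in P)
    return [[value / R for value in row] for row in P]


# Toy image with gray levels 0..3 (0-based, unlike the 1-based sums in the text).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]

# One count matrix per orientation, then the normalized 0-degree matrix.
matrices = {angle: cooccurrence(image, 4, angle) for angle in OFFSETS}
p0 = normalize(matrices[0])
```

Counting each pair in both directions is one common way to obtain the symmetry mentioned above; other implementations may instead add the matrix to its transpose after counting.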

$p_x(i)$ is the $i$th entry in the marginal probability matrix obtained by summing the rows of $p(i,j)$:

$$\begin{aligned} p_x(i)=\sum _{j=1}^{N_\mathrm{g}}p(i,j) \end{aligned} \tag{4}$$
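The feature definitions in this appendix also use $p_y$, $p_{x+y}$ and $p_{x-y}$, which are not written out here. Assuming they follow Haralick's original formulation of the textural features, they would read as follows (the index $k$ is introduced only for this sketch):

$$\begin{aligned} p_y(j)&=\sum _{i=1}^{N_\mathrm{g}}p(i,j) \\ p_{x+y}(k)&=\sum _{i=1}^{N_\mathrm{g}}\sum _{\substack{j=1 \\ i+j=k}}^{N_\mathrm{g}}p(i,j), \quad k=2,3,\ldots ,2N_\mathrm{g} \\ p_{x-y}(k)&=\sum _{i=1}^{N_\mathrm{g}}\sum _{\substack{j=1 \\ |i-j|=k}}^{N_\mathrm{g}}p(i,j), \quad k=0,1,\ldots ,N_\mathrm{g}-1 \end{aligned}$$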

The 11 Haralick features are defined as follows.

$$\begin{aligned} f_1&= \sum _{i=1}^{N_\mathrm{g}}\sum _{j=1}^{N_\mathrm{g}}(p(i,j))^2 \end{aligned} \tag{5}$$

$$\begin{aligned} f_2&= \sum _{k=0}^{N_\mathrm{g}-1}k^2 p_{x-y}(k) \end{aligned} \tag{6}$$

$$\begin{aligned} f_3&= \frac{\sum _{i=1}^{N_\mathrm{g}}\sum _{j=1}^{N_\mathrm{g}}(ij)p(i,j)-\mu _x\mu _y}{\sigma _x\sigma _y} \end{aligned} \tag{7}$$

where $\mu _x, \mu _y, \sigma _x, \sigma _y$ are the means and standard deviations of $p_x$ and $p_y$.

$$\begin{aligned} f_4&= \sum _{i=1}^{N_\mathrm{g}}\sum _{j=1}^{N_\mathrm{g}}(i-\mu )^2p(i,j) \end{aligned} \tag{8}$$

$$\begin{aligned} f_5&= \sum _{i=1}^{N_\mathrm{g}}\sum _{j=1}^{N_\mathrm{g}}\frac{1}{1+(i-j)^2}p(i,j) \end{aligned} \tag{9}$$

$$\begin{aligned} f_6&= \sum _{i=2}^{2N_\mathrm{g}}ip_{x+y}(i) \end{aligned} \tag{10}$$

$$\begin{aligned} f_7&= \sum _{i=2}^{2N_\mathrm{g}}(i-f_8)^2p_{x+y}(i) \end{aligned} \tag{11}$$

$$\begin{aligned} f_8&= -\sum _{i=2}^{2N_\mathrm{g}}p_{x+y}(i)\log (p_{x+y}(i)) \end{aligned} \tag{12}$$

$$\begin{aligned} f_9&= -\sum _{i=1}^{N_\mathrm{g}}\sum _{j=1}^{N_\mathrm{g}}p(i,j)\log (p(i,j)) \end{aligned} \tag{13}$$

$$\begin{aligned} f_{10}&= \sigma _{p_{x-y}} \end{aligned} \tag{14}$$

$$\begin{aligned} f_{11}&= -\sum _{i=0}^{N_\mathrm{g}-1}p_{x-y}(i)\log (p_{x-y}(i)) \end{aligned} \tag{15}$$
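To make a few of these definitions concrete, the sketch below evaluates $f_1$, $f_2$, $f_5$, $f_9$ and $f_{11}$ on a small normalized co-occurrence matrix. It is an illustration only, not the authors' implementation: the function name `haralick_subset`, the example counts, the use of the natural logarithm, the convention that $0\log 0 = 0$, and the reading of $p_{x-y}(k)$ as the total probability of pairs with $|i-j|=k$ are all assumptions. Gray levels are 0-based here, which leaves these five features unchanged because they depend only on $p(i,j)$ and on the difference $i-j$.

```python
import math


def haralick_subset(p):
    """Compute f1, f2, f5, f9 and f11 from a normalized co-occurrence matrix p."""
    n = len(p)
    # p_{x-y}(k): total probability of pairs whose gray levels differ by k.
    p_diff = [0.0] * n
    for i in range(n):
        for j in range(n):
            p_diff[abs(i - j)] += p[i][j]

    def entropy(values):
        # Zero entries are skipped, i.e. 0 * log(0) is treated as 0 (assumption).
        return -sum(v * math.log(v) for v in values if v > 0)

    return {
        "f1": sum(v * v for row in p for v in row),           # Eq. (5)
        "f2": sum(k * k * p_diff[k] for k in range(n)),       # Eq. (6)
        "f5": sum(p[i][j] / (1 + (i - j) ** 2)
                  for i in range(n) for j in range(n)),       # Eq. (9)
        "f9": entropy(v for row in p for v in row),           # Eq. (13)
        "f11": entropy(p_diff),                               # Eq. (15)
    }


# Hypothetical symmetric co-occurrence counts for a 4-level image.
counts = [
    [4, 2, 1, 0],
    [2, 4, 0, 0],
    [1, 0, 6, 1],
    [0, 0, 1, 2],
]
total = sum(sum(row) for row in counts)
p = [[value / total for value in row] for row in counts]

features = haralick_subset(p)
for name, value in features.items():
    print(f"{name} = {value:.4f}")
```

In practice the same features would be computed for each of the four orientation matrices, giving one value per orientation for every feature.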

About this article

Cite this article

Wang, CW., Yu, CP. Automated morphological classification of lung cancer subtypes using H&E tissue images. Machine Vision and Applications 24, 1383–1391 (2013). https://doi.org/10.1007/s00138-012-0457-x

Received: 23 December 2011
Revised: 05 June 2012
Accepted: 17 September 2012
Published: 14 October 2012
Issue Date: October 2013
DOI: https://doi.org/10.1007/s00138-012-0457-x
