Abstract
This work develops a methodology for classifying lung nodules using the LIDC-IDRI image database. The methodology is based on image-processing and pattern-recognition techniques. To describe the texture of nodule and non-nodule candidates, we use the Taxonomic Diversity and Taxonomic Distinctness indexes from ecology. These indexes are computed from phylogenetic trees, which in this work are applied to characterize the candidates. Finally, we use a Support Vector Machine (SVM) as the classifier. We tested the methodology on 833 exams from the LIDC-IDRI image database, dividing the complete database into training and testing groups in proportions of 20/80 %, 40/60 %, 60/40 %, and 80/20 %. Each division was repeated five times at random. The methodology shows promising results for classifying nodules and non-nodules, with a mean accuracy of 98.11 %. Lung cancer has the highest mortality rate and one of the lowest survival rates after diagnosis. Therefore, the earlier the diagnosis, the higher the patient's chances of a cure. In addition, the more information available to the specialist, the more precise the diagnosis can be. The methodology proposed here contributes to this goal.
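As a rough illustration of the two texture descriptors named above, the sketch below computes the taxonomic diversity and taxonomic distinctness indexes in their standard ecological form: the abundance-weighted sum of pairwise distances between species, divided by the number of all pairs of individuals (diversity) or by the number of pairs of individuals from different species (distinctness). Treating each gray level as a species and its voxel count as the abundance is an assumption made here for illustration, and the `distance` function is a hypothetical stand-in for the path length between two species in a phylogenetic tree. The abstract does not specify these details, so this is not the authors' implementation.

```python
from collections import Counter
from itertools import combinations


def taxonomic_indexes(values, distance=lambda a, b: abs(a - b)):
    """Return (taxonomic diversity, taxonomic distinctness) for gray levels.

    Assumptions (not taken from the abstract): each distinct value is a
    "species", its count is the abundance, and `distance` stands in for the
    path length between two species in a phylogenetic tree.
    """
    counts = Counter(values)
    n = sum(counts.values())  # total number of individuals
    if n < 2:
        return 0.0, 0.0

    weighted = 0.0  # sum over species pairs i<j of w_ij * x_i * x_j
    cross = 0.0     # sum over species pairs i<j of x_i * x_j
    for (a, xa), (b, xb) in combinations(counts.items(), 2):
        weighted += distance(a, b) * xa * xb
        cross += xa * xb

    # Diversity averages over all pairs of individuals, including pairs of
    # the same species (which contribute a distance of zero).
    diversity = weighted / (n * (n - 1) / 2)
    # Distinctness averages only over pairs from different species.
    distinctness = weighted / cross if cross else 0.0
    return diversity, distinctness


if __name__ == "__main__":
    # Toy candidate region with four voxels and three gray levels.
    print(taxonomic_indexes([10, 10, 20, 30]))
```

In a full pipeline, such index values would form part of the feature vector passed to the SVM. Replacing the absolute-difference `distance` with a tree-based path length would only require passing a different function.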
Acknowledgments
The authors acknowledge the Coordination for the Improvement of Higher Education Personnel (CAPES), the National Council for Scientific and Technological Development (CNPq), and the Foundation for the Protection of Research and Scientific and Technological Development of the State of Maranhão (FAPEMA) for financial support.
Cite this article
de Carvalho Filho, A.O., Silva, A.C., de Paiva, A.C. et al. Lung-Nodule Classification Based on Computed Tomography Using Taxonomic Diversity Indexes and an SVM. J Sign Process Syst 87, 179–196 (2017). https://doi.org/10.1007/s11265-016-1134-5
DOI: https://doi.org/10.1007/s11265-016-1134-5