CAM-K: a novel framework for automated estimating pixel area using K-Means algorithm integrated with deep learning based-CAM visualization techniques

  • Original Article
Neural Computing and Applications

Abstract

This study proposes and implements a novel framework that automatically estimates the area of the pixels identified as brick, using a pixel-based intersection over union (IoU) technique. The framework combines a fully convolutional neural network, class activation maps and the K-Means algorithm (CAM-K) to classify, visualize and calculate the pixel areas of brick-labeled images. The existing IoU method, which compares ground-truth and estimated bounding boxes, is not suitable for calculating localized pixel area. Experiments with the CAM-K framework show that it can reliably estimate the pixel area of the detected object in classified images. Compared with current IoU practice, the proposed framework isolates only the pixels of the targeted object and can therefore offer a far more realistic IoU metric.
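The clustering step can be illustrated with a minimal sketch. It is not the authors' implementation, which is not visible in this preview. It assumes the class activation map is already available as a 2D grid of intensities in [0, 1], that two clusters are enough, and that the cluster with the highest centroid corresponds to the object. The names `kmeans_1d` and `activated_pixel_area` are hypothetical, and the centroid initialization is a simple deterministic choice made for this example.

```python
from typing import List, Tuple

Heatmap = List[List[float]]


def kmeans_1d(values: List[float], k: int = 2, iters: int = 50) -> List[float]:
    """Run Lloyd's K-Means on scalar values and return the k centroids."""
    if k < 2:
        raise ValueError("k must be at least 2")
    lo, hi = min(values), max(values)
    # Assumption: spread the initial centroids evenly over the value range.
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(iters):
        sums = [0.0] * k
        counts = [0] * k
        for v in values:
            # Assign each value to its nearest centroid.
            j = min(range(k), key=lambda c: abs(v - centroids[c]))
            sums[j] += v
            counts[j] += 1
        # Recompute centroids; an empty cluster keeps its old centroid.
        updated = [
            sums[j] / counts[j] if counts[j] else centroids[j] for j in range(k)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def activated_pixel_area(heatmap: Heatmap, k: int = 2) -> Tuple[List[List[bool]], int]:
    """Cluster heatmap intensities and count pixels in the brightest cluster."""
    values = [v for row in heatmap for v in row]
    centroids = kmeans_1d(values, k)
    hottest = max(range(k), key=lambda c: centroids[c])
    mask = [
        [min(range(k), key=lambda c: abs(v - centroids[c])) == hottest for v in row]
        for row in heatmap
    ]
    area = sum(cell for row in mask for cell in row)
    return mask, area


# Toy heatmap with a 2x2 block of strong activations.
toy_heatmap = [
    [0.10, 0.10, 0.00, 0.00],
    [0.10, 0.90, 0.80, 0.00],
    [0.00, 0.95, 0.85, 0.10],
    [0.00, 0.00, 0.10, 0.10],
]
toy_mask, toy_area = activated_pixel_area(toy_heatmap)
print(toy_area)
```

The returned area is a pixel count. Converting it to a physical area would need a known image scale, which this sketch does not model.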

Author information

Corresponding author

Correspondence to Kemal Hacıefendioğlu.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Appendix

See Fig. 18.

Fig. 18 Pixel-based IoU algorithm

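Because the figure itself is not reproduced here, the following is only a generic sketch of a pixel-based IoU between two binary masks, not the algorithm shown in Fig. 18. The function name `pixel_iou` is hypothetical, and returning 1.0 when both masks are empty is a convention assumed for this example.

```python
from typing import List, Sequence


def pixel_iou(pred_mask: Sequence[Sequence[int]], true_mask: Sequence[Sequence[int]]) -> float:
    """Return |pred AND true| / |pred OR true|, counted pixel by pixel."""
    if len(pred_mask) != len(true_mask) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred_mask, true_mask)
    ):
        raise ValueError("masks must have the same shape")
    intersection = 0
    union = 0
    for p_row, t_row in zip(pred_mask, true_mask):
        for p, t in zip(p_row, t_row):
            # A pixel is in the intersection if both masks mark it,
            # and in the union if at least one mask marks it.
            intersection += int(bool(p) and bool(t))
            union += int(bool(p) or bool(t))
    # Assumed convention: two empty masks agree perfectly.
    return intersection / union if union else 1.0


predicted: List[List[int]] = [
    [1, 1, 0],
    [0, 1, 0],
]
ground_truth: List[List[int]] = [
    [1, 0, 0],
    [0, 1, 1],
]
iou_value = pixel_iou(predicted, ground_truth)
print(iou_value)  # 2 shared pixels out of 4 marked pixels
```

Unlike a bounding-box IoU, this count ignores background pixels that happen to fall inside a box around the object.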

Cite this article

Hacıefendioğlu, K., Mostofi, F., Toğan, V. et al. CAM-K: a novel framework for automated estimating pixel area using K-Means algorithm integrated with deep learning based-CAM visualization techniques. Neural Comput & Applic 34, 17741–17759 (2022). https://doi.org/10.1007/s00521-022-07428-6

  • DOI: https://doi.org/10.1007/s00521-022-07428-6
