Abstract
This study proposes and implements a framework that automatically estimates the area of brick-labeled pixels using a pixel-based intersection over union (IoU) technique. The framework, CAM-K, combines a fully convolutional neural network, class activation maps (CAM), and the K-Means algorithm to classify brick-labeled images, visualize the detected regions, and calculate their pixel areas. The existing IoU method, which compares ground-truth and estimated bounding boxes, is not suitable for calculating localized pixel areas. Experiments with the CAM-K framework showed that it can reliably estimate the pixel area of the detected object in classified images. Compared with current bounding-box applications of IoU, the proposed framework isolates only the pixels that belong to the target object and can therefore offer a more realistic IoU metric.
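The core idea of clustering an activation map and then scoring pixels rather than boxes can be illustrated with a minimal sketch. This is not the authors' implementation: the synthetic `cam` array stands in for a CAM heatmap that a trained network would produce, the helper names (`kmeans_1d`, `cam_to_mask`, `pixel_iou`) are hypothetical, and clustering scalar intensities into two groups is an assumption made for illustration. The pixel-based IoU used here is the number of pixels in the intersection of the predicted and ground-truth masks divided by the number of pixels in their union.

```python
import numpy as np

def kmeans_1d(values, k=2, iters=50):
    """Cluster scalar values into k groups with plain Lloyd iterations."""
    # Deterministic initialization (an assumption): spread centers over the value range.
    centers = np.linspace(values.min(), values.max(), k)
    for _ in range(iters):
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        new_centers = np.array([
            values[labels == j].mean() if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Final assignment so the labels match the returned centers.
    labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
    return labels, centers

def cam_to_mask(cam, k=2):
    """Keep the pixels that fall in the cluster with the highest mean activation."""
    labels, centers = kmeans_1d(cam.ravel().astype(float), k)
    return (labels == np.argmax(centers)).reshape(cam.shape)

def pixel_iou(pred, truth):
    """Pixel-based IoU between two binary masks of the same shape."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    union = np.logical_or(pred, truth).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return np.logical_and(pred, truth).sum() / union

# Synthetic stand-in for a CAM heatmap: low background, one high-activation block.
cam = np.full((6, 6), 0.1)
cam[1:4, 1:4] = 0.9

# Hypothetical ground-truth mask, one column wider than the activated block.
truth = np.zeros((6, 6), dtype=bool)
truth[1:4, 1:5] = True

mask = cam_to_mask(cam)
print("estimated pixel area:", int(mask.sum()))  # 9 pixels
print("pixel IoU:", pixel_iou(mask, truth))      # 9 / 12 = 0.75
```

Because the score is computed over masks, pixels inside a bounding box that do not belong to the object never enter the intersection or the union, which is the property the abstract contrasts with box-based IoU.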
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Cite this article
Hacıefendioğlu, K., Mostofi, F., Toğan, V. et al. CAM-K: a novel framework for automated estimating pixel area using K-Means algorithm integrated with deep learning based-CAM visualization techniques. Neural Comput & Applic 34, 17741–17759 (2022). https://doi.org/10.1007/s00521-022-07428-6
DOI: https://doi.org/10.1007/s00521-022-07428-6