Abstract
The objective of this study is to validate the use of Deep Neural Networks (DNNs) to segment and classify web elements. To achieve this, a dataset of 2200 images was created through screenshots of real web pages, with 10 distinct classes to represent the most common web elements. The contributions of this study encompass the validation of classification-only Convolutional Neural Networks (CNNs) with the support of Class Activation Mapping (CAM), a weakly-supervised semantic segmentation technique that requires no in-image annotation, significantly simplifying the dataset creation process when compared to traditional segmentation models. Multiple networks with distinct hyper-parameter combinations were cross-validated with 10 folds, with a final accuracy rating of 95.71% on the best-performing model. Although the final CNN showed promising results, further improvements on the dataset and architecture are still required for it to be employed as the centerpiece of a real-time dynamic web page building solution, with clear improvements needed on the clarity of the segmentation heatmap.
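As a rough illustration of how a classification-only CNN can yield a segmentation heatmap, the sketch below computes a class activation map in NumPy. It assumes the common CAM setup in which the last convolutional feature maps are global-average-pooled and fed to a single linear classification layer; the function names, array shapes, and toy values are hypothetical and are not taken from the study's implementation. Clipping negatives and rescaling to [0, 1] is a visualization choice made here, not something the abstract specifies.

```python
import numpy as np


def classify_with_gap(feature_maps, weights):
    """Score each class from global-average-pooled feature maps.

    feature_maps: array of shape (K, H, W), the last conv layer's output.
    weights:      array of shape (C, K), the linear classifier weights.
    Returns an array of C class scores (before softmax).
    """
    pooled = feature_maps.mean(axis=(1, 2))  # shape (K,)
    return weights @ pooled                  # shape (C,)


def class_activation_map(feature_maps, class_weights):
    """Weighted sum of feature maps for one class, rescaled to [0, 1].

    class_weights: array of shape (K,), one row of the classifier weights.
    """
    # Sum over the channel axis: M_c(x, y) = sum_k w_k^c * f_k(x, y)
    cam = np.tensordot(class_weights, feature_maps, axes=([0], [0]))
    cam = np.maximum(cam, 0.0)  # assumption: keep only positive evidence
    peak = cam.max()
    return cam / peak if peak > 0 else cam


# Toy example: 2 channels, 2x2 spatial grid, 2 classes (all values made up).
features = np.zeros((2, 2, 2))
features[0, 0, 0] = 1.0  # channel 0 fires in the top-left cell
features[1, 1, 1] = 2.0  # channel 1 fires in the bottom-right cell
weights = np.eye(2)      # class c listens only to channel c

scores = classify_with_gap(features, weights)
predicted = int(np.argmax(scores))
heatmap = class_activation_map(features, weights[predicted])
```

In practice the heatmap would be upsampled to the input resolution and overlaid on the screenshot; only image-level class labels are needed to train the classifier, which is why no in-image annotation is required.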
Data availability
We used our own code and publicly and freely available databases.
Funding
The authors would like to thank the National Council for Scientific and Technological Development of Brazil, CNPq (grant numbers: 307958/2019-1-PQ, 307966/2019-4-PQ, 404659/2016-0-Univ, 405101/2016-3-Univ), and PRONEX 'Fundação Araucária' 042/2018.
Author information
Contributions
All authors contributed to this study and manuscript.
Ethics declarations
Ethics approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Consent to participate
We voluntarily agreed to participate in this research study.
Consent for publication
All the authors have given their consent for the manuscript to be published in this Journal.
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
About this article
Cite this article
Cizotto, A.A.J., de Souza, R.C.T., Mariani, V.C. et al. Web pages from mockup design based on convolutional neural network and class activation mapping. Multimed Tools Appl 82, 38771–38797 (2023). https://doi.org/10.1007/s11042-023-15108-3
DOI: https://doi.org/10.1007/s11042-023-15108-3