Abstract
The analysis required to execute Robotic Process Automation (RPA) projects increasingly relies on monitoring user activities through Robotic Process Mining (RPM) techniques. Traditional approaches use loggers to record UI logs, i.e., sequences of events that include data from (1) the keyboard, (2) the mouse, and (3) the application elements, such as the application name, the Excel cell, or the clicked button. Although application-element data is highly relevant for identifying the activity being performed, it is not accessible in virtualized environments, where only screenshot data is available. Activities must therefore be identified from screenshots alone. A significant challenge with screenshot-based identification is its sensitivity to minor interface changes, such as different zoom levels or notifications, which can cause detection failures. To address this, we propose a novel approach that, first, integrates embeddings from both screenshots and screen text obtained through OCR and, second, clusters the UI log events using these combined features to identify the activity. Our results show that this method improves activity identification, outperforms current state-of-the-art techniques, and yields promising improvements in accuracy and reliability.
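The two steps described above (combining screenshot and screen-text embeddings, then clustering the UI log events) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy vectors stand in for real image and OCR-text embeddings, the helper names `combine_features` and `agglomerative_cluster` are hypothetical, and the naive average-linkage procedure is a stand-in rather than the authors' implementation. In practice a library implementation such as scikit-learn's AgglomerativeClustering would be the more practical choice.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so both modalities weigh equally."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def combine_features(image_emb, text_emb):
    """Concatenate the normalized screenshot and screen-text embeddings."""
    return l2_normalize(image_emb) + l2_normalize(text_emb)


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerative_cluster(points, n_clusters):
    """Naive average-linkage agglomerative clustering (illustrative only).

    Starts with one cluster per point and repeatedly merges the two
    clusters with the smallest mean pairwise distance.
    Returns one integer label per input point.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                total = sum(
                    euclidean(points[a], points[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                dist = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or dist < best[0]:
                    best = (dist, i, j)
        _, i, j = best
        clusters[i].extend(clusters[j])
        del clusters[j]
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for idx in members:
            labels[idx] = label
    return labels


# Hypothetical UI log events: (screenshot embedding, OCR-text embedding).
events = [
    ([0.90, 0.10, 0.00], [1.00, 0.00]),  # login screen
    ([0.88, 0.12, 0.00], [0.95, 0.05]),  # login screen, different zoom
    ([0.10, 0.90, 0.10], [0.00, 1.00]),  # invoice form
    ([0.20, 0.85, 0.10], [0.05, 1.00]),  # invoice form with a notification
]

features = [combine_features(img, txt) for img, txt in events]
labels = agglomerative_cluster(features, n_clusters=2)
print(labels)
```

With these toy vectors, the two login-like events share one label and the two invoice-like events share the other. Normalizing each embedding before concatenation is one simple way to keep either modality from dominating the distance; the number of clusters is fixed by hand here and would need to be chosen by some quality criterion in a real setting.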
This research was supported by the EQUAVEL project PID2022-137646OB-C31, funded by MICIU/AEI/10.13039/501100011033 and by FEDER, UE; the DISCOVERY project (2021/C005/00148631), funded by Unión Europea NextGeneration EU and “Plan de Recuperación, Transformación y Resiliencia” of the Ministry of Economic and Digital Transformation; and the grant FPU20/05984 funded by MICIU/AEI/10.13039/501100011033 and by FSE+.
Notes
1. ScreenRPA Framework: https://github.com/RPA-US/screenrpa.
2. Approach source code in: https://github.com/RPA-US/processdiscovery.
3. Agglomerative Clustering Scikit-learn: https://scikit-learn.org/stable/modules/generated/sklearn.cluster.AgglomerativeClustering.html.
4. Evaluation input and output data: https://doi.org/10.5281/zenodo.11368319.
5. Cenit S.L: https://www.cenitcon.com/.
6. Odoo is an open-source ERP system accessible at https://odoo.com.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Martínez-Rojas, A., Alonso-Rocha, J.L., Jiménez-Ramírez, A., Enríquez, J.G. (2024). From Screenshots to Process Models: Improving Activity Identification Through Screen Text. In: Di Ciccio, C., et al. Business Process Management: Blockchain, Robotic Process Automation, Central and Eastern European, Educators and Industry Forum. BPM 2024. Lecture Notes in Business Information Processing, vol 527. Springer, Cham. https://doi.org/10.1007/978-3-031-70445-1_8
DOI: https://doi.org/10.1007/978-3-031-70445-1_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-70444-4
Online ISBN: 978-3-031-70445-1
eBook Packages: Computer Science, Computer Science (R0)