Part of the book series: Lecture Notes in Business Information Processing (LNBIP, volume 527)

Abstract

The analysis required to execute Robotic Process Automation (RPA) projects increasingly relies on monitoring user activities through Robotic Process Mining (RPM) techniques. Traditional approaches use loggers that record UI logs, i.e., sequences of events that include data from (1) the keyboard, (2) the mouse, and (3) the application elements, such as the name, the Excel cell, or the clicked button. Although the latter is highly relevant for identifying the activity being performed, it is not accessible in virtualized environments, where only screenshot data is available. In such environments, activities must be identified from screenshots alone. A significant challenge with screenshot-based identification is its sensitivity to minor interface changes, such as different zoom levels or notifications, which can cause detection failures. To address this, we propose a novel approach that, first, integrates embeddings from both screenshots and screen text obtained through OCR and, second, clusters the UI log events using these combined features to identify the activity. Our results show that this method improves activity identification over current state-of-the-art techniques, with promising gains in accuracy and reliability.
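The two steps named in the abstract (combining screenshot and screen-text embeddings, then clustering the UI log events) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the embeddings are hand-made toy vectors rather than the output of a vision model or an OCR pipeline, the helper names (`l2_normalize`, `combine_features`, `agglomerative`) are hypothetical, and the clustering is a small pure-Python average-linkage routine standing in for a library implementation such as the scikit-learn agglomerative clustering listed in the Notes. It is not the chapter's implementation.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so both modalities weigh equally."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def combine_features(image_emb, text_emb):
    """Concatenate the normalised screenshot and screen-text embeddings."""
    return l2_normalize(image_emb) + l2_normalize(text_emb)


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerative(points, n_clusters):
    """Toy average-linkage agglomerative clustering; returns one label per point."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None  # (distance, index_i, index_j) of the closest pair of clusters
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                total = sum(
                    euclidean(points[a], points[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                dist = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or dist < best[0]:
                    best = (dist, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for idx in members:
            labels[idx] = label
    return labels


# Hypothetical UI log with four events. Events 0 and 1 belong to activity A,
# but event 1 was captured at a different zoom level, so its (toy) screenshot
# embedding drifts towards activity B. Events 2 and 3 belong to activity B.
image_embs = [[1.0, 0.0], [0.6, 0.8], [0.0, 1.0], [0.1, 0.9]]
# Toy screen-text embeddings: the visible text is unchanged by the zoom.
text_embs = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]

image_only = [l2_normalize(v) for v in image_embs]
combined = [combine_features(i, t) for i, t in zip(image_embs, text_embs)]

labels_image_only = agglomerative(image_only, n_clusters=2)
labels_combined = agglomerative(combined, n_clusters=2)

print(labels_image_only)  # [0, 1, 1, 1]: the zoomed event is grouped with activity B
print(labels_combined)    # [0, 0, 1, 1]: the shared screen text restores the grouping
```

In this toy setting the screenshot-only features misplace the zoomed event, while the concatenated features recover the intended grouping. The numbers are constructed to show that effect and say nothing about the accuracy reported in the chapter.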

This research was supported by the EQUAVEL project PID2022-137646OB-C31, funded by MICIU/AEI/10.13039/501100011033 and by FEDER, UE; the DISCOVERY project (2021/C005/00148631), funded by Unión Europea NextGeneration EU and “Plan de Recuperación, Transformación y Resiliencia” of the Ministry of Economic and Digital Transformation; and the grant FPU20/05984 funded by MICIU/AEI/10.13039/501100011033 and by FSE+.

Notes

  1. ScreenRPA Framework: https://github.com/RPA-US/screenrpa.

  2. Approach source code: https://github.com/RPA-US/processdiscovery.

  3. Agglomerative Clustering in Scikit-learn: https://scikit-learn.org/stable/modules/generated/sklearn.cluster.AgglomerativeClustering.html.

  4. Evaluation input and output data: https://doi.org/10.5281/zenodo.11368319.

  5. Cenit S.L: https://www.cenitcon.com/.

  6. Odoo is an open-source ERP system accessible at https://odoo.com.

Author information

Correspondence to A. Martínez-Rojas.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Martínez-Rojas, A., Alonso-Rocha, J.L., Jiménez-Ramírez, A., Enríquez, J.G. (2024). From Screenshots to Process Models: Improving Activity Identification Through Screen Text. In: Di Ciccio, C., et al. Business Process Management: Blockchain, Robotic Process Automation, Central and Eastern European, Educators and Industry Forum. BPM 2024. Lecture Notes in Business Information Processing, vol 527. Springer, Cham. https://doi.org/10.1007/978-3-031-70445-1_8

  • DOI: https://doi.org/10.1007/978-3-031-70445-1_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-70444-4

  • Online ISBN: 978-3-031-70445-1

  • eBook Packages: Computer Science, Computer Science (R0)
