Abstract
The web has grown relentlessly over the last decade, increasing the demand for extracting information from it. This growth has also brought the need to adapt web pages to user requirements, create annotations, and test web applications. As web pages have evolved, these techniques have become more complex to implement. Being able to test, annotate, adapt, and extract information from web pages correctly and efficiently has become a primary task. All of these tasks require mechanisms that locate the desired elements effectively and unequivocally throughout the web page life cycle, especially as a page evolves. The mechanisms used to find web nodes, called locators, are prone to fail over time owing to changes on websites. Many authors extend the life expectancy of locators by developing algorithms that combine different types of locators. Others have created algorithms that regenerate locators by saving extra information about the previous structure of the website. These algorithms extend the useful life of locators, but at a much higher computational and storage cost. To avoid these problems, we have designed an algorithm that employs an attribute system embedded in the HTML code. The algorithm regenerates the locators from these attributes every time a single change takes place in a given element attribute. The evaluation of the proposal shows a much lower computational cost than in previous works.
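The general idea of regenerating a structure-based locator from an attribute embedded in the HTML can be illustrated with a minimal sketch. It is not the algorithm evaluated in the paper: the attribute name `data-locator-id`, the helper functions, and the two sample pages are assumptions made only for illustration, and the pages are kept well-formed so that Python's standard `xml.etree.ElementTree` parser can read them.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation attribute; the paper's actual attribute system may differ.
ANNOTATION = "data-locator-id"


def absolute_xpath(root, target):
    """Build a positional XPath such as /html/body/div[2]/a from root to target."""
    # ElementTree has no parent pointers, so build a child -> parent map first.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    steps = []
    node = target
    while node is not root:
        parent = parent_of[node]
        same_tag = [c for c in parent if c.tag == node.tag]
        if len(same_tag) > 1:
            # XPath positions are 1-based.
            steps.append(f"{node.tag}[{same_tag.index(node) + 1}]")
        else:
            steps.append(node.tag)
        node = parent
    steps.append(root.tag)
    return "/" + "/".join(reversed(steps))


def regenerate_locator(page_source, annotation_value):
    """Find the annotated element and rebuild its structural locator."""
    root = ET.fromstring(page_source)
    for element in root.iter():
        if element.get(ANNOTATION) == annotation_value:
            return absolute_xpath(root, element)
    return None  # the annotation is no longer present on the page


# Two versions of the same (made-up) page: the second adds a banner and a paragraph.
OLD_PAGE = (
    "<html><body><div><a data-locator-id='buy'>Buy</a></div></body></html>"
)
NEW_PAGE = (
    "<html><body><div>Banner</div>"
    "<div><p>Intro</p><a data-locator-id='buy'>Buy</a></div></body></html>"
)

print(regenerate_locator(OLD_PAGE, "buy"))  # /html/body/div/a
print(regenerate_locator(NEW_PAGE, "buy"))  # /html/body/div[2]/a
```

In this sketch the annotation survives the structural change, so the positional locator can be rebuilt from the current page alone, without storing a copy of the previous page structure.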
Acknowledgments
This work was carried out by the Software and Systems Engineering research group of Mondragon Unibertsitatea (IT1326-19), supported by the Department of Education, Universities and Research of the Basque Government.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Aldalur, I., Larrinaga, F., Perez, A. (2020). ABLA: An Algorithm for Repairing Structure-Based Locators Through Attribute Annotations. In: Huang, Z., Beek, W., Wang, H., Zhou, R., Zhang, Y. (eds) Web Information Systems Engineering – WISE 2020. Lecture Notes in Computer Science, vol 12343. Springer, Cham. https://doi.org/10.1007/978-3-030-62008-0_7
DOI: https://doi.org/10.1007/978-3-030-62008-0_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-62007-3
Online ISBN: 978-3-030-62008-0
eBook Packages: Computer Science, Computer Science (R0)