DOI: 10.1145/2983990.2984020

Ringer: web automation by demonstration

Published: 19 October 2016

Abstract

With increasing amounts of data available on the web and a diverse range of users interested in programmatically accessing that data, web automation must become easier. Automation helps users complete many tedious interactions, such as scraping data, completing forms, or transferring data between websites. However, writing web automation scripts typically requires an expert programmer because the writer must be able to reverse engineer the target webpage. We have built a record and replay tool, Ringer, that makes web automation accessible to non-coders. Ringer takes a user demonstration as input and creates a script that interacts with the page as a user would. This approach makes Ringer scripts more robust to webpage changes because user-facing interfaces remain relatively stable compared to the underlying webpage implementations. We evaluated our approach on benchmarks recorded on real webpages and found that it replayed 4x more benchmarks than a state-of-the-art replay tool.
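
The robustness argument in the abstract can be illustrated with a toy model. The sketch below is not Ringer's algorithm or API: the `Node`, `Action`, `find_by_text`, and `replay` names are hypothetical, and the "page" is a small in-memory tree rather than a real browser DOM. It assumes a recorded demonstration is stored as a list of actions keyed by what the user sees (a visible label), and shows the same recording replaying against a page whose underlying structure has changed.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a page element (not a real DOM node)."""
    tag: str
    text: str = ""          # user-visible label
    value: str = ""         # typed content, if any
    children: list[Node] = field(default_factory=list)


@dataclass
class Action:
    """One recorded step of a demonstration (assumed format)."""
    kind: str               # "click" or "type"
    target_text: str        # visible label of the element the user acted on
    payload: str = ""       # text typed by the user, if any


def find_by_text(root: Node, text: str) -> Node | None:
    """Depth-first search for the first node whose visible text matches."""
    if root.text == text:
        return root
    for child in root.children:
        hit = find_by_text(child, text)
        if hit is not None:
            return hit
    return None


def replay(trace: list[Action], root: Node) -> list[str]:
    """Re-apply each recorded action to whichever node carries the same label."""
    log = []
    for action in trace:
        node = find_by_text(root, action.target_text)
        if node is None:
            raise LookupError(f"no element labelled {action.target_text!r}")
        if action.kind == "type":
            node.value = action.payload
        log.append(f"{action.kind}:{node.tag}:{action.target_text}")
    return log


# The page as it looked during the demonstration.
page_v1 = Node("body", children=[
    Node("input", text="Search"),
    Node("button", text="Go"),
])

# A redesigned page: different nesting and tags, same user-facing labels.
page_v2 = Node("body", children=[
    Node("div", children=[
        Node("div", children=[Node("input", text="Search")]),
        Node("a", text="Go"),
    ]),
])

# The recorded demonstration: type a query, then click the button.
trace = [Action("type", "Search", "ringer"), Action("click", "Go")]

log_v1 = replay(trace, page_v1)
log_v2 = replay(trace, page_v2)
```

A script keyed to a structural path, such as "second child of body", would stop working on `page_v2`, while the label-based lookup still finds both targets. Real pages need far more than exact label matching, so this only illustrates the intuition that user-facing interfaces change less than their implementations.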

    Published In

    OOPSLA 2016: Proceedings of the 2016 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications
    October 2016
    915 pages
    ISBN:9781450344449
    DOI:10.1145/2983990

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Automation
    2. Browser
    3. Javascript
    4. Record-Replay

    Qualifiers

    • Research-article

    Conference

    SPLASH '16

    Acceptance Rates

    Overall Acceptance Rate 268 of 1,244 submissions, 22%

    Cited By

    • (2024) Efficient Bottom-Up Synthesis for Programs with Local Variables. Proceedings of the ACM on Programming Languages 8:POPL (1540-1568). DOI: 10.1145/3632894. Online publication date: 5-Jan-2024
    • (2024) Data Formulator: AI-Powered Concept-Driven Visualization Authoring. IEEE Transactions on Visualization and Computer Graphics 30:1 (1128-1138). DOI: 10.1109/TVCG.2023.3326585. Online publication date: 1-Jan-2024
    • (2023) Task Automation Intelligent Agents: A Review. Future Internet 15:6 (196). DOI: 10.3390/fi15060196. Online publication date: 29-May-2023
    • (2023) ImageEye: Batch Image Processing using Program Synthesis. Proceedings of the ACM on Programming Languages 7:PLDI (686-711). DOI: 10.1145/3591248. Online publication date: 6-Jun-2023
    • (2023) Primary Building Blocks for Web Automation. Web Information Systems Engineering – WISE 2023 (376-386). DOI: 10.1007/978-981-99-7254-8_29. Online publication date: 21-Oct-2023
    • (2023) Streamlining Personal Data Access Requests: From Obstructive Procedures to Automated Web Workflows. Web Engineering (111-125). DOI: 10.1007/978-3-031-34444-2_9. Online publication date: 16-Jun-2023
    • (2022) WebRobot: web robotic process automation using interactive programming-by-demonstration. Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation (152-167). DOI: 10.1145/3519939.3523711. Online publication date: 9-Jun-2022
    • (2022) Landmarks and regions: a robust approach to data extraction. Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation (993-1009). DOI: 10.1145/3519939.3523705. Online publication date: 9-Jun-2022
    • (2022) RLBrowse: Generating Realistic Packet Traces with Reinforcement Learning. NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium (1-6). DOI: 10.1109/NOMS54207.2022.9789851. Online publication date: 25-Apr-2022
    • (2022) EqFix: Fixing LaTeX Equation Errors by Examples. Dependable Software Engineering. Theories, Tools, and Applications (106-124). DOI: 10.1007/978-3-031-21213-0_7. Online publication date: 11-Dec-2022