How not to structure your database-backed web applications: a study of performance bugs in the wild

Published: 27 May 2018

Abstract

Many web applications use databases for persistent data storage, and using Object Relational Mapping (ORM) frameworks is a common way to develop such database-backed web applications. Unfortunately, developing efficient ORM applications is challenging, as the ORM framework hides the underlying database query generation and execution. This problem is becoming more severe as these applications need to process an increasingly large amount of persistent data. Recent research has targeted specific aspects of performance problems in ORM applications. However, no systematic study has identified the common performance anti-patterns in such real-world applications, how they affect application performance, or how to remedy them.
In this paper, we try to answer these questions through a comprehensive study of 12 representative real-world ORM applications. We generalize 9 ORM performance anti-patterns from more than 200 performance issues that we obtain by studying the applications' bug-tracking systems and profiling their latest versions. To demonstrate the impact of these anti-patterns, we manually fix 64 performance issues in the latest versions and obtain a median speedup of 2× (up to 39×) with fewer than 5 lines of code changed in most cases. Many of the issues we found have been confirmed by developers, and we have implemented ways to identify other code fragments with similar issues as well.

    Published In

    ICSE '18: Proceedings of the 40th International Conference on Software Engineering
    May 2018
    1307 pages
    ISBN: 9781450356381
    DOI: 10.1145/3180155
    • Conference Chair: Michel Chaudron
    • General Chair: Ivica Crnkovic
    • Program Chairs: Marsha Chechik, Mark Harman
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. bug study
    2. database-backed applications
    3. object-relational mapping frameworks
    4. performance anti-patterns

    Qualifiers

    • Research-article

    Conference

    ICSE '18
    Acceptance Rates

    Overall Acceptance Rate 276 of 1,856 submissions, 15%

    Article Metrics

    • Downloads (last 12 months): 45
    • Downloads (last 6 weeks): 6
    Reflects downloads up to 05 Mar 2025.

    Cited By

    • (2025) Analyzing the adoption of database management systems throughout the history of open source projects. Empirical Software Engineering 30:3. DOI: 10.1007/s10664-025-10627-z. Online publication date: 22-Feb-2025
    • (2024) An Empirical Study on the Characteristics of Database Access Bugs in Java Applications. ACM Transactions on Software Engineering and Methodology 33:7 (1-25). DOI: 10.1145/3672449. Online publication date: 13-Jun-2024
    • (2024) Beyond Functional Correctness: An Exploratory Study on the Time Efficiency of Programming Assignments. Proceedings of the 46th International Conference on Software Engineering: Software Engineering Education and Training (320-330). DOI: 10.1145/3639474.3640065. Online publication date: 14-Apr-2024
    • (2024) WeBridge: Synthesizing Stored Procedures for Large-Scale Real-World Web Applications. Proceedings of the ACM on Management of Data 2:1 (1-29). DOI: 10.1145/3639319. Online publication date: 26-Mar-2024
    • (2024) Ad Hoc Transactions through the Looking Glass: An Empirical Study of Application-Level Transactions in Web Applications. ACM Transactions on Database Systems 49:1 (1-43). DOI: 10.1145/3638553. Online publication date: 28-Feb-2024
    • (2024) An Adaptive Logging System (ALS): Enhancing Software Logging with Reinforcement Learning Techniques. Proceedings of the 15th ACM/SPEC International Conference on Performance Engineering (37-47). DOI: 10.1145/3629526.3645033. Online publication date: 7-May-2024
    • (2024) SMEAGOL: A Static Code Smell Detector for MongoDB. 2024 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) (816-820). DOI: 10.1109/SANER60148.2024.00088. Online publication date: 12-Mar-2024
    • (2024) A Multivocal Mapping Study of MongoDB Smells. 2024 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) (792-803). DOI: 10.1109/SANER60148.2024.00086. Online publication date: 12-Mar-2024
    • (2024) ROBUST: 221 bugs in the Robot Operating System. Empirical Software Engineering 29:3. DOI: 10.1007/s10664-024-10440-0. Online publication date: 23-Mar-2024
    • (2023) Towards Auto-Generated Data Systems. Proceedings of the VLDB Endowment 16:12 (4116-4129). DOI: 10.14778/3611540.3611635. Online publication date: 1-Aug-2023
