Bloat beneath Python’s Scales: A Fine-Grained Inter-Project Dependency Analysis

Published: 12 July 2024

Abstract

Modern programming languages promote software reuse via package managers that facilitate the integration of inter-dependent software libraries. Software reuse comes with the challenge of dependency bloat, which refers to unneeded and excessive code incorporated into a project through reused libraries. Such bloat introduces security risks and maintenance costs, increases storage requirements, and slows down load times. In this work, we conduct a large-scale, fine-grained analysis to understand bloated dependency code in the PyPI ecosystem. Our analysis is the first to focus on different granularity levels, including bloated dependencies, bloated files, and bloated methods. This allows us to identify the specific parts of a library that contribute to the bloat. To do so, we analyze the source code of 1,302 popular Python projects and their 3,232 transitive dependencies. For each project, we employ a state-of-the-art static analyzer and incrementally construct the fine-grained project dependency graph (FPDG), a representation that captures all inter-project dependencies at the method level. Our reachability analysis on the FPDG enables the assessment of bloated dependency code in terms of several aspects, including its prevalence in the PyPI ecosystem, its relation to software vulnerabilities, its root causes, and developer perception. Our key finding suggests that PyPI exhibits significant resource underutilization: more than 50% of dependencies are bloated. This rate gets worse when considering bloated dependency code at a finer-grained level, such as bloated files and bloated methods. Our fine-grained analysis also indicates that numerous vulnerabilities reside in bloated areas of utilized packages (15% of the defects existing in PyPI). Other major observations suggest that bloated code primarily stems from omissions during code refactoring processes and that developers are willing to debloat their code: out of the 36 submitted pull requests, developers accepted and merged 30, removing a total of 35 bloated dependencies. We believe that our findings can help researchers and practitioners come up with new debloating techniques and development practices to detect and avoid bloated code, ensuring that dependency resources are utilized efficiently.
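
To make the idea of reachability over a method-level dependency graph concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy graph whose nodes are (package, file, method) triples and whose edges are calls, treats every method of the project itself as an entry point, and reports dependency code that is never reached at the three granularities named above. The graph contents and the names `CALLS`, `reachable`, and `bloat_report` are hypothetical.

```python
from collections import deque

# Hypothetical toy graph: node = (package, file, method); edges are calls.
CALLS = {
    ("app", "main.py", "run"): [("libA", "core.py", "parse")],
    ("libA", "core.py", "parse"): [("libA", "core.py", "tokenize")],
    ("libA", "core.py", "tokenize"): [],
    ("libA", "extra.py", "render"): [("libB", "draw.py", "paint")],
    ("libB", "draw.py", "paint"): [],
}


def reachable(graph, roots):
    """Breadth-first search: return every node reachable from the roots."""
    seen = set(roots)
    queue = deque(roots)
    while queue:
        node = queue.popleft()
        for callee in graph.get(node, []):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


def bloat_report(graph, project):
    """Classify unreached dependency code as bloated (assumed definition)."""
    all_nodes = set(graph) | {c for callees in graph.values() for c in callees}
    # Assumption: every method of the project itself is an entry point.
    roots = [n for n in all_nodes if n[0] == project]
    used = reachable(graph, roots)
    dep_nodes = {n for n in all_nodes if n[0] != project}

    used_files = {(pkg, f) for pkg, f, _ in used}
    used_pkgs = {pkg for pkg, _, _ in used}
    return {
        "methods": dep_nodes - used,
        "files": {(pkg, f) for pkg, f, _ in dep_nodes} - used_files,
        "dependencies": {pkg for pkg, _, _ in dep_nodes} - used_pkgs,
    }


report = bloat_report(CALLS, "app")
```

In this toy graph, `libB` is bloated as a whole dependency, while `libA` is used but still carries a bloated file (`extra.py`) and a bloated method (`render`). This illustrates why a coarser granularity can understate how much dependency code goes unused.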

Published In

Proceedings of the ACM on Software Engineering, Volume 1, Issue FSE
July 2024
2770 pages
EISSN: 2994-970X
DOI: 10.1145/3554322
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery
New York, NY, United States

Author Tags

1. PyPI
2. Python
3. call graph
4. debloating
5. dependencies
6. software bloat

Qualifiers

• Research-article

Funding Sources

• European Union's Horizon 2020 research and innovation programme
• European Union's Horizon 2021 research and innovation programme
