DAPP: automatic detection and analysis of prototype pollution vulnerability in Node.js modules

Kim, Hee Yeon; Kim, Ji Hoon; Oh, Ho Kyun; Lee, Beom Jin; Mun, Si Woo; Shin, Jeong Hoon; Kim, Kyounggon

doi:10.1007/s10207-020-00537-0

regular contribution
Published: 13 February 2021

Volume 21, pages 1–23 (2022)
International Journal of Information Security


Abstract

The safe maintenance of Node.js modules is critical in the software security industry. Most server-side web applications are built on Node.js, an environment that is highly dependent on modules. However, there is a clear lack of research on Node.js module security. This study focuses on prototype pollution, an emerging vulnerability type that has also not been studied widely. The main goal of this paper is to propose patterns that can identify prototype pollution vulnerabilities. We developed an automatic static analysis tool called DAPP, which targets all the real-world modules registered in the Node Package Manager. DAPP can discover the proposed patterns in each Node.js module in a matter of a few seconds; it performs and integrates static analyses based on the abstract syntax tree and the control flow graph. This study suggests an improved and efficient analysis methodology. We conducted multiple empirical tests to evaluate and compare our methodology with previous analysis tools, and we found that our tool is exhaustive and works well with modern JavaScript syntax. Our research demonstrates how DAPP found over 37 previously undiscovered prototype pollution vulnerabilities among the 30,000 most downloaded Node.js modules. To evaluate DAPP, we expanded the experiment and ran our tool on 100,000 Node.js modules. The evaluation results show a high level of performance for DAPP, along with the root causes of its false positives and false negatives. Finally, we reported each of the 37 vulnerabilities and obtained 24 CVE IDs, most with a CVSS score of 9.8.

Availability of data and material

The data and material that support the findings of this study are available from the corresponding author, Kyounggon Kim, upon reasonable request.

Code availability

The data and material that support the findings of this study are available from the corresponding author, Kyounggon Kim, upon reasonable request.

Acknowledgements

This work was supported as part of the Next Generation Security Leader Training Program (Best of the Best) funded by Korea Information Technology Research Institute (KITRI).

Author information

Authors and Affiliations

Department of Cyber Defense, Korea University, Seoul, Republic of Korea
Hee Yeon Kim, Ji Hoon Kim & Ho Kyun Oh
Hayyim Security, Seoul, Republic of Korea
Beom Jin Lee
Alice&Mallory, Seoul, Republic of Korea
Si Woo Mun
THEORI, Seoul, Republic of Korea
Jeong Hoon Shin
Department of Forensic Sciences, Naif Arab University for Security Sciences, Riyadh, Kingdom of Saudi Arabia
Kyounggon Kim

Contributions

All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by H.Y.K., J.H.K., H.K.O., B.J.L. and S.W.M. The first draft of the manuscript was written by H.Y.K. and J.H.K., and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Kyounggon Kim.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Informed consent

This article does not contain any studies with human participants.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A: Mitigations for prototype pollution vulnerability

After we reported those 37 vulnerabilities through NPM, some of the modules were patched to prevent prototype pollution attacks. This process helped us organize the mitigation techniques that are available today. This section summarizes how to mitigate prototype pollution vulnerabilities.

Appendix A.1: Use Object.hasOwnProperty

The “Object.hasOwnProperty” method (strictly, “Object.prototype.hasOwnProperty”) checks whether the target object itself has a given property, ignoring properties inherited through the prototype chain. Thus, even if a key such as “constructor” is supplied, the inherited reference is not followed. The “in” operator performs a similar check but also returns true for inherited properties, so it does not prevent prototype references. The difference between “Object.hasOwnProperty” and the “in” operator is shown in Listing 21.
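
The contrast between the two checks can be sketched as follows. This is a minimal illustration rather than the paper's listing; the object `obj`, the helper `hasOwn`, and the result names are assumptions made for the example.

```javascript
// Illustrative sketch: `in` follows the prototype chain, hasOwnProperty does not.
const obj = { name: "dapp" };

const viaIn = {
  name: "name" in obj,               // true: own property
  constructor: "constructor" in obj, // true: inherited from Object.prototype
  proto: "__proto__" in obj,         // true: inherited accessor on Object.prototype
};

// Call the method through Object.prototype so the check also works for
// objects that shadow or lack their own hasOwnProperty.
const hasOwn = (o, k) => Object.prototype.hasOwnProperty.call(o, k);

const viaHasOwn = {
  name: hasOwn(obj, "name"),               // true: own property
  constructor: hasOwn(obj, "constructor"), // false: only inherited
  proto: hasOwn(obj, "__proto__"),         // false: only inherited
};

console.log(viaIn, viaHasOwn);
```

Calling the method via `Object.prototype.hasOwnProperty.call` is a common defensive choice, since user-controlled objects may define their own `hasOwnProperty` key.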

Algorithm 4 shows how to mitigate prototype pollution by using the “Object.hasOwnProperty” function. However, this method cannot be used when the operation must set new properties that did not originally exist on the target object.
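
A guard of this kind might look like the sketch below. It is not the paper's Algorithm 4; the function `safeUpdate` and the sample payload are hypothetical, and the sketch assumes that updates are only allowed for keys the target already owns.

```javascript
// Hypothetical helper: updates only keys that the target already owns.
function safeUpdate(target, source) {
  for (const key of Object.keys(source)) {
    // Skips inherited names such as "__proto__" and "constructor",
    // and, as a side effect, every key that is new to the target.
    if (!Object.prototype.hasOwnProperty.call(target, key)) {
      continue;
    }
    const current = target[key];
    const incoming = source[key];
    if (
      current !== null && typeof current === "object" &&
      incoming !== null && typeof incoming === "object"
    ) {
      safeUpdate(current, incoming); // recurse into nested objects
    } else {
      target[key] = incoming;
    }
  }
  return target;
}

const config = { db: { host: "localhost" } };
// JSON.parse creates "__proto__" as an ordinary own key of the payload.
const payload = JSON.parse(
  '{"__proto__": {"polluted": true}, "db": {"host": "example.org"}, "extra": 1}'
);
safeUpdate(config, payload);
```

In this sketch the existing `db.host` value is updated, the `__proto__` entry is ignored, and the new key `extra` is dropped, which illustrates the limitation described above.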

Appendix A.2: Make “__proto__” empty

Objects in JavaScript are typically initialized in the following ways.

However, when an object is initialized this way, the “__proto__” property of “obj” in Listing 22 refers to the prototype of the constructor Object. If an attacker contaminates the “__proto__” property of “obj,” “Object.prototype” is also contaminated. Since the global object also inherits from “Object.prototype,” other contexts will see the contaminated property, which can produce a prototype pollution vulnerability.
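
The effect can be reproduced with a short sketch. The variable names and the property name `polluted` are illustrative assumptions; the write through `obj.__proto__` stands in for an attacker-controlled assignment, and the last step removes the property again so the demonstration has no lasting effect.

```javascript
// Illustrative only: a typical object literal inherits from Object.prototype.
const obj = {};
const sharesPrototype = Object.getPrototypeOf(obj) === Object.prototype;

// Writing through obj.__proto__ therefore modifies Object.prototype itself.
obj.__proto__.polluted = "yes"; // simulated attacker-controlled write

const other = {};               // an unrelated object created afterwards
const leaked = other.polluted;  // inherited from the contaminated prototype

delete Object.prototype.polluted;       // clean up the demonstration
const afterCleanup = ({}).polluted;     // no longer present

console.log(sharesPrototype, leaked, afterCleanup);
```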

The fundamental problem here is that the “__proto__” of an object literal refers to “Object.prototype,” so contaminating “__proto__” also pollutes other objects. Therefore, by making “__proto__” empty, the developer can remove the reference to “Object.prototype.” Listing 23 and Listing 24 show how to initialize “__proto__” to null.
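
Two standard ways to create an object without a prototype are sketched below. This is an illustration under assumed variable names, not a reproduction of the paper's listings.

```javascript
// Two standard ways to create an object whose prototype is null.
const a = Object.create(null);
const b = { __proto__: null };

// Without a prototype there is no inherited "__proto__" accessor, so this
// assignment creates an ordinary own property and never reaches Object.prototype.
a["__proto__"] = { polluted: true };

const aProto = Object.getPrototypeOf(a); // still null
const bProto = Object.getPrototypeOf(b); // null
const leaked = ({}).polluted;            // other objects are unaffected

console.log(aProto, bProto, leaked);
```

One trade-off of this approach is that such objects also lack inherited helpers like `toString` and `hasOwnProperty`, so code that receives them has to call those through `Object.prototype` explicitly.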

Appendix A.3: Filtering by Keyname

This method prevents prototype pollution by filtering out keys with certain names, such as “prototype,” “__proto__,” and “constructor.”

Algorithm 5 shows how to mitigate prototype pollution by setting a keyword filter. Most of the patched modules take this approach. A mitigation based on keyword filters helps prevent pollution regardless of the execution environment.
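
A keyword filter applied to a recursive merge could be sketched as follows. This is not the paper's Algorithm 5; `DISALLOWED_KEYS`, `filteredMerge`, and the sample input are hypothetical names chosen for the example.

```javascript
// Hypothetical deny list built from the key names mentioned in the text.
const DISALLOWED_KEYS = new Set(["__proto__", "prototype", "constructor"]);

// Hypothetical recursive merge that drops disallowed keys at every level.
function filteredMerge(target, source) {
  for (const key of Object.keys(source)) {
    if (DISALLOWED_KEYS.has(key)) {
      continue; // never copy or descend into a filtered key
    }
    const value = source[key];
    if (value !== null && typeof value === "object" && !Array.isArray(value)) {
      if (target[key] === null || typeof target[key] !== "object") {
        target[key] = {};
      }
      filteredMerge(target[key], value);
    } else {
      target[key] = value;
    }
  }
  return target;
}

const input = JSON.parse(
  '{"a": {"b": 1}, "__proto__": {"polluted": true},' +
  ' "constructor": {"prototype": {"polluted2": true}}}'
);
const merged = filteredMerge({}, input);
```

Unlike the own-property check, the filter still allows new keys to be added, which may explain why it suits general-purpose merge and set utilities.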

Listing 25 shows the code actually used to prevent prototype pollution in the “dot-prop” module. We reported the vulnerability on Oct 14, 2019, and the patched source code was committed on Oct 23, 2019. The vendor added a function that checks for disallowed keys while splitting the object path written in dot notation.
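
The general idea of rejecting a path during splitting can be sketched as below. This is an assumption-laden illustration and not the code of the “dot-prop” module: the names `DISALLOWED`, `splitPath`, and `setByPath` are hypothetical, and the sketch splits on every dot without handling escaped dots.

```javascript
// Hypothetical deny list; the real module's list may differ.
const DISALLOWED = ["__proto__", "prototype", "constructor"];

// Splits a dotted path and rejects it entirely if any segment is disallowed.
function splitPath(path) {
  const segments = path.split(".");
  return segments.some((s) => DISALLOWED.includes(s)) ? [] : segments;
}

// Hypothetical setter: creates intermediate objects and assigns the value.
function setByPath(object, path, value) {
  const segments = splitPath(path);
  if (segments.length === 0) {
    return object; // rejected path: leave the object untouched
  }
  let current = object;
  for (let i = 0; i < segments.length - 1; i++) {
    const key = segments[i];
    if (current[key] === null || typeof current[key] !== "object") {
      current[key] = {};
    }
    current = current[key];
  }
  current[segments[segments.length - 1]] = value;
  return object;
}

const ok = setByPath({}, "a.b", 1);
setByPath({}, "__proto__.polluted", true);
setByPath({}, "constructor.prototype.polluted", true);
```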

Appendix A.4: Prototype freezing

“Object.freeze” is a native JavaScript function that makes an object read-only. Applying this function to “Object.prototype” makes the properties of “Object.prototype” unchangeable, as shown in Listing 26.

However, this method can cause serious errors when used with libraries that need to modify object prototypes. Therefore, it may not be applicable in all situations.
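
Both the protection and the compatibility cost can be observed in a short sketch. The variable names are illustrative assumptions; `Reflect.set` is used because it reports a failed write as `false` instead of throwing, so the outcome does not depend on strict mode. Freezing cannot be undone within the same process.

```javascript
// Freeze the shared prototype (irreversible for the rest of the process).
Object.freeze(Object.prototype);
const frozen = Object.isFrozen(Object.prototype);

// An attempted pollution now fails: no new property can be added.
const writeSucceeded = Reflect.set(Object.prototype, "polluted", true);
const leaked = ({}).polluted;

// Side effect: ordinary assignment can no longer shadow an inherited,
// now read-only property such as toString on a plain object.
const overrideWorked = Reflect.set({}, "toString", () => "custom");

console.log(frozen, writeSucceeded, leaked, overrideWorked);
```

The last check hints at why some libraries break: code that assigns `obj.toString = ...` on a plain object fails silently in non-strict code and throws a `TypeError` in strict code once the prototype is frozen.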

About this article

Cite this article

Kim, H.Y., Kim, J.H., Oh, H.K. et al. DAPP: automatic detection and analysis of prototype pollution vulnerability in Node.js modules. Int. J. Inf. Secur. 21, 1–23 (2022). https://doi.org/10.1007/s10207-020-00537-0

Published: 13 February 2021
Issue Date: February 2022
DOI: https://doi.org/10.1007/s10207-020-00537-0
