Vulnerability Detection for Smart Contract via Backward Bayesian Active Learning

  • Conference paper
Applied Cryptography and Network Security Workshops (ACNS 2022)

Abstract

A smart contract is program code that runs on a blockchain and aims to enable trusted transactions without third parties. In recent years, smart contract vulnerabilities have emerged one after another, causing huge economic losses. Machine learning is widely used for smart contract vulnerability detection. However, model training typically requires a large amount of labeled data, and in this field unlabeled data is abundant while labels are extremely difficult to acquire. Labeling a vulnerability takes considerable manpower and time, which makes effective smart contract vulnerability detection challenging. To tackle this problem, we propose BwdBAL, a novel framework for smart contract vulnerability detection that combines Bayesian Active Learning (BAL) with a backward noise removal method. We use BAL to remove the impact of model uncertainty on uncertainty sampling in active learning. During the backward process, we clean up noise in the labeled dataset to reduce its negative influence on the classification model. We evaluate BwdBAL on 8 vulnerabilities across 4929 smart contracts with four performance indicators. The experimental results show that BwdBAL outperforms two baseline methods: a conventional machine learning-based classification method and a one-way active learning method.
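
The two steps named in the abstract can be illustrated with a minimal sketch. It is not the BwdBAL implementation: the abstract does not say which acquisition function or noise criterion the framework uses, so the BALD-style mutual-information score, the confidence threshold, and every function name below (`bald_scores`, `select_queries`, `flag_noisy_labels`) are assumptions chosen for illustration. The input is assumed to be class probabilities from several stochastic forward passes, for example Monte Carlo dropout.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def bald_scores(mc_probs):
    """Mutual information between prediction and model parameters.

    mc_probs[t][i] is the class-probability list for sample i on
    stochastic pass t (assumed input format, not from the paper).
    A high score means the passes disagree, i.e. model uncertainty.
    """
    n_passes = len(mc_probs)
    n_samples = len(mc_probs[0])
    scores = []
    for i in range(n_samples):
        per_pass = [mc_probs[t][i] for t in range(n_passes)]
        n_classes = len(per_pass[0])
        mean = [sum(p[c] for p in per_pass) / n_passes for c in range(n_classes)]
        expected_entropy = sum(entropy(p) for p in per_pass) / n_passes
        scores.append(entropy(mean) - expected_entropy)
    return scores


def select_queries(mc_probs, k):
    """Forward step: indices of the k unlabeled samples with the highest score."""
    scores = bald_scores(mc_probs)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def flag_noisy_labels(mean_probs, labels, threshold=0.9):
    """Backward step (hypothetical criterion): indices of labeled samples
    whose given label contradicts a prediction made with confidence of at
    least `threshold`. These are candidates for relabeling or removal."""
    flagged = []
    for i, (probs, label) in enumerate(zip(mean_probs, labels)):
        predicted = max(range(len(probs)), key=lambda c: probs[c])
        if predicted != label and probs[predicted] >= threshold:
            flagged.append(i)
    return flagged


if __name__ == "__main__":
    # Two stochastic passes over three unlabeled contracts, binary labels.
    mc_probs = [
        [[0.9, 0.1], [0.5, 0.5], [0.9, 0.1]],
        [[0.9, 0.1], [0.5, 0.5], [0.1, 0.9]],
    ]
    print(select_queries(mc_probs, 1))
    print(flag_noisy_labels([[0.95, 0.05], [0.6, 0.4], [0.02, 0.98]], [1, 1, 1]))
```

In the toy data, the second contract is ambiguous on every pass and the third is the only one on which the passes disagree, so only the third receives a positive score. Plain predictive entropy would rank the two equally, which is the distinction a Bayesian acquisition function is meant to capture.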

J. Zhang and L. Tu contributed equally to this work.

Acknowledgement

This work is supported by the National Natural Science Foundation of China (No. 61972335, No. 61872312, No. 62002309), the Six Talent Peaks Project in Jiangsu Province (No. RJFW-053), the Jiangsu “333” Project, the Open Funds of State Key Laboratory for Novel Software Technology of Nanjing University (No. KFKT2020B15, No. KFKT2020B16), the Future Network Scientific Research Fund Project (FNSRFP-2021-YB-47), the Yangzhou city-Yangzhou University Science and Technology Cooperation Fund Project (YZ2021157, YZ2021158), the Key Laboratory of Safety-Critical Software Ministry of Industry and Information Technology (No. NJ2020022), the Natural Science Research Project of Universities in Jiangsu Province (No. 20KJB520024), and the Yangzhou University Top-level Talents Support Program (2019).

Author information

Corresponding authors

Correspondence to Jie Cai or Xiaobing Sun.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Zhang, J. et al. (2022). Vulnerability Detection for Smart Contract via Backward Bayesian Active Learning. In: Zhou, J., et al. Applied Cryptography and Network Security Workshops. ACNS 2022. Lecture Notes in Computer Science, vol 13285. Springer, Cham. https://doi.org/10.1007/978-3-031-16815-4_5

  • DOI: https://doi.org/10.1007/978-3-031-16815-4_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-16814-7

  • Online ISBN: 978-3-031-16815-4

  • eBook Packages: Computer Science, Computer Science (R0)
