Abstract
Quality-driven design decisions are often addressed with architectural tactics, which are reusable solution options for specific quality concerns. Creating traceability links for these tactics is useful but costly. Automating link creation can reduce costs, but it is challenging because simple structural analyses yield only limited results. Transfer-learning approaches using language models such as BERT are a recent trend in natural language processing and yield state-of-the-art results for tasks like text classification. In this paper, we experiment with treating the detection of architectural tactics in code as a text classification problem and present an approach that detects tactics by fine-tuning BERT. A 10-fold cross-validation shows promising results with an average \(F_1\)-score of 90%, which is on par with state-of-the-art approaches. We additionally apply our approach to a case study, where it shows promising potential but falls behind the state of the art. We therefore discuss the approach, potential reasons for this gap, its downsides, and future work.
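To make the evaluation protocol concrete, the following minimal sketch shows how an average \(F_1\)-score over a 10-fold cross-validation can be computed for a binary "tactic present / absent" classifier. It is an illustration under stated assumptions, not the implementation used in the paper: the function names, the contiguous fold split, the toy snippets, and the keyword-based `train_keyword_stub` (a hypothetical stand-in for fine-tuning BERT) are all invented for this example.

```python
from typing import Callable, List, Sequence, Tuple

Classifier = Callable[[str], int]


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Binary F1: harmonic mean of precision and recall for class 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: F1 is defined as 0 here
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def k_fold_indices(n: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split indices 0..n-1 into k contiguous folds of near-equal size."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        folds.append((train, test))
        start += size
    return folds


def train_keyword_stub(texts: Sequence[str], labels: Sequence[int]) -> Classifier:
    """Hypothetical stand-in for fine-tuning a language model.

    Predicts 1 if a snippet contains a token seen only in positive examples.
    """
    pos, neg = set(), set()
    for text, label in zip(texts, labels):
        (pos if label == 1 else neg).update(text.split())
    cues = pos - neg
    return lambda text: int(any(tok in cues for tok in text.split()))


def cross_validate(texts: Sequence[str], labels: Sequence[int],
                   train_fn: Callable[..., Classifier], k: int = 10) -> float:
    """Train on k-1 folds, test on the held-out fold, and average the F1 scores."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(texts), k):
        model = train_fn([texts[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        preds = [model(texts[i]) for i in test_idx]
        scores.append(f1_score([labels[i] for i in test_idx], preds))
    return sum(scores) / len(scores)


# Made-up toy data: alternating positive and negative snippets.
toy_texts = ["send heartbeat ping", "parse config file"] * 10
toy_labels = [1, 0] * 10
average_f1 = cross_validate(toy_texts, toy_labels, train_keyword_stub, k=10)
```

In a realistic setup, `train_fn` would wrap the fine-tuning of a pre-trained model on the training folds, and the folds would typically be shuffled and stratified by class rather than split contiguously.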
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Keim, J., Kaplan, A., Koziolek, A., Mirakhorli, M. (2020). Does BERT Understand Code? – An Exploratory Study on the Detection of Architectural Tactics in Code. In: Jansen, A., Malavolta, I., Muccini, H., Ozkaya, I., Zimmermann, O. (eds) Software Architecture. ECSA 2020. Lecture Notes in Computer Science, vol 12292. Springer, Cham. https://doi.org/10.1007/978-3-030-58923-3_15
DOI: https://doi.org/10.1007/978-3-030-58923-3_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58922-6
Online ISBN: 978-3-030-58923-3
eBook Packages: Computer Science (R0)