Abstract
Design contexts are factors that shape a design. Developers recognise them, but they typically remain tacit. Unlike software requirements, design contexts have received little attention from software engineering researchers, and there is little or no systematic research on how they influence design. In this paper, we conduct an empirical investigation using Stack Overflow, with the aim of mining design context knowledge related to design patterns. We chose to study design patterns because they are clear and identifiable. We make three contributions. First, we develop a new taxonomy of design context terms related to design patterns. Second, we introduce a new automated approach, DPC Miner, for mining design context knowledge from Stack Overflow. Finally, we analyse the Stack Overflow posts and present how design context impacts decisions about design patterns in practice.
Additional information
Communicated by: Shaowei Wang, Tse-Hsun (Peter) Chen, Sebastian Baltes, Ivano Malavolta, Christoph Treude, and Alexander Serebrenik
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This article belongs to the Topical Collection: Collective Knowledge in Software Engineering
Cite this article
Wijerathna, L., Aleti, A., Bi, T. et al. Mining and relating design contexts and design patterns from Stack Overflow. Empir Software Eng 27, 8 (2022). https://doi.org/10.1007/s10664-021-10034-0
DOI: https://doi.org/10.1007/s10664-021-10034-0