Mining and relating design contexts and design patterns from Stack Overflow

Empirical Software Engineering

Abstract

Design contexts are factors that shape a design, and whilst they are recognised by developers, they are typically tacit. Unlike software requirements, design contexts have received little attention from software engineering researchers, and there is little or no systematic research on how they influence design. In this paper, we conduct an empirical investigation using Stack Overflow with the aim of mining design context knowledge that is related to design patterns. We chose to study design patterns because they are clear and identifiable. We develop a new taxonomy of design context terms related to design patterns and introduce a new automated mining approach, DPC Miner, for mining design context knowledge from Stack Overflow. Finally, we analyse the Stack Overflow posts and present how design context impacts decisions about design patterns in practice.

Notes

  1. https://archive.org/download/stackexchange

  2. https://www.nltk.org/api/nltk.tokenize.html

  3. http://www.nltk.org/book/ch05.html

  4. https://github.com/laksW/Mining-and-Relating-Design-Contexts-and-DesignPatterns-from-Stack-Overflow.git

Author information

Corresponding author

Correspondence to Laksri Wijerathna.

Additional information

Communicated by: Shaowei Wang, Tse-Hsun (Peter) Chen, Sebastian Baltes, Ivano Malavolta, Christoph Treude, and Alexander Serebrenik

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article belongs to the Topical Collection: Collective Knowledge in Software Engineering

About this article

Cite this article

Wijerathna, L., Aleti, A., Bi, T. et al. Mining and relating design contexts and design patterns from Stack Overflow. Empir Software Eng 27, 8 (2022). https://doi.org/10.1007/s10664-021-10034-0

  • DOI: https://doi.org/10.1007/s10664-021-10034-0
