1 Introduction

Public and private organizations use information technology (IT) to improve their efficiency and service offerings [1]. During the development of IT-projects and the subsequent operation/maintenance, temporary suboptimal solutions are sometimes introduced to profit from the solutions faster. To capture this reality, Cunningham [2] coined the term technical debt (TD) to explain the process and pitfalls of programming to the management in the banking sector in 1992. Rios et al. [3, p. 117] describe TD as a conceptualization of “problems faced during software evolution considering the tasks that are not carried out adequately during software development.” They conducted a tertiary literature review and found a variation in the application of the term. TD is often associated with any impediment related to the software product and the development process [3]. Griffith et al. describe technical debt management (TDM) as comprising “the actions of identification, assessment, and remediation of technical debt throughout a software system” [4, p. 1016].

TDM is important because it enables an organization to use its resources more optimally [5]. For instance, an organization’s IT-system breaks down, which leads to reduced production. For management, the possible solutions are to 1) accept the reduced production, 2) conduct a root cause analysis and fix the actual problem, or 3) create a work-around that momentarily increases production speed but does not solve the actual problem and will make the maintenance of the solution harder in the future. Solution 3 may be a viable course of action under certain circumstances, but it will result in the creation of technical debt. Moreover, since the consequences are no longer visible to management, this debt may be forgotten. If proper TDM methods were applied, management could identify the debt and repay it in a timely fashion.

We argue that TD is important for the field of Digital Government because TD can hinder the public sector from fully reaping the benefits of digitalization. Scholl encourages the field of Digital Government to engage with other disciplines “which overlap with Digital Government as a practice area, but which might lack the forward-looking capabilities that Digital Government Research at least can provide in part” [6, p. 11]. Moreover, Digital Government scholars can contribute to TD research because they have both domain-specific knowledge of the public sector’s use of IT and methodological experience in studying IT and operation in this context [7].

In 2017, the Danish Ministry of Finance published an analysis of Danish public IT-systems. The report concluded that the technical components (applications and IT-infrastructure) of 157 of 428 society- or business-critical IT-systems were not fully maintained [8]. Outdated software or hardware components can increase the risk of breakdowns and security breaches, and can complicate the maintenance and future development of the IT-systems [8]. The Swedish National Audit Office conducted a similar analysis and found that 70% of the IT-systems were outdated [9].

TD and the concept of legacy systems “discuss a state of software that is sub-optimal, time constrained, and explain how this state can decrease an organization’s development efficiency” [7, p. 80]. A considerable source of TD originates from software legacy [10], e.g. during continuous development of a system in an outdated environment. In this review, we focus solely on TDM.

TD studies are primarily published in Software Engineering, especially after Cunningham’s [2] introduction of the metaphor [3, 5, 11, 12]. However, Information Systems researchers have also published studies on TD [13, 14].

We found nine literature reviews and one tertiary review on TD and TDM (Appendix A). While these reviews offer important contributions, they only cover the literature up until 2017 and do not focus on examining methodology, use of theory or unit of analysis. In this study we aim to address these gaps.

We identify 49 TDM papers published in 2017–2020. We find the focus of the TD research fragmented: the papers report that TD decreases morale and that TD is difficult to measure, and they propose numerous tools. The MTD workshops and a tertiary study encourage more research on strategies and management [P13, 3]. The papers primarily present data from open source projects and the private sector. This leaves a gap for research in the public sector. Finally, we offer a research agenda for Digital Government scholars on TDM.

1.1 Aim of the Study and Research Questions

This study reviews the latest published TDM papers (2017–2020). To the best of our knowledge, these papers have not been studied by other secondary studies. The TD field is rapidly evolving with nine secondary studies being published in the past six years. Our study indicates that the publishing rate has not decreased. In a four-year period, TD papers were published in 30 different outlets. This study introduces TD research and a research agenda to the Digital Government scholars. Therefore, we follow established guidelines for conducting systematic literature reviews within the Digital Government field [15, 16]. Our research questions are:

  • RQ1. How is TDM studied and in which fields? RQ1a: Which authors and fields have contributed to technical debt management studies since 2017? RQ1b: What are the methods, context, level of analysis and data level used? RQ1c: Which theories and theoretical concepts are applied?

  • RQ2. What does the TDM literature focus on? RQ2a: Which topics are studied? RQ2b: What do the findings show?

  • RQ3. What research agenda should the Digital Government scholars investigate in the context of TDM? RQ3a: What suggestions does the literature have for future studies? RQ3b: What are the identified knowledge gaps in the literature?

The remainder of the paper is organized as follows: 2) search process, 3) brief mapping of previous TD literature reviews, 4) method for analysis, 5) results and suggestions for future study, 6) discussion, and 7) conclusion, limitations, and future studies. The previous literature reviews and the pool of papers are listed in Appendices A and B.

2 Search Process

The initial search for papers occurred from January to April 2019, with an updated search conducted in March 2020. We applied Webster and Watson’s [16] method for conducting systematic literature reviews (Fig. 1). Webster and Watson present a three-step process to search for papers [16].

Fig. 1. Illustrates the initial process.

Webster and Watson recommend that scholars begin a review by searching for papers in known key outlets. The second step is a database search, as it enables the researcher to discover other fields. The third step is a backward-forward search, where papers citing or cited by the pool of papers are identified. The following sections explain each search step in detail.

  1. Exploratory search: Using the software tool ‘Publish or Perish’ (Google Scholar)

  2. Database search: Web of Science, DGRL and Scopus

  3. Backward and forward search (Google Scholar citations).

2.1 Explorative Search (Step 1)

Webster and Watson [16] recommend starting a review by searching key outlets. They assume that the researcher is familiar with the literature and key outlets. However, we were not familiar with these at the time. Therefore, we chose to deviate from their method in the first step. In January 2019, we conducted an explorative search through Google Scholar to become familiar with the topic and academic literature [17]. We used the software “Publish or Perish” with Google Scholar as the underlying search engine [18]. This search created a foundation for the second step: the structured literature database search. The search informed the selected search term, the selection criteria, and the temporal limitation. In this search process, we discovered nine secondary literature reviews and a tertiary review (Sect. 3 & Appendix A). The secondary and tertiary reviews informed our inclusion and exclusion criteria (Table 1).

Table 1. Selection criteria (+included/− excluded)

2.2 Structured Database Search (Step 2)

In February 2019, we searched three databases: Digital Government Reference Library (DGRL), Scopus, and Web of Science. DGRL version 15.5 contains references to approximately 12,500 peer-reviewed papers within the digital government field.

Webster and Watson [16] suggest using keywords when searching in databases. However, not all papers are presented with keywords in these databases. Therefore, we searched the databases for papers including the expression ‘Technical Debt Management’ anywhere in the text. We identified 213 primary research papers on TD.

Due to the large number of identified papers and the existence of secondary and tertiary studies, we decided to introduce two additional selection criteria in our study (Table 1). This was done to position our review in relation to the previous secondary and tertiary studies. The secondary studies cover all papers on TD published until early 2017 (see Fig. 2). Thus, we focused the review on 1) papers studying TDM specifically, which brought the number of papers to 130, and 2) papers published from 2017 onwards, which reduced the number to 28 papers.
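The two additional criteria can be read as successive filters over the search results. The following is a minimal Python sketch of that idea, not the tooling used in the review; the `Paper` record, its fields, and the sample entries are hypothetical and only serve as illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Paper:
    """Hypothetical record for one search hit (all fields are assumptions)."""
    title: str
    year: int
    studies_tdm: bool  # manual judgment: does the paper study TDM specifically?


def apply_selection_criteria(papers, first_year=2017):
    """Apply both criteria in order and report the pool size after each step."""
    tdm_only = [p for p in papers if p.studies_tdm]
    recent = [p for p in tdm_only if p.year >= first_year]
    counts = {
        "identified": len(papers),
        "tdm_specific": len(tdm_only),
        "from_first_year": len(recent),
    }
    return counts, recent


# Illustrative data only; these are not papers from the review.
sample = [
    Paper("A", 2015, True),
    Paper("B", 2018, True),
    Paper("C", 2019, False),
    Paper("D", 2017, True),
]
counts, selected = apply_selection_criteria(sample)
print(counts)
print([p.title for p in selected])
```

Recording the pool size after each step makes the reduction traceable in the same way the text reports it.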

Fig. 2. Illustrates previous meta-studies on TD

2.3 Forward/Backward Search (Step 3)

In April 2019, we conducted backward and forward searches [16]. First, we reviewed the references in the previously identified papers (backward search) adding two papers. Second, we used Google Scholar to conduct a forward search for papers citing the identified papers, adding 13 papers. This brought the pool of papers to 31.
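The backward and forward steps can be expressed as two set operations over citation data. The sketch below is a simplified Python illustration under the assumption that citations are available as a mapping from each paper to the papers it cites; the identifiers and the `references` mapping are hypothetical, and in the review itself the candidates were found manually and via Google Scholar.

```python
def backward_search(pool, references):
    """Return papers cited by the pool that are not yet in the pool."""
    found = set()
    for paper in pool:
        found.update(references.get(paper, ()))
    return found - set(pool)


def forward_search(pool, references):
    """Return papers outside the pool that cite at least one pool paper."""
    pool = set(pool)
    return {
        citing
        for citing, cited in references.items()
        if citing not in pool and pool.intersection(cited)
    }


# Hypothetical citation data: each key cites every paper in its list.
references = {
    "P1": ["X1", "X2"],
    "P2": ["X2"],
    "Y1": ["P1"],
    "Y2": ["Z9"],
}
pool = ["P1", "P2"]
candidates = backward_search(pool, references) | forward_search(pool, references)
print(sorted(candidates))
```

The resulting candidates would still have to be screened against the selection criteria before being added to the pool.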

In March 2020, we updated the search by repeating steps 2 and 3. We identified 18 new papers, bringing the total pool of papers in this review to 49 (Appendix B).

3 Previous Secondary Studies on TD

In the explorative search, we found nine secondary studies and one tertiary study. The nine secondary studies cover literature published from 1992 to 2017 (Fig. 2). BenIdris et al. [20] and Becker et al. [19] cover the year 2016 completely with a broad view of TD. We position our literature review in relation to these existing secondary studies in the discussion section. Rios et al. published a tertiary study in 2018, which explores the state of TD research [3]. They evaluate 13 secondary studies from 2012 to March 2018. They identify three research directions and concepts studied in the secondary studies: TD identification, TD concepts, and TDM. They develop a taxonomy of 15 TD types and generate a TD landscape mapping out the TDM activities, strategies and tools. The period from 2017 does not appear to be covered, and none of the literature reviews examine the context of the data.

4 Method for Analysis

The total pool of papers in our review includes 49 papers. Next, we explain the overall coding process; the following section describes it in more detail.

The coding process was conducted iteratively. We developed a coding sheet (a template) containing: origin of data, theory and method, and unit of analysis [21]. We applied the template to 10 papers and discussed and adjusted the categories. The first author conducted the coding, while coding issues were discussed among all authors.

4.1 Detailed Description of Coding Elements

This section describes additional categories besides author and journal. We explain the reason for our coding, and the coding process in detail.

The Origin of the Data.

We coded for country and sector (private or public) to uncover where the data originates from. We expanded the categorization to include open source projects. Open source projects allow anyone, anywhere, to contribute to the code without stating who they are or where they are from. In the papers, the authors describe where they extracted the data, e.g. by stating the type of organization, from which we could interpret the sector.

Theory and Technique.

We applied open coding for the technique; thus, we included a description of how the data was extracted. We coded for theory, focusing on explanations of the relation between concepts, observed phenomena and why these relationships exist [22, 23, 24].

Findings and Future Studies.

We coded the papers’ main findings, typically from the results, discussion, and conclusion sections to identify the latest findings within TDM studies. Additionally, we coded for the authors’ suggestions for future studies.

Unit of Analysis and Concepts.

Webster and Watson [16] suggest creating a concept matrix and adding another dimension: the level of analysis. This dimension analyzes the abstraction level of the paper. This allows for more accurate identification of the existing literature. They suggest three levels of analysis: individual, group and organization. We discovered several papers that analyzed at the IT-system level and added this to the existing three levels of analysis. We coded for the overall concept within the paper, besides TDM. We began with a careful read-through of the first 10 papers. Here, a pattern emerged, and we identified four concepts. They were confirmed after going through the remainder of the pool of papers.
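A concept matrix of this kind can be thought of as a table with concepts as rows and levels of analysis as columns, where each cell lists the papers coded there. The Python sketch below illustrates this under stated assumptions: the four levels are those named above, while the paper identifiers and concept names (`X1`, `Concept A`, and so on) are placeholders and not results from the review.

```python
from collections import defaultdict

# The three levels suggested by Webster and Watson plus the added IT-system level.
LEVELS = ["Individual", "Group", "Organization", "IT-system"]


def build_concept_matrix(coded_papers):
    """Map each concept to the paper IDs coded at each level of analysis."""
    matrix = defaultdict(lambda: {level: [] for level in LEVELS})
    for paper_id, concept, level in coded_papers:
        if level not in LEVELS:
            raise ValueError(f"unknown level of analysis: {level}")
        matrix[concept][level].append(paper_id)
    return dict(matrix)


def uncovered_cells(matrix):
    """List (concept, level) cells that no paper has been coded into."""
    return [
        (concept, level)
        for concept, row in matrix.items()
        for level in LEVELS
        if not row[level]
    ]


# Placeholder coding results: (paper ID, concept, level of analysis).
coded = [
    ("X1", "Concept A", "IT-system"),
    ("X2", "Concept A", "Individual"),
    ("X3", "Concept B", "IT-system"),
]
matrix = build_concept_matrix(coded)
for concept, row in matrix.items():
    print(concept, row)
print(uncovered_cells(matrix))
```

Empty cells are the interesting output here, since they point to combinations of concept and level that have not received attention.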

5 Results

Next, we present the findings of our study according to our research questions.

5.1 RQ1 How Is TDM Studied and in Which Fields?

RQ1a: Which Authors and Disciplines Have Contributed to TDM Research Since 2017?

Papers concerning TDM have been published in 30 different journals and conferences since 2017. The eight outlets containing more than one paper are listed in Table 2; the remaining 23 papers are published in 26 different outlets.

Table 2. The most frequent publishing outlets for TDM research

The International Conference on TD contains the most papers regarding TDM, followed by the Journal of Systems and Software. The mapping shows that TDM appears in a variety of outlets within the field of Software Engineering. 65% of the papers do not specify the countries where the study was conducted. The papers that do specify this primarily come from the Nordic countries, particularly Sweden [P2, P4, P11, P18, P21, P45]. The papers not specifying the country mainly use open source projects.

In total, 124 different authors have contributed to the TDM literature; 15 of them contributed to two papers. Figure 3 presents the top contributors. Note that many papers have several authors.

Fig. 3. The most active authors

RQ1b: What are the Methods, Context, Level of Analysis and Data Level Used?

Only half of the papers specify the organizational context from which the data originates. One paper uses data from the public sector, 12 papers present open source data, and 12 papers present data from the private sector. The papers use different techniques during their research. The most frequently used techniques are surveys, interviews and literature studies (Fig. 4). Note that some studies use more than one technique.

Fig. 4. The most frequently used techniques

RQ1c: Which Theories and Theoretical Concepts are Applied?

The authors introduced different concepts and frameworks to explain TD and the underlying interrelations. This was done from different perspectives: assessment, working environment or awareness. The papers seldom use theory, and the word ‘theory’ rarely occurs in the text of the papers. Two papers are presented as a foundation for future theory [P39, P45]. The papers offer indicators and methods on how to reduce TD.

5.2 RQ2: What Does the TDM Literature Focus on?

RQ2a: Which Topics are Studied?

Inspired by Webster and Watson [16], we have identified four main concepts and mapped the papers’ level of analysis into a concept matrix (Table 3). This gives an overview of how the concepts have been studied and which levels or concepts have not received attention in recent studies.

Table 3. Concept matrix illustrating the studied topics

The concept TD Assessment entails research which assesses the effect of TD or, as it is called, the interest rate. This concept is the most analyzed, and it has been analyzed at four different levels. Ten of the papers use the concept Self-admitted technical debt, which covers TD that is consciously admitted and is visible in code comments. This has been analyzed at the IT-system and project level. The third concept, TD awareness, covers the research creating awareness of TD. This can be achieved by visualizing the assessed TD; it is analyzed at the IT-system and the project level. The last concept, Working environment, focuses on morale and organizational culture. This is analyzed at both an individual and an organizational level.

A third of the papers analyze TD at the IT-system level, the most frequently analyzed level. The primary focus is TD assessment, which is explored in 21 papers. The organizational level is the least used level of analysis; one explanation is that organizational data is more challenging to access than survey data and data on open source projects.

RQ2b: What do the Findings Show?

Almost a third of the papers propose a tool, method, model or technique to aid TDM [P8, P16–20, P29–31, P34, P35, P41, P42, P47, P48]. Their findings focus on the results of evaluating their presented tools. The tools generally aid in TDM, e.g. through TD identification or visualization. Increased TD visibility can benefit communication between stakeholders [P37].

The papers vary in approach and focus; thus, the findings of the papers are fragmented. We list seven general findings from the pool of papers: 1) TD harms software development work [P2–4, P11]. 2) All roles related to the system are affected [P4], and community-related factors contribute to TD’s intensity [P46]. However, 3) developer morale can be increased by proper management [P11]. 4) Organizational factors can influence TD [P1], the number of collaborators and the size of the project correlate significantly with the amount of TD [P14], and the breadth of the developers’ experience lowers the amount of TD [P1, P33]. 5) TD is time consuming: practitioners estimate that 36% of development time is wasted due to TD [P4], and TD increases the need to perform additional time-consuming activities [P3]. However, TD is not visible in the backlog [P45], and a lack of development processes increases TD [P1]. 6) Architectural debt should be managed early in the process because architectural debt that is introduced early persists during the whole software lifecycle [P2]. 7) The estimation difficulties are proposed to be solved by a workflow, which “provide more actual information including TD concepts to the stakeholders” [P7, pp. 600–601].

5.3 RQ3: What Research Agenda Should the Digital Government Scholars Investigate in the Context of TDM?

RQ3a: What Suggestions does the Literature have for Future Studies?

The papers proposing a model, tool, method, approach or technique for TD aim to continue expanding and validating their model against new datasets [P1, P5, P8, P12, P14–15, P18–19, P23, P27, P29–31, P34–35, P41–42, P47–48]. Suggestions for future research for other researchers are scarce. However, both MTD workshops encourage a slight change in direction and provide several suggestions for future studies [P10, P13].

Quantifying the Value of TD.

The report of the 2016 MTD workshop [P13] suggests a research agenda focused on defining, understanding and operationalizing the value of TD. It urges researchers to understand the value that falls outside the core definition of TD, which is essential to how TD plays out in practice. Digital Government scholars can contribute to this with their experience. The report of the 2017 MTD workshop [P10] recommends elevating the quantification of TD from low-level code to architectural opportunities. It identifies a need to educate stakeholders to raise their level of awareness.

A Better Understanding of the Metrics.

Three papers contain the following suggestions for future work: 1) to research more important metrics in the future [P16], 2) to understand the factors leading to TD [P14], 3) to study more change features that can introduce TD [P31]. Further research should investigate whether other types of TD (besides Architectural Debt) have a significant correlation to the estimated wasted time. This would create a better understanding of the negative impact that different types of TD have on wasted software development time [P2–3]. Lastly, the organization’s maneuverability can be increased by determining the types of debt incurred [P18].

Strategies.

Specific suggestions are to investigate concrete architecture problems: how they contribute to file bug-proneness, and possible ways of refactoring [P6]. Vadja et al. encourage future studies to develop methods that assist stakeholders in estimating their TD [P22] or increase the breadth of the experience [P33]. Additionally, Dong et al. suggest exploring TD recovery strategies due to a lack of discussed actual cases, particularly in the cross-disciplinary environment [P9]. Further research should focus on providing an in-depth understanding of the relationship between TD and developers’ morale [P11]. Two papers aim to build a TD theory [P39, P45].

RQ3b: What are the Identified Knowledge Gaps in the Literature?

We identify five gaps, which are presented in Table 4. 1) The papers suffer from a lack of theory and therefore cannot explain the relationship between events and concepts [24]. The debt metaphor contains some explanatory power; however, a metaphor cannot substitute for a theory [25]. Two papers present research as a possible foundation for theory [P39, P45]. 2) The level of analysis is primarily focused on a project or system level, which leaves a gap in researching the organizational and individual levels. 3) Half of the papers have not specified the organizational context of the data, and the other half investigate open source projects or private companies. Only one paper uses data from the public sector [P41], which leaves a gap in research in this sector. 4) TD is explored through 16 different approaches and techniques, both quantitative and qualitative. Eight papers use more than one technique and combine both quantitative and qualitative methods. However, observation is not used actively as a data gathering technique. This may be a problem because what people say during interviews may differ from what they do [26]. 5) All the papers are published within the field of software engineering, leaving a gap for venue diversity. Table 4 summarizes the identified gaps in the papers and suggests why and how these gaps could be addressed.

Table 4. Identified gaps, their importance, and suggestions to how they can be addressed

When we compare the research suggested by the TD literature (RQ3a) with the gaps we identify, we find that they can easily be combined. Our gaps are formulated at a high level of abstraction, whereas the suggested future research provides very specific suggestions.

6 Discussion

First, we compare the findings of this review to comparable secondary studies. Second, we identify and discuss five findings further: 1) lack of research in the public sector, 2) empirical confirmation of negative effects of TD, 3) a tendency of reinventing the wheel, 4) gaps in diversity of approaches, and 5) the absence of theory.

Comparison to Related Work.

Becker et al. [19] focus on decision-making and criticize the method and objective of the research. They find that the actual decision making was not studied. Ampatzoglou et al. [11] explore how financial aspects are defined and applied when studying TD. They encourage a balance between economic theories and software engineering. Li et al. [12] find that code-related TD and its management gain the most attention and encourage future studies to explore the whole TDM process. This corresponds to our findings in the concept matrix, as research on the organizational level or the working environment is limited. The tertiary review by Rios et al. [3] underlines several gaps, which are in line with the MTD reports [P13, 3]: more research on strategies and management.

This literature review includes management of all types of TD, in line with several of the secondary studies [5, 11, 12, 19, 28]. However, none of the secondary studies analyze theory or the type of organization from which the data was extracted. Furthermore, this review offers insights into the recent literature and presents the concept and a research agenda to Digital Government scholars.

Lack of Research in the Public Sector.

Half of the papers do not state the organizational context where data was collected. Only one paper appears to have conducted its data collection within the public sector [P41]. This is important because the public sector is different from open source projects and the private sector. While some challenges may be similar, public organizations are subject to specific requirements [29]. We suggest future Digital Government studies focus on studying TDM in terms of strategies, management and indicators.

A Tendency to Reinvent the Wheel.

About a third of the papers propose a tool to support technical debt management [P8, P16–20, P29–31, P34, P35, P41, P42, P47, P48]. They evaluate their presented tool and plan to apply it within a different context. They rarely apply other researchers’ tools or methods; instead, they develop their own. This is problematic as it does not advance the field through joint effort: instead of providing and testing a few select tools for practitioners, the number of tools becomes overwhelming.

Empirical Confirmations of the Negative Effects of TD.

Technical debt harms software development work, both in terms of morale and wasted time [P2, P3, P4, P11]. Community factors can also intensify TD [P46], and the size of the project correlates with the amount of TD [P14]. TD can be minimized by following some simple guidelines. Moreover, if TD is addressed early, it is less time consuming and morale can be increased [P2, P3, P4, P11]. Thus, we need future studies to explore indicators of TD.

Gaps in the Diversity of Approaches.

TDM is primarily studied from a quantitative approach; 30% of the papers used a qualitative approach. Six of the 31 papers use more than one method. This leaves a gap which can be addressed with the mixed method approach. Mixed methods may provide more comprehensive evidence [22]. None of the papers applies observation, which leaves people’s actual behavior concerning TDM unstudied.

Absence of Theory.

Concerning results, we see an absence of the use of theory; a few papers import theory and approaches from other fields, e.g. portfolio finance and games. We need theories to explain the relationship between the concepts and why these exist [22, 23, 24]. The absence of native theory is natural for a young field, where researchers are prone to import theory from a different field instead [30, 31].

7 Conclusion, Limitations and Future Studies

We have introduced TDM to the field of Digital Government and proposed a research agenda for Digital Government scholars. We conducted a systematic literature review to explore: 1) how and where technical debt management is studied, 2) what we know about technical debt and its management and lastly 3) a research agenda for scholars within Digital Government. Our findings are based on papers published from 2017 to 2020. We have discovered that 1) researchers focus on a specific type of technical debt and how it can be reduced, 2) TDM is still strongly rooted within the field of Software Engineering, 3) TDM is primarily examined in open source projects or the private sector, 4) TDM is studied using primarily quantitative methods, and 5) there is a lack of theory to guide the studies and explain findings.

Limitations.

We decided to use only the term “Technical debt management” in our literature search, which can decrease the external validity; however, the backward/forward searches strengthen the external validity. The papers were coded by the first author only, which decreases the internal validity. To strengthen the internal validity, coding issues were discussed among all authors.

Research Agenda.

We suggest that Digital Government scholars research technical debt management, TD strategies and TD indicators to advance the field further.