
1 Introduction

Microservices architectures are the new weapon of choice for developing cloud-native applications as suites of small, autonomous, and conversational services, which are then easy to understand, deploy, and scale [1]. Migrating to microservices enables optimizing the autonomy, replaceability, and decentralized governance of software architectures [2]. Despite the hype around microservices, both industry and academia still lack consensus on the adequate conditions to embrace and benefit from this new paradigm [3]. Microservices architectures are highly complex, comprising multiple, often conflicting factors. From the industrial perspective, Netflix (Footnote 1) and SoundCloud (Footnote 2) are early adopters of microservices, having transitioned from a traditional development model, with hundreds of engineers maintaining a monolithic application, to many small teams responsible for the end-to-end development of hundreds of microservices that serve millions of users on a daily basis. Their organizational culture shifted from traditional siloed teams to product-oriented teams following a DevOps methodology. Academia is still at an early stage of documenting and analyzing the migration to microservices that is taking place in industry [4], mainly by distilling the key drivers for migrating to microservices (e.g., reusability, decentralized data governance, and scalability) and the migration patterns that help structure and generalize the process [5].

Therefore, any approach that aims to assist the decision-making process must be flexible and dynamically adaptable. In this context, the paper develops a Multi-Layer Fuzzy Cognitive Map (ML-FCM) [6] as the first decision support system (DSS) that captures the key factors in the migration to a microservices architecture, and offers the means to study their influence and interrelation [7]. Fuzzy Cognitive Maps (FCMs) are soft computing tools from the field of computational intelligence that combine elements of fuzzy logic and neural networks [8]. FCMs capture the behavior of a given problem in nodes that represent knowledge in the application domain [9]. ML-FCMs extend FCMs with the concept of sub-FCMs, that is, smaller structures (maps) of related nodes organized in layers. This grouping offers a way to analyze parameters at finer levels of granularity [6, 10]. It enables tracing the causes of the decision outcome, and offers the ability to study the dependencies between the leading determinants of the decision.

The construction and analysis of the model started with a literature review to identify an initial set of factors that potentially influence the decision of migrating to microservices. The next step engaged a group of experts from industry and academia with a background related to the subject, who evaluated and refined the identified factors through questionnaires and interviews. Finally, we performed both static and dynamic analysis [9] over the resulting ML-FCM through graph analysis and simulation, respectively, which helped us identify the influence that different nodes (concepts) exert on the decision of migrating to microservices.

To the best of our knowledge, this is the first decision support system (DSS) for migrating to microservices. The suitability of a DSS in this context is suggested in [11], based on reference models for enterprise architectures. ML-FCMs, in turn, have been used as DSSs in the context of novel architectures such as cloud adoption [9, 12], and extensively applied in sensitive, real-world domains [7].

The rest of the paper is organized as follows. Section 2 details the concepts identification and subsequent construction of the ML-FCM for supporting the decision of migrating to a microservices architecture. Section 3 discusses the static and dynamic analysis of our model. Finally, Sect. 4 concludes the paper.

2 Decision Support Models

An FCM is a directed graph with nodes representing concepts in a domain and weighted edges describing the causal relationships that exist among these concepts. A numeric activation level per concept denotes the strength of its presence in the problem domain. The map is initialized with a set of activation levels (which represent a particular scenario in the problem domain), and then executed in a series of discrete steps in which the activation levels of the nodes are iteratively updated based on the causal relationships between them, until the map: (1) reaches an equilibrium state, (2) exhibits cyclic behavior, or (3) exhibits chaotic behavior. The first two cases allow one to develop simulation scenarios and perform inferences. The main outcome of the execution is the final activation value of the concept of interest (central node) for that particular scenario. Details about the updating functions for activation levels are given in [9].
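The execution loop described above can be illustrated with a minimal Python sketch. It is not the implementation used in this work: the sigmoid update rule is one commonly used in the FCM literature and is assumed here only for illustration (the actual updating functions are those given in [9]), and the function names, the tolerance, and the three-node weight matrix are hypothetical.

```python
import math

def sigmoid(x, lam=1.0):
    """Squashing function that keeps activation levels in (0, 1). Assumed, see lead-in."""
    return 1.0 / (1.0 + math.exp(-lam * x))

def step(levels, weights):
    """One discrete update; weights[j][i] is the causal weight of edge j -> i."""
    n = len(levels)
    updated = []
    for i in range(n):
        # Previous level of node i plus the weighted influence of all other nodes.
        total = levels[i] + sum(weights[j][i] * levels[j] for j in range(n) if j != i)
        updated.append(sigmoid(total))
    return updated

def run_fcm(levels, weights, max_steps=100, tol=1e-5):
    """Iterate until no level changes by more than tol (equilibrium) or max_steps is hit.

    Returns (history, converged). A run that does not converge may be
    cyclic or chaotic; this sketch does not distinguish the two.
    """
    history = [list(levels)]
    for _ in range(max_steps):
        nxt = step(history[-1], weights)
        history.append(nxt)
        if max(abs(a - b) for a, b in zip(nxt, history[-2])) < tol:
            return history, True
    return history, False

# Hypothetical 3-node map; node 2 plays the role of the central concept.
W = [[0.0, 0.5, 0.7],
     [-0.3, 0.0, 0.6],
     [0.0, 0.2, 0.0]]
history, converged = run_fcm([0.9, 0.1, 0.5], W)
print(converged, round(history[-1][2], 3))
```

The value printed for node 2 is the final activation level of the central concept for this made-up scenario; a different initial vector corresponds to a different scenario.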

The first step in developing an FCM is a Literature Review (LR), for which we followed the key guidelines proposed in [13]. Although a systematic LR is outside the scope of this work, these guidelines helped us organize the process of finding and classifying relevant works. We searched for microservices-related articles indexed in different online databases, considered both journal and conference articles, and removed duplicate papers, given the overlap among the search engines and databases.

From this analysis we narrowed the initial collection down to 46 relevant works (Footnote 3) on which to perform concept extraction, that is, identifying and then disambiguating the concepts that are potentially relevant for the decision of migrating to microservices and that recur across the literature. This process was supported by our previous experience with analysis frameworks in the context of SOA [14, 15].

The initial list of concepts extracted from the literature was delivered to a group of seven experts (researchers and industry practitioners) with a background related to the subject, who evaluated the list and suggested adding, removing, grouping, or decomposing concepts based on their experience. The final list of concepts (summarized in Table 1) was then hierarchically organized, in our case into two layers that focus on specific aspects of the problem, with a total of six FCMs (the main FCM plus five sub-FCMs).

Table 1. Sub-FCM concept grouping

Based on the identified concepts, the experts completed a questionnaire concerning the causal relationships between nodes (concepts) and their weights, i.e., the degree to which concepts influence each other, fuzzified using seven linguistic values (from negatively high to positively high) according to a triangular membership function. The activation levels (used to represent different scenarios in the problem domain) for the different concepts are expressed using five linguistic values, from very low to very high.

Once all experts had defined the causal relationships between concepts as described above, the linguistic values were aggregated and defuzzified, producing a weight matrix (representing the weight of each edge) with numerical values in the interval [−1, 1]. The final structure of the model, which consists of the main FCM on the top layer and five sub-FCMs on the lower layer, is depicted in Fig. 1.
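To make the aggregation and defuzzification step more concrete, the following Python sketch turns the linguistic answers of several experts for one edge into a single numeric weight. It is only an illustration under stated assumptions: the text fixes neither the intermediate term names nor the triangle breakpoints nor the aggregation operator, so the evenly spaced triangles, the sum aggregation, and the centroid defuzzification used here are hypothetical choices.

```python
# Seven linguistic terms on [-1, 1]. Only the two extreme names come from the
# text; the intermediate names and all breakpoints are assumptions.
NAMES = ["negatively high", "negatively medium", "negatively low", "neutral",
         "positively low", "positively medium", "positively high"]
PEAKS = [-1 + i / 3 for i in range(7)]
# Each term is a triangle (left foot, peak, right foot), clipped to [-1, 1].
TERMS = {name: (max(-1.0, p - 1 / 3), p, min(1.0, p + 1 / 3))
         for name, p in zip(NAMES, PEAKS)}

def triangle(x, a, b, c):
    """Triangular membership with feet a, c and peak b (a <= b <= c)."""
    if x < a or x > c:
        return 0.0
    if x == b:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def defuzzify(answers, samples=2001):
    """Sum the experts' membership functions and return the centroid."""
    xs = [-1 + 2 * k / (samples - 1) for k in range(samples)]
    mus = [sum(triangle(x, *TERMS[a]) for a in answers) for x in xs]
    return sum(x * m for x, m in zip(xs, mus)) / sum(mus)

# Three hypothetical expert answers for a single edge.
weight = defuzzify(["positively high", "positively medium", "positively medium"])
print(round(weight, 2))
```

Repeating this for every pair of concepts would fill a weight matrix with values in [−1, 1], as described above.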

Fig. 1. ML-FCM for the microservices migration problem.

3 Static and Dynamic Model Analyses

The static and dynamic analyses of the map help one understand and assess the shape and behavior of the model [9]. They ultimately highlight “hidden” properties and features, as well as points that require particular attention. Static analysis examines the properties of a model prior to its execution, irrespective of its behavior over time, by applying notions of Graph Theory. The major categories covered by static analysis are the complexity of the graph (in terms of density, depth, and breadth); the strength of each node (weight and number of its incoming and outgoing edges); and the tendency of cycles in the graph (positive cycles amplify any initial activation value and vice versa). By the end of the static analysis, modelers should be able to identify the stronger concepts (i.e., those that strongly influence the central concept), obtain an indication of how each sub-FCM influences the FCM in the upper layer, and use this information to set up simulations (i.e., dynamic analysis). Interested readers can refer to [9] for further details on the analysis framework.
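The three categories of static indicators can be sketched in a few lines of Python. The definitions below are standard graph-theoretic ones assumed for illustration (density as the fraction of possible directed edges that are present, strength as the sum of absolute incoming and outgoing weights, cycle sign as the sign of the product of the weights along the cycle); the exact indicators of the analysis framework are those in [9], and the weight matrix is hypothetical.

```python
def density(weights):
    """Fraction of the n*(n-1) possible directed edges that are present."""
    n = len(weights)
    edges = sum(1 for i in range(n) for j in range(n)
                if i != j and weights[i][j] != 0)
    return edges / (n * (n - 1))

def node_strength(weights, k):
    """Sum of absolute weights over the incoming and outgoing edges of node k."""
    n = len(weights)
    outgoing = sum(abs(weights[k][j]) for j in range(n) if j != k)
    incoming = sum(abs(weights[i][k]) for i in range(n) if i != k)
    return incoming + outgoing

def cycle_sign(weights, cycle):
    """Sign of a feedback cycle given as a list of node indices, e.g. [0, 1, 2]."""
    product = 1.0
    for a, b in zip(cycle, cycle[1:] + cycle[:1]):
        if weights[a][b] == 0:
            raise ValueError(f"no edge {a} -> {b}, so this is not a cycle")
        product *= weights[a][b]
    return "positive" if product > 0 else "negative"

# Hypothetical complete 3-node map: weights[i][j] is the weight of edge i -> j.
W = [[0.0, 0.6, -0.3],
     [0.4, 0.0, 0.7],
     [-0.5, 0.2, 0.0]]
print(density(W))                # 6 of 6 possible edges are present
print(cycle_sign(W, [0, 1]))     # 0.6 * 0.4 > 0
print(cycle_sign(W, [0, 1, 2]))  # 0.6 * 0.7 * (-0.5) < 0
```

Under this definition, a complete directed graph with 3 nodes has 6 edges and one with 6 nodes has 30, both with density 1.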

Dynamic analysis allows one to assess the behavior of the model in execution through simulations under manually configured scenarios (described through activation values). The main goal is to study the activation levels of the concepts (nodes) and how these levels change over time. The first step is to execute two “extreme” positive and negative scenarios, which should drive the model to the extreme positive/negative outcome (the central concept should get a level close to 1 or 0 respectively). If the model performs as expected, then additional simulations can be run with different configurations and initial activation levels (representing what-if scenarios), followed by a study of the correlation of these levels and the final outcome. This should support the significance ranking of the different concepts obtained from the static analysis.

Results of Static Analysis. Table 2 shows that FCM1 (Fig. 1) has high density (computed from its numbers of edges and nodes), above the threshold for medium magnitude (\(density \ge 0.6\)). All the second-layer sub-FCMs are complete graphs, with density values equal to 1 and high complexity, although this is mitigated by their fairly small size, between 3 and 6 nodes (6 and 30 edges, respectively). Thus, the model can be characterized as a complex two-layer structure [9]. The number of positive feedback cycles is higher than the number of negative ones for all nodes. This indicates how the model tends to behave: given a slightly positive modification in any activation level, the corresponding level of the central node of interest (Microservices Migration) is promoted, and vice versa.

Results also suggest that the top three concepts of the main FCM (FCM1) are Infrastructure and Management Services (whose activation level is calculated using the concepts in its sub-FCM), Maintainability & Evolvability, and Reliability. This finding calls for further investigation of the behavior of these concepts both individually and as a group, and of whether Maintainability & Evolvability and Reliability could also be decomposed so as to understand which factors influence them at a finer granularity. Interestingly, Business Complexity and Cost are the weakest concepts in the ML-FCM. This means that, in the following dynamic analysis, we can simulate whether the model behaves the same when these concepts are removed.

Table 2. Strength and tendency indicators for every sub-FCM.

Results of Dynamic Analysis. Figure 2 depicts the results for the extreme scenarios (positive and negative) after 100 iterations of the model execution. As we can see, the FCM reached an equilibrium state, clearly leading the concept of interest (black line) to a positive/negative value for the positive/negative scenario, with values 0.89 (“very high”) and 0.11 (“very low”) respectively. These results show not only that the model behaves as expected for the extreme scenarios, matching the desired outcome, but also that it converges to stable values after a certain number of iterations, without behaving randomly or exhibiting cycles that would hinder the applicability of the model.

Then, we posed what-if scenarios based on the findings of the preceding static analysis. Scenario 1 investigates the possibility of simplifying the model without affecting the outcome, by removing the weakest concepts (Cost and Business Complexity, according to the static analysis), with the goal of making the model easier to execute and understand, as fewer concepts have to be defined and analyzed. Thus, we set up and ran 20 simulations before and after removing these two concepts (and subsequently sub-FCM4), with a set of randomized initial activation levels. The final activation values (in linguistic and numerical form) of the central concept for the 20 executions are summarized in Table 3. Note that the final outcome of the model across the simulations is almost identical between the simplified and the full model for the first scenario. Additionally, a Root Mean Square Error (RMSE) of 0.016 indicates that the values show low deviation. In conclusion, the first what-if scenario suggests that a simplified model can deliver the same decision outcome without the weakest nodes. This result calls for further discussion about the significance of these two concepts.
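For reference, the RMSE between the outcomes of two model variants over the same set of simulations can be computed as in the following Python sketch. The function name and the four sample values are hypothetical and serve only to show the arithmetic; they are not the values of Table 3.

```python
import math

def rmse(full, simplified):
    """Root mean square error between paired final activation values."""
    if len(full) != len(simplified) or not full:
        raise ValueError("both lists must be non-empty and of equal length")
    squared = [(f - s) ** 2 for f, s in zip(full, simplified)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical final activation values of the central concept for four runs
# of the full model and of the simplified model with identical initial levels.
full_model = [0.71, 0.64, 0.58, 0.80]
simplified_model = [0.70, 0.66, 0.57, 0.81]
print(round(rmse(full_model, simplified_model), 3))
```

An RMSE close to 0 means the two variants produce nearly the same outcome for every run, which is the criterion applied in Scenario 1.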

Fig. 2. Activation values (y-axis) for the concepts in FCM1 throughout 100 iterations (x-axis).

Analogously, what-if Scenario 2 defines an overly simplified model that only comprises the three strongest nodes, with the same set of initial activation levels as the previous one. This scenario investigates whether the strongest concepts by themselves can deliver identical results to the original model. As we can see in Table 3, the simplified model is not able to substitute the original model, as it leads to different outcomes and thus fails to capture the dynamics of the domain under study. Based on this result, one can iteratively add strong nodes to come up with an “optimal” FCM configuration, which only comprises the concepts that truly affect the final outcome.

Table 3. Outcome of what-if scenarios (20 executions, random activation levels).

Finally, a threat to validity regarding model construction concerns the domain experts, who may introduce a degree of subjectivity and bias. This is a common weakness of expert-based models, mitigated by the assessment with different scenarios, which shows whether the model behaves correctly or not and allows us to calibrate it accordingly. Furthermore, the group of experts did not include any project manager or executive, who may lean towards concepts such as Cost and Business Complexity at the expense of more “technical” concepts. Besides, more experiments are needed, especially ones that apply the model to real-world scenarios.

4 Conclusions and Future Work

This paper identifies the key concepts and drivers related to the decision of migrating to microservices, by means of a literature review and experts’ feedback gathered through questionnaires and interviews. We then developed a DSS by organizing these concepts as an ML-FCM, a graph-shaped computational intelligence model that supports decision-makers through automated reasoning.

Our future work comprises fine-tuning the model by considering other concepts, performing simulations with new scenarios, and engaging experts with different backgrounds. After that, our goal is to apply the model to real-world cases, and to support the decision-making process of an ongoing project.