1 Introduction

A data analytic methodology is useful in improving the productivity of the data analytic process and the quality of the output generated. A process model includes a set of processing steps that should be followed by practitioners and researchers involved in analytics projects. A methodology can be described as an instance of a process model with sets of inputs, outputs, tasks and specifications on ‘how to perform’ a certain activity [1]. It allows projects to be carried out in a systematic manner, as it defines the policies, procedures and processes that analysts should follow [2]. For example, there are methodologies such as CRISP-DM [3] and SEMMA [4] for data mining (DM) and USAM for Health Analytics (HA) [5].

Even though there are visible benefits of using an analytics or DM methodology, such methodologies are not widely diffused among practitioners. Studies of software engineering (SE) methodologies have found that user resistance is the main reason for not using new methodologies [6, 7]. Thus, it is important to understand the essential factors affecting methodology acceptance, and this study aims to understand an individual’s attitude towards using a methodology. Since the initial decision to adopt is considered to be made at the individual user level, this study was performed at the individual level rather than at the organizational level. Furthermore, most of these projects are carried out by one or two individuals within the organization (interacting with many stakeholders), so decisions will be made at the individual level, based on personal preferences, rather than at the organizational level. In this paper, the target users are novice users, as they are new to the analytic process and their learning curve will be steep if there is no methodology to follow. Thus, undergraduate students who were learning the techniques were used as novice users.

In this study, the focus was on the perception of the technical aspects of the methodology (methodological attributes) instead of the actual primary methodological attributes. It was considered that users’ perception of the artifact will depend on how they perceive these primary attributes [8], and that an individual’s perception of an innovation’s potential effect on his/her work will affect the intention to use it [6]. Potential individual novice users will adopt the methodology based on their perception [9] of how its attributes fulfill their requirements. The research question (RQ) is, “what methodological attributes affect the novice analyst’s decision to use a methodology”. Thus, this study makes it possible to understand an individual’s attitude towards using an analytic methodology that is deemed suitable for its users.

2 Conceptual Background

Even though there are several DM methodologies, there is a dearth of empirical studies on the adoption of such methodologies. The available studies are confined to case studies carried out in an organizational context on the adoption of business intelligence (e.g. [10]). Thus, it was necessary to examine the literature on SE methodology adoption. Several authors have carried out empirical studies on the adoption of an SE methodology by individual users in an organization. Even among those studies, most were carried out as case studies [11].

Recently, researchers have started to look at methodologies as innovations, provided they are perceived to be new by the potential users [8]. Most authors have carried out these user acceptance studies from a technology acceptance and innovation diffusion perspective (treating the methodology as a technology innovation rather than as a process) [2, 8]. For example, they have used the Technology Acceptance Model (TAM) [12] and Diffusion of Innovation (DOI) [13] to examine technical characteristics of the methods [7, 14]. Also, Raghavan and Chand [15] suggested that DOI is suitable for methodological acceptance studies [6]. Earlier, however, the prevailing notion was that these theories apply to the acceptance and diffusion of products, not practices. In a similar vein, Riemenschneider et al. [7] used TAM, TAM2, the Theory of Planned Behavior (TPB), Perceived Characteristics of Innovating (PCI) and the Model of Personal Computer Utilization (MPCU) to examine the acceptance of SE processes and found the relationships of perceived usefulness, voluntariness, compatibility and subjective norm with intention to use the SE process to be significant. Hardgrave et al. [6] reported similar findings using TAM and DOI.

It was decided to draw on Rogers’ DOI theory, together with TAM, as the theoretical foundation of this study. DOI was selected for several reasons. First, according to DOI, an innovation’s adoption rate is determined most extensively by its characteristics. Second, DOI is applied at the individual level. Third, as previously mentioned, DOI has been used in studying methodological characteristics [6]. Thus, it provides the necessary theoretical basis to study the RQ. In previous methodological studies, DOI characteristics have given mixed results regarding the significance of their influence on adoption [6, 7]. Similarly, TAM provides a suitable theoretical foundation for intention to use, based on the ease of use and usefulness of the innovation [12].

On the other hand, some authors have examined the effect of organizational characteristics on the acceptance of SE processes. They have shown that organizational culture [16], management support, training and external support influence acceptance [17]. In this study, organizational characteristics were not considered, as the participants were undergraduate students without prior work experience. Johnson et al. [18] identified a list of beliefs underlying the formation of the intention to use object-oriented development; it includes several usefulness elements such as process usefulness and communication usefulness. According to Nambisan [19], IT plays four roles in new product development (NPD) in IS, namely process management, project management (PM), communication management (CM) and knowledge management (KM). The latter three can be considered supporting dimensions of process management. Thus, the perceived usefulness of each of these three dimensions can be considered a separate usefulness element.

3 Research Model and Hypotheses

The proposed research model, developed from the conceptual background outlined above, is presented in Fig. 1. The dependent variable is the intention to use a methodology. According to Rogers [13], the perceived characteristics of innovations are relative advantage, compatibility, complexity (replaced here by ease of use), trialability and observability (replaced here by result demonstrability). These variations to the characteristics were made based on the prior literature and the context studied. The justification for each replacement is given in the subsequent sections. Process management (the analytical data model development process) is represented by the five model characteristics. The final two constructs represent the usefulness of supporting elements to the main model development process.

Fig. 1. Research model

Ease of Use. Ease of use refers to ‘the degree to which a person believes that using a particular system would be free of effort’ [20]. Ease of use has been used to address the complexity construct in the technology adoption literature [8]. As such, ease of use is considered here instead of complexity [9]. The decision to use a methodology will depend on whether it is perceived to be easy to understand and use. Therefore, if users find that a methodology is free of mental and physical effort and easy to learn, they are likely to use it.

HYPOTHESIS 1 (H1): Ease of use has a positive effect on the intention to use a methodology

Relative Advantage. Relative advantage refers to ‘the degree to which an innovation is perceived as being better than its precursor’ [9]. Here, it is the advantage of methodologies like CRISP-DM over an ad hoc approach. The merit of a methodology can be measured through improvement of status, through improvement of efficiency and productivity [6, 8], or through meeting its intended purpose [9]. Similarly, perceived usefulness in TAM is conceptually equivalent to relative advantage [9]. A structured process is expected to improve the application of analytics techniques to the processed data according to user requirements, and to produce better results with a low learning curve, which would not be possible with an ad hoc approach. Therefore, if novice users find that using a methodology for analytics will be useful for their work, there is a prospect of deploying it successfully.

HYPOTHESIS 2 (H2): Relative advantage has a positive effect on the intention to use a methodology

Compatibility. Compatibility refers to ‘the degree to which an innovation has been consistent with existing values, needs, and past experiences of potential adopters’ [9]. If an individual is used to certain habits, there may be resistance from users towards a new process. In analytics, if the users are used to their own personal styles of carrying out analytics projects which have been developed based on their experiences, they may find it hard to change their practices. Even for novice users, if there is a certain style learnt earlier, they may find it hard to deviate from it as it is the initial practice that had been engraved in them. Therefore, if the methodology is compatible with past experiences and learning of the users, they will use a new methodology.

HYPOTHESIS 3 (H3): Compatibility has a positive effect on the intention to use a methodology

Result Demonstrability. Result demonstrability refers to ‘the degree to which the results of using an innovation are observable by others’ [8, 9]. Thus, as indicated by Moore and Benbasat [9], if the methodology is perceived to provide observable results that can be communicated, the results are considered demonstrable. As highlighted in other methodological domains [8], poor communication of usage benefits and poor quantification of results in an analytic method make it hard for others to see the results. Novice users in particular will be concerned about the quantification of results. Therefore, if the results are demonstrable, novice users will intend to use a methodology.

HYPOTHESIS 4 (H4): Result demonstrability has a positive effect on the intention to use a methodology

Trialability. Trialability refers to ‘the degree to which an innovation may be experimented with before adoption’ [9]. The ability to test the method before making the final decision allows users to make an informed decision about it. It also allows users to understand the uncommunicated benefits of the method [8]. Therefore, if novice users can try out a methodology before adopting it, there will be a positive influence on the prospect of using it.

HYPOTHESIS 5 (H5): Trialability has a positive effect on the intention to use a methodology

Usefulness. Perceived usefulness is ‘the degree to which an individual expects that following a methodology will improve job performance’ [6]. In analytics projects too, PM, KM and CM play a key role. Since little or no attention has been given to CM in existing methodologies, it is not considered in this study, even though result demonstrability covers some attributes of CM. As the perceived usefulness of the process is evaluated through relative advantage from DOI [9], process management was not considered here. Thus, only the influence of the usefulness of PM and KM on the intention to use the process model is considered.

Considering the risk involved in analytic projects, having PM elements in the process model is useful [21]. PM is to establish reasonable plans for performing and managing the project [22] and it includes estimating the work to be performed (milestones), identifying necessary resources and creating schedules. In considering the uncertainty involved in analytic outputs, PM is useful in scheduling the resources and keeping the project on track. Therefore, novice users will find PM useful to plan out and perform their tasks.

HYPOTHESIS 6 (H6): Usefulness of project management has a positive effect on the intention to use a methodology

KM is an important part of a methodology. Chan and Thong [2] considered KM as a strategic perspective in the implementation of agile methodologies in SE. Similarly, in an analytic methodology, achieving positive KM outcomes (creation, retention and transfer of knowledge) is crucial for learning and for replicating best practices [23]. The success of an analytic project depends on how knowledge is retained within the project team and how it is transferred to team members. Therefore, a suitable means for KM will help novice users cope with and adapt to the organizational context in less time, thus increasing their intention to use a methodology.

HYPOTHESIS 7 (H7): Usefulness of knowledge management has a positive effect on the intention to use a methodology

4 Research Methodology

As described earlier, students following two courses relevant to analytics were used for the survey. It was carried out among undergraduate students studying a module related to HA and a module related to business intelligence at a local university with around 30,000 students. As a requirement of the modules, they were assigned research papers related to analytics to read every week. Thus, those students were considered to have sufficient understanding of analytics and, as they were new to the analytic context, we considered them novice users. The survey was carried out at the end of the semester (during the last lecture), on the assumption that the students would have gained a satisfactory idea of their subject through lectures, assignments and reading material (research papers). Even though both modules are related to analytics, certain differences between them increase the generalizability of the results: one module deals with analytics in general, while the other is designed specifically for HA.

4.1 Operationalization of Constructs

To develop the survey instrument, existing validated scales were used (Footnote 1). To measure the intention to use a methodology, scales were adapted from Venkatesh et al. [24] to the research context of analytics. Items for compatibility and usefulness were adapted from Hardgrave et al. [6]. Items from previous literature were adapted to measure the other perceived characteristics of a methodology [9]. A seven-point Likert scale ranging from 1 (strongly disagree) to 7 (strongly agree) was used in the questionnaire for all constructs except usage intention. Usage intention was measured using a scale ranging from 1 (no) to 3 (yes). In addition, gender was used as a control in the model analysis. To ensure the appropriateness of the questions, the questionnaire was reviewed by three IS researchers prior to the actual survey. A separate pilot study was then conducted among 20 third- and fourth-year undergraduate students to improve the validity and reliability of the instrument.

4.2 Data Collection

The survey participants were undergraduate students studying analytics in two courses. The questionnaire was given to the students as a paper-based survey. It was decided not to use an online survey, as the students might not be receptive to it and there was a high chance of delayed responses. Even though online surveys are flexible, can be created and distributed easily (via email or social networks), and allow data to be collected and organized very swiftly, we used the paper-based survey to reach all the selected students. Participation in the survey was nevertheless entirely voluntary. The questionnaire was distributed during the break of the lesson on the last day of the module at the end of the semester, with prior permission from the respective lecturers. A verbal explanation of three to four minutes on what an analytic methodology is and on the survey itself was given, in addition to the explanation of CRISP-DM on the front page of the questionnaire.

A total of 114 completed and valid responses were collected. As a general rule, the minimum sample size should be at least 10 times the number of constructs [25, 26]. As there are only seven constructs, the sample size of 114 is considered adequate. The correlations of the sample are given in Table 1. The descriptive statistics indicate that the students are between 20 and 28 years of age (mean 23.75 years, standard deviation 1.75).
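The rule of thumb above can be checked with simple arithmetic. The following is a minimal sketch only; the function names are illustrative assumptions and do not come from the study or from any statistics package.

```python
def min_sample_size(num_constructs: int, multiplier: int = 10) -> int:
    """Minimum sample size under the 'ten times the constructs' rule of thumb."""
    if num_constructs <= 0:
        raise ValueError("num_constructs must be positive")
    return num_constructs * multiplier


def sample_is_adequate(sample_size: int, num_constructs: int,
                       multiplier: int = 10) -> bool:
    """True when the collected sample meets the rule-of-thumb minimum."""
    return sample_size >= min_sample_size(num_constructs, multiplier)


# Values reported in the text: seven constructs, 114 valid responses.
required = min_sample_size(7)
adequate = sample_is_adequate(114, 7)
print(f"required >= {required}, collected = 114, adequate = {adequate}")
```

With seven constructs the rule requires at least 70 responses, so the 114 collected responses satisfy it.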

Table 1. Correlations

5 Data Analysis and Results

The data analysis was performed using the partial least squares (PLS) technique with SmartPLS. PLS was selected because it allows the measurement model (relationships between items and constructs) and the structural model (relationships among constructs) [27] to be analyzed with multi-item constructs, and because it is less restrictive on sample size than covariance-based structural equation modeling (SEM) [28]. Since PLS is primarily intended for the early stages of theory development [27], and as this is one of the first attempts at a causal predictive analysis of the behavioral intention to use a methodology for analytics, PLS was considered suitable for this study.

5.1 Instrument Validation

The convergent validity and discriminant validity of the constructs were assessed to demonstrate construct validity. Convergent validity, which indicates the extent to which two or more items measure the same construct, is examined using (1) the standardized path loadings of items, (2) composite reliability (CR), and (3) average variance extracted (AVE) [28]. The standardized path loadings are significant (t-value > 1.96) and exceed the threshold of 0.7. It is considered appropriate to have at least 0.7 for CR and 0.5 for AVE [28]. Based on the results, the convergent validity of the constructs was acceptable. The square root of the AVE of each construct and the CR are shown in Table 1. Discriminant validity indicates the degree to which items that measure different constructs differ [25]. It is satisfied when the square root of the AVE of each construct is greater than its correlations with the other constructs (Table 1) [28]. Based on the results, discriminant validity is supported.
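The thresholds above can be illustrated with the commonly used formulas for AVE (the mean of the squared standardized loadings) and CR (the squared sum of loadings divided by that quantity plus the summed error variances), together with the square-root-of-AVE comparison for discriminant validity. This is a sketch under stated assumptions: the loadings, correlations and construct labels below are hypothetical, the function names are illustrative, and the study itself obtained these statistics from SmartPLS. The significance of the loadings (t-value > 1.96) is not covered here.

```python
import math


def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized loadings of a construct."""
    return sum(l * l for l in loadings) / len(loadings)


def composite_reliability(loadings):
    """CR: (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    total = sum(loadings)
    error = sum(1 - l * l for l in loadings)
    return total ** 2 / (total ** 2 + error)


def convergent_validity_ok(loadings, min_loading=0.7, min_cr=0.7, min_ave=0.5):
    """Apply the thresholds named in the text: loadings and CR >= 0.7, AVE >= 0.5."""
    return (all(l >= min_loading for l in loadings)
            and composite_reliability(loadings) >= min_cr
            and average_variance_extracted(loadings) >= min_ave)


def discriminant_validity_ok(ave_by_construct, correlations):
    """sqrt(AVE) of each construct must exceed its correlations with the others."""
    for (a, b), r in correlations.items():
        smaller_root = min(math.sqrt(ave_by_construct[a]),
                           math.sqrt(ave_by_construct[b]))
        if abs(r) >= smaller_root:
            return False
    return True


# Hypothetical loadings for two constructs (NOT values from the study).
loadings = {"EOU": [0.80, 0.80, 0.80], "RA": [0.75, 0.85, 0.80]}
ave = {name: average_variance_extracted(ls) for name, ls in loadings.items()}
for name, ls in loadings.items():
    print(name, round(ave[name], 3), round(composite_reliability(ls), 3),
          convergent_validity_ok(ls))

# Hypothetical inter-construct correlation (NOT a value from Table 1).
print(discriminant_validity_ok(ave, {("EOU", "RA"): 0.5}))
```

Keeping the two checks as separate functions mirrors the text, which assesses convergent validity per construct and discriminant validity per pair of constructs.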

5.2 Hypotheses Testing

After establishing the instrument validity, PLS was used for hypothesis testing. Gender was used as the control variable, as males are expected to be more willing to take advantage of available opportunities [29] and to prefer a structured process. Age was not considered as a control variable, as all the users are from the same age category. Path coefficients and significant results are indicated in Fig. 2. Perceived relative advantage, result demonstrability, trialability and usefulness of KM show a significant effect on the intention to use the methodology for analytics. However, the direction of the relationship between trialability and intention to use is negative (path coefficient = -0.047), and as such hypothesis H5 is not supported. All the other significant relationships indicate a positive influence, and as such H2, H4 and H7 are supported. The explanatory power (R²) is 0.31, which is above the threshold of 0.10 specified by Falk and Miller [30].
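The decision rule applied above can be written out explicitly: a hypothesis of a positive effect is treated as supported only when the path is both significant and positive, and the model's explanatory power is compared against the 0.10 threshold. The sketch below uses only the two numbers reported in the text (the trialability path coefficient and R²); the function names are illustrative assumptions.

```python
def hypothesis_supported(path_coefficient: float, significant: bool,
                         expected_sign: int = 1) -> bool:
    """Supported only if the path is significant and has the hypothesized sign."""
    return significant and path_coefficient * expected_sign > 0


def r_squared_acceptable(r_squared: float, threshold: float = 0.10) -> bool:
    """Compare explanatory power with the minimum threshold cited in the text."""
    return r_squared >= threshold


# H5: trialability is reported as significant but negative (-0.047).
print("H5 supported:", hypothesis_supported(-0.047, significant=True))

# Reported explanatory power of the model.
print("R^2 acceptable:", r_squared_acceptable(0.31))
```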

Fig. 2. Results of hypothesis tests

6 Discussion

Several important relationships were found in this study. First, characteristics such as relative advantage and result demonstrability are important attributes of a process. Novice users may want to gain a relative advantage over others by using a methodology. They will see that using a method enables them to kick-start the project rather than going in ad hoc directions. Similar results can be observed in previous studies of methodology adoption. Relative advantage is consistently the only significant attribute in those studies, while the other attributes are insignificant [8]. This is also borne out by the study of Riemenschneider et al. [7], which used five theoretical models: perceived usefulness (referred to as relative advantage in DOI) was the only construct significant in all models. Novice users will want to see their progress shown or demonstrated, and to have the possibility of showing it to their seniors. In particular, this will be a motivator and will allow them to get further assistance from senior analysts.

Second, novice users find the KM components of a methodology useful. Thus, having documentation will be useful in managing the creation and transfer of knowledge. In their study, Chan and Thong [2] also indicate the usefulness of KM in SE methodology usage. However, it is important to explore how KM is used in successful data analytic teams.

Third, it is interesting to note that trialability shows a negative relationship, albeit one of low significance. Individuals might not try out an innovation if they perceive risks in doing so or if there is no continued accessibility [31]. Accessibility should be provided through proper information management (access to specific information on usage, e.g. a user manual). Furthermore, it is hard to explore a process without actually using it in a real context, which may explain the negative relationship. This can be explored further together with KM.

Fourth, compatibility and ease of use do not show significant relationships. Hardgrave et al. [6] found the relationship between compatibility and SE methodology usage to be significant but weak. Methodologies like CRISP-DM are introduced independently of the data, analytic tools or analytic algorithms being used. As such, compatibility may not be a relevant issue. However, if a practice is more compatible with the type of projects being carried out and with existing work practices, users will be more willing to use a process model [6]. As such, when developing new methodologies, it is important to include in the process model components that are closely aligned with actual settings and project types. It is interesting to note that ease of use (complexity in DOI) was not significant in any of the five models used by Riemenschneider et al. [7]. Hardgrave et al. [6] found similar results in their study. This differs from the technology acceptance studies [2]. Rather than ease of use, greater focus should be given to providing a comprehensive and complete specification of the phases and tasks to be followed in the full data analytic process.

Finally, the relationship with usefulness of PM is not significant. For novice users, PM may not be useful in carrying out their university projects. However, in real settings PM is important. Thus, it is essential to explore how PM can be incorporated in the model in a useful manner to the novice user starting projects in real organizational settings.

This study has several limitations, which suggest directions for future research. First, additional antecedents and interaction effects could have been considered. For example, personal characteristics and individual needs could have been considered as factors affecting the motivation to use the method. Mohan and Ahlemann [8] conceptualize that acceptance of a methodology depends on individual needs, which motivate individuals to use the methods; they considered individual needs as moderators. Second, a larger sample could be used to further test the robustness of the results, and the study could be extended to other user groups, such as newly hired employees in an analytics organization.

7 Conclusion

This paper described how a survey was used to explore the methodological attributes that are important for the adoption of a data analytic methodology by novice users. The conceptual model used for hypothesis testing was developed based on DOI and TAM. The survey data showed that the relative advantage and result demonstrability of the analytical model development process, as well as the usefulness of knowledge management, are the attributes affecting the intention to use an analytics methodology. Using these findings in developing a methodology will help reduce user resistance to its adoption.