Enterprise’s internal control for knowledge discovery in a big data environment by an integrated hybrid model

Published in Information Technology and Management

Abstract

This research aims to (1) identify the critical risk factors that influence the governance of enterprise internal control in a big data environment, (2) depict the intertwined and complicated relationships among these risk factors, and (3) yield an attainable target for performance improvement over both the short term and the long term. To address these challenging issues, we propose an innovative hybrid decision architecture that combines artificial intelligence-based rule generation techniques with a multiple attribute decision making approach, herein called multiple rule-base decision making. Examining real cases, our study shows that the control environment and information technology (IT) control construction are the top dimension and criterion, respectively. This finding can serve as a reference for managing and controlling risk factors in a big data environment. In future improvements to internal control and IT governance, the related factors can also be viewed as essential requirements for enterprises conducting effective internal control and audit inspection, which can help achieve more audit success and fewer lawsuits.

References

  1. Gantz J, Reinsel D (2011) Extracting value from chaos. IDC Review, pp 1–12

  2. French AM, Shim JP (2016) The digital revolution: internet of things, 5G, and beyond. Commun Assoc Inf Syst 38:40

  3. Lin R, Xie Z, Hao Y, Wang J (2020) Improving high-tech enterprise innovation in big data environment: a combinative view of internal and external governance. Int J Inform Manage 50:575–585

  4. Chang SI, Chang LM, Liao JC (2020) Risk factors of enterprise internal control under the internet of things governance: a qualitative research approach. Inform Manage 57(6):103335

  5. Beasley MS, Clune R, Hermanson DR (2005) Enterprise risk management: an empirical analysis of factors associated with the extent of implementation. J Account Public Pol 24(6):521–531

  6. Mikalef P, Wetering R, Krogstie J (2020) Building dynamic capabilities by leveraging big data analytics: the role of organizational inertia. Inform Manage 58(6):103412

  7. Chan HC (2015) Internet of things business models. J Serv Sci Manag 8(4):552–568

  8. Institute of Internal Auditors (IIA) (2011) Practice guide: auditing the control environment (April 2011). Available at: http://www.iia.nl/SiteFiles/IIA_leden/Auditing_the_Con-trol_Environment.pdf

  9. Nuijten MB, van Assen MALM, Augusteijn HEM, Crompvoets EAV, Wicherts JM (2018) Effect sizes, power, and biases in intelligence research: a meta-meta-analysis. J Intel 8(4):36. https://doi.org/10.3390/jintelligence8040036

  10. Mikalef P, Boura M, Lekakos G, Krogstie J (2020) The role of information governance in big data analytics driven innovation. Inform Manage 57(7):103361

  11. Committee of Sponsoring Organizations of the Treadway Commission (COSO) (2013) Internal control—integrated framework executive summary. Available at: https://www.coso.org/Pages/ic.aspx

  12. Graham L (2015) Internal control audit and compliance: documentation and testing under the new COSO framework. Wiley, Hoboken, New Jersey

  13. Rubino M, Vitolla F, Garzoni A (2017) The impact of an IT governance framework on the internal control environment. Records Manag J 27(1):19–41

  14. Lee KM, Ra I (2020) Data privacy-preserving distributed knowledge discovery based on the blockchain. Inf Technol Manag 21(4):191–204

  15. Gandomi A, Haider M (2015) Beyond the hype: big data concepts, methods, and analytics. Int J Inf Manage 35:137–144

  16. Al-Laith AAG (2012) Adaptation of the internal control systems with the use of information technology and its effects on the financial statements reliability: an applied study on commercial banks. Int Manag Rev 8(1):12–20

  17. Chiu T, Wang T (2019) The COSO framework in emerging technology environments: an effective in-class exercise on internal. J Emer Tech Account 16(2):89–98

  18. Hunziker S (2017) Efficiency of internal control: evidence from Swiss non-financial companies. J Manag Gov 21:399–433

  19. Lin Y, Wang Y, Chiou J, Huang H (2014) CEO characteristics and internal control quality. Corp Gov 22(1):24–42

  20. Kim JC, Chung K (2020) Knowledge-based hybrid decision model using neural network for nutrition management. Inf Technol Manag 21:29–39

  21. Weber RH (2010) Internet of Things: new security and privacy challenges. Comput Law Secur Rev 26(1):23–30

  22. Gubbi J, Buyya R, Marusic S, Palaniswami M (2013) Internet of Things (IoT): a vision, architectural elements, and future directions. Future Gener Comp Syst 29(7):1645–1660

  23. Debreceny RS, Gray GL (2013) IT governance and process maturity: a multinational field study. J Inf Syst 27(1):157–188

  24. Ghoshal A, Hao J, Menon S, Sarkar S (2020) Hiding sensitive information when sharing distributed transactional data. Inf Syst Res 31(2):473–490

  25. Lee KM, Ra I (2020) Data privacy-preserving distributed knowledge discovery based on the blockchain. Inf Technol Manag 21:191–204

  26. Torrecilla-Salinas CJ, De Troyer O, Escalona MJ, Mejías M (2019) A Delphi-based expert judgment method applied to the validation of a mature Agile framework for Web development projects. Inf Technol Manag 20:9–40

  27. Hsu MF, Lin SJ (2021) A BSC-based network DEA model equipped with computational linguistics for performance assessment and improvement. Int J Mach Learn Cyber. https://doi.org/10.1007/s13042-021-01331-7

  28. Panwar S, Kapur PK, Singh O (2021) Predicting diffusion dynamics and launch time strategy for mobile telecommunication services: an empirical analysis. Inf Technol Manag 22:33–51

  29. Pawlak Z (1982) Rough sets. Int J Comput Inf Sci 11:341–356

  30. Salehi M, Shiri MM, Hossini SZ (2020) The relationship between managerial ability, earnings management and internal control quality on audit fees in Iran. Int J Prod Perf Manag 69(4):685–703

  31. Jensen R, Shen Q (2005) Fuzzy-rough data reduction with ant colony optimization. Fuzzy Sets Syst 149(1):5–20

  32. Bonabeau E, Dorigo M, Theraulaz G (1999) Swarm intelligence: from natural to artificial systems. Oxford University Press US

  33. Hu KH, Lin SJ, Hsu MF, Chen FH (2020) A dynamic network-based decision architecture for performance evaluation and improvement. J Intell Fuzzy Syst 39(3):4299–4311

  34. Hu X, Sun B, Chen X (2020) Double quantitative fuzzy rough set-based improved AHP method and application to supplier selection decision making. Int J Mach Learn Cyber 11:153–167

  35. Nowak-Brzezińska A, Wakulicz-Deja A (2019) Exploration of rule-based knowledge bases: a knowledge engineer’s support. Inf Sci 485:301–318

  36. Ma XA (2021) Fuzzy entropies for class-specific and classification-based attribute reducts in three-way probabilistic rough set models. Int J Mach Learn Cyber 12:433–457

  37. Anand A, Sharma R, Kohli R (2020) The effects of operational and financial performance failure on BI&A-enabled search behaviors: a theory of performance-driven search. Inf Syst Res 31(4):1144–1163

  38. Pfeiffer J, Pfeiffer T, Meißner M, Weiß E (2020) Eye-tracking-based classification of information search behavior using machine learning: evidence from experiments in physical shops and virtual reality shopping environments. Inf Syst Res 31(3):675–691

  39. Chen Z, Ming X, Zhang X, Yin D, Sun Z (2019) A rough-fuzzy DEMATEL-ANP method for evaluating sustainable value requirement of product service system. J Clean Prod 228:485–508

  40. Xu Z, Wei C (1999) A consistency improving method in the analytic hierarchy process. Eur J Oper Res 116(2):443–449

  41. Hu KH, Chen FH, Hsu MF, Tzeng GH (2021) Identifying key factors for adopting artificial intelligence-enabled auditing techniques by joint utilization of fuzzy-rough set theory and MRDM technique. Technol Econ Dev Econ 27(2):459–492

  42. Lu Y, Cao Y (2018) The individual characteristics of board members and internal control weakness: evidence from China. Pac-Basin Financ J 51:75–94

  43. Hu G, Yuan R, Xiao JZ (2017) Can independent directors improve internal control quality in China? Eur J Financ 23(7–9):626–647

  44. Ashfaq K, Rui Z (2019) The effect of board and audit committee effectiveness on internal control disclosure under different regulatory environments in South Asia. J Financ Report Account 17(2):170–200

  45. Yu J, Jin X, Liang SK (2017) Does the geographical proximity between the chairman and the CEO affect internal control quality? China J Account Stud 5(3):344–360

  46. Wang J (2015) An empirical study of the effectiveness of internal control and influencing factors. Manag Eng 18:1838–5745

  47. Guiso L, Sapienza P, Zingales L (2015) The value of corporate culture. J Financ Econ 117(1):60–76

  48. Chen Y, Knechel WR, Marisetty VB (2017) Board independence and internal control weakness: evidence from SOX 404 disclosures. Audit J Pract Theory 36(2):45–62

  49. Turedi H, Celayir D (2018) Role of effective internal control structure in achievement of targeted success in businesses. Eur Sci J 14(1):1–18

  50. Cai C, Mei S, Zhong W (2019) Configuration of intrusion prevention systems based on a legal user: the case for using intrusion prevention systems instead of intrusion detection systems. Inf Technol Manag 20:55–71

  51. D’Aquila J, Houmes R (2014) COSO’s updated internal control and enterprise risk management frameworks: applying the concepts to governments and not-for-profit organizations. The CPA J 84(5):54–59

  52. Deane JK, Goldberg DM, Rakes TR, Rees LP (2019) The effect of information security certification announcements on the market value of the firm. Inf Technol Manag 20:107–121

  53. Bruwer JP, Coetzee P, Meiring J (2018) Can internal control activities and managerial conduct influence business sustainability? A South African SMME perspective. J Small Bus Enterp Dev 25(5):710–729

  54. Michelon G, Bozzolan S, Beretta S (2015) Board monitoring and internal control system disclosure in different regulatory environments. J Appl Account Res 16(1):138–164

  55. Birnberg JG, Zhang Y (2011) When betrayal aversion meets loss aversion: the effects of changes in economic conditions on internal control system choices. J Manage Account Res 23(1):169–187

  56. Mullakhmetov K (2016) Control in the system of managerial decisions procedures: a conceptual view. Probl Perspect Manage 14(3):64–76

  57. Ji Y, Kumar S, Mookerjee V (2016) When being hot is not cool: monitoring hot lists for information security. Inf Syst Res 27(4):897–918

  58. Akhmetshin EM (2017) The system of internal control as a factor in the integration of the strategic and innovation dimensions of a company’s development. J Adv Res Law Econ 6(28):1684–1692

  59. Djalil M, Nadirsyah SE, Yahya MR, Jalaluddin J, Ramadhanti SV (2017) The effect of used information technology, internal control, and regional accounting system on the performance of city governance agency of Banda Aceh city, Indonesia. Broad Res Account Negot Distrib 8(1):25–37

  60. Kuhn JR, Morris B (2017) IT internal control weaknesses and the market value of firms. J Enterp Inf Manag 30(6):964–986

  61. Harp NL, Barnes BG (2018) Internal control weaknesses and acquisition performance. Account Rev 93(1):235–258

  62. Ettish AA, EL-Gazzar SM, Jacob RA (2017) Integrating internal control frameworks for effective corporate information technology governance. J Inf Syst Technol Manage 14(3):361–370

  63. Weng TC, Chi HY, Chen GZ (2015) Internal control weakness and information quality. J Appl Financ Bank 5(5):135–169

  64. Ionescu L (2011) Monitoring as a component of internal control systems. Manag Financ Mark 6(2):800–804

  65. D’Aquila J (2013) COSO’s internal control integrated framework updating the original concepts for today’s environment. The CPA J 83(10):22–29

  66. Fu HP, Yeh H, Ma RL (2018) A study of the CSFs of an e-cluster platform adoption for microenterprises. Inf Technol Manag 19:231–243

  67. Saaty TL (1996) Decision making with dependence and feedback: analytic network process. RWS Publications, Pittsburgh

  68. Manaligod HJT, Diño MJS, Jo S et al (2020) Knowledge discovery computing for management. Inf Technol Manag 21:61–62

  69. Thangavel K, Karnan M, Pethalakshmi A (2005) Performance analysis of rough reduct algorithms in mammogram. Int J Glob Vis Image Process 5(8):13–21

  70. Opricovic S, Tzeng GH (2004) Compromise solution by MCDM methods: a comparative analysis of VIKOR and TOPSIS. Eur J Oper Res 156:445–455

  71. Alaeddini M, Mir-Amini M (2020) Integrating COBIT with a hybrid group decision-making approach for a business-aligned IT roadmap formulation. Inf Technol Manag 21(2):63–94

  72. Li J, Wang Jq, Hu Jh (2019) Multi-criteria decision-making method based on dominance degree and BWM with probabilistic hesitant fuzzy information. Int J Mach Learn Cyber 10:1671–1685

  73. Pinar A, Boran FE (2020) A q-rung orthopair fuzzy multi-criteria group decision making method for supplier selection based on a novel distance measure. Int J Mach Learn Cyber 11:1749–1780

  74. Höhle U (1984) Compact G-Fuzzy topological spaces. Fuzzy Sets Syst 13:39–63

  75. Hu X, Sun B, Chen X (2020) Double quantitative fuzzy rough set-based improved AHP method and application to supplier selection decision making. Int J Mach Learn Cyber 11(2):153–167

  76. Cornelis C, Medina J, Verbiest N (2014) Multi-adjoint fuzzy rough sets: definition, properties and attribute selection. Int J Approx Reason 55:412–426

  77. Jensen R, Cornelis C (2011) Fuzzy-rough nearest neighbour classification and prediction. Theor Comput Sci 412:5871–5884

  78. Saha J, Mukherjee J (2021) CNAK: cluster number assisted K means. Pattern Recognit 110:107625

  79. Chung K, Jung H (2020) Knowledge-based dynamic cluster model for healthcare management using a convolutional neural network. Inf Technol Manag 21:41–50

  80. Ni T, Qiao M, Chen Z, Zhang S, Zhong H (2021) Utility-efficient differentially private K-means clustering based on cluster merging. Neurocomputing 424:205–214

  81. Sarens G, Christopher J (2010) The association between corporate governance guidelines and risk management and internal control practices: evidence from a comparative study. Manag Audit J 25(4):288–308

Acknowledgements

This article was supported by Department of Education of Guangdong, China (No. 2020WTSCX139).

Author information

Corresponding author

Correspondence to Ming-Fu Hsu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: ACO-FRST

ACO-FRST is briefly summarized as follows.

In the same manner that crisp equivalence classes are the core element of rough set theory (RST), fuzzy equivalence classes are the core element of fuzzy rough set theory (FRST) [72, 73]. By taking a fuzzy similarity relation \(B\) on the universe into consideration, the crisp equivalence classes can be extended to fuzzy ones. Based on this relation, the fuzzy equivalence class \(\left[ g \right]_{B}\) of the instances close to \(g\) can be expressed as:

$$\mu_{{\left[ g \right]_{B} }} (h) = \mu_{B} (g,h)$$
(A.1)

The following axioms hold for a fuzzy equivalence class \(F\) [74]:

$$\begin{gathered} \exists g,\; \mu_{F} (g) = 1 \hfill \\ \mu_{F} (g) \wedge \mu_{B} (g,h) \le \mu_{F} (h) \hfill \\ \mu_{F} (g) \wedge \mu_{F} (h) \le \mu_{B} (g,h) \hfill \\ \end{gathered}$$

The first axiom indicates that an equivalence class is non-empty. The second axiom states that the elements in \(h\)’s neighborhood are in the equivalence class of \(h\). The last axiom states that any two elements in \(F\) are related via \(B\). The fuzzy \(S\)-lower and \(S\)-upper approximations are expressed in Eqs. (A.2) and (A.3), respectively.

$$\mu_{{\underline{S} G}} (F_{i} ) = \inf_{g} \max \left\{ {1 - \mu_{{F_{i} }} (g),\mu_{G} (g)} \right\}\quad \forall i$$
(A.2)
$$\mu_{{\overline{S} G}} (F_{i} ) = \sup_{g} \min \left\{ {\mu_{{F_{i} }} (g),\mu_{G} (g)} \right\}\quad \forall i$$
(A.3)

where \(F_{i}\) denotes a fuzzy equivalence class belonging to the partition of the universe \({\mathbb{R}}\) induced by a given feature subset \(S\), and \(G\) is the fuzzy concept being approximated.
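As a minimal sketch of Eqs. (A.2) and (A.3), the following Python fragment assumes a finite universe, so that the infimum and supremum reduce to `min` and `max`. The function names and the toy membership values are illustrative assumptions and are not taken from the study.

```python
from typing import Sequence


def lower_approx(mu_F: Sequence[float], mu_G: Sequence[float]) -> float:
    """Eq. (A.2): inf over instances g of max(1 - mu_F(g), mu_G(g))."""
    return min(max(1.0 - f, x) for f, x in zip(mu_F, mu_G))


def upper_approx(mu_F: Sequence[float], mu_G: Sequence[float]) -> float:
    """Eq. (A.3): sup over instances g of min(mu_F(g), mu_G(g))."""
    return max(min(f, x) for f, x in zip(mu_F, mu_G))


# Hypothetical memberships of four instances in a fuzzy class F_i
# and in the concept G being approximated.
mu_F = [1.0, 0.8, 0.2, 0.0]
mu_G = [0.9, 0.6, 0.7, 0.1]

low = lower_approx(mu_F, mu_G)  # the second instance is the binding one here
up = upper_approx(mu_F, mu_G)   # the first instance is the binding one here
```

The lower value never exceeds the upper value when the class contains at least one instance with full membership, which matches the first axiom above.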

With feature \(x\) as an example, the partition of the universe by \(\left\{ x \right\}\) (represented as \({\mathbb{R}}/IND\left( {\left\{ x \right\}} \right)\)) is taken as the set of fuzzy equivalence classes for that feature. In the crisp case, \({\mathbb{R}}/S\) contains sets of objects grouped together that cannot be distinguished by the two features \(x\) and \(y\). In the fuzzy case, objects may belong to several different classes, so the Cartesian product of \({\mathbb{R}}/IND\left( {\left\{ x \right\}} \right)\) and \({\mathbb{R}}/IND\left( {\left\{ y \right\}} \right)\) must be considered when computing \({\mathbb{R}}/S\). The mathematical formulation is:

$${\mathbb{R}}/S = \otimes \left\{ {x \in S:{\mathbb{R}}/IND\left( {\left\{ x \right\}} \right)} \right\}$$
(A.4)

Although the universe of discourse in feature selection is finite, this is not the case in general. Therefore, \(\sup\) and \(\inf\) are considered in prior equations.

The following equations display the new formats of fuzzy lower and upper approximations:

$$\mu_{{\underline{S} G}} (g) = \mathop {\sup }\limits_{{F \in {\mathbb{R}}/S}} \min \left( {\mu_{F} \left( g \right),\;\mathop {\inf }\limits_{{h \in {\mathbb{R}}}} \max \left\{ {1 - \mu_{F} \left( h \right),\mu_{G} \left( h \right)} \right\}} \right)$$
(A.5)
$$\mu_{{\overline{S} G}} (g) = \mathop {\sup }\limits_{{F \in {\mathbb{R}}/S}} \min \left( {\mu_{F} \left( g \right),\;\mathop {\sup }\limits_{{h \in {\mathbb{R}}}} \min \left\{ {\mu_{F} \left( h \right),\mu_{G} \left( h \right)} \right\}} \right)$$
(A.6)

In real-life applications, not all \(h \in {\mathbb{R}}\) need to be considered, only those for which \(\mu_{F} \left( h \right)\) is non-zero. The tuple \(\left\langle {\underline{S} G,\overline{S} G} \right\rangle\) is called a fuzzy-rough set. Each set in \({\mathbb{R}}/S\) denotes an equivalence class. The degree to which an instance belongs to such an equivalence class is calculated by using the conjunction of the constituent fuzzy equivalence classes \(F_{i} ,\;i = 1, \ldots ,n\):

$$\mu_{{F_{1} \cap \ldots \cap F_{n} }} \left( g \right) = \min \left( {\mu_{{F_{1} }} \left( g \right),\mu_{{F_{2} }} \left( g \right), \ldots ,\mu_{{F_{n} }} \left( g \right)} \right)$$
(A.7)
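Eqs. (A.4) and (A.7) can be read together: one class is picked per feature, and the picked classes are intersected with `min`. The sketch below illustrates this under the assumption that each per-feature partition is given as a list of membership vectors over the same instances. The function name and the toy data are hypothetical.

```python
from itertools import product
from typing import List

FuzzyClass = List[float]  # membership of every instance in one fuzzy class


def combine_partitions(partitions: List[List[FuzzyClass]]) -> List[FuzzyClass]:
    """Eq. (A.4) with Eq. (A.7): Cartesian product of the per-feature
    partitions, intersecting the chosen classes instance by instance."""
    combined = []
    for chosen in product(*partitions):
        combined.append([min(values) for values in zip(*chosen)])
    return combined


# Hypothetical partitions of three instances by two features x and y,
# each feature having two fuzzy classes.
part_x = [[1.0, 0.4, 0.0], [0.0, 0.6, 1.0]]
part_y = [[0.7, 0.7, 0.1], [0.3, 0.3, 0.9]]

classes_S = combine_partitions([part_x, part_y])  # 2 x 2 = 4 combined classes
```

The number of combined classes grows multiplicatively with the number of features, which is one reason the text turns to a search heuristic for larger feature sets.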

By the extension of the crisp positive region in RST, the membership of an instance classified into the fuzzy positive region can be represented by:

$$\mu_{{POS_{S} \left( P \right)}} (g) = \mathop {\sup }\limits_{{G \in {\mathbb{R}}/P}} \mu_{{\underline{S} G}} (g)$$
(A.8)

Instance \(g\) does not belong to the positive region only if the equivalence class it belongs to is not a constituent of the positive region. Using the concept of the fuzzy positive region, the dependency function can be expressed as:

$$\gamma ^{\prime}_{S} (P) = \frac{{\left| {\mu_{{POS_{S} (P)}} (g)} \right|}}{{\left| {\mathbb{R}} \right|}} = \frac{{\sum\nolimits_{{g \in {\mathbb{R}}}} {\mu_{{POS_{S} (P)}} (g)} }}{{\left| {\mathbb{R}} \right|}}$$
(A.9)
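The chain from Eq. (A.5) through Eq. (A.8) to Eq. (A.9) can be sketched in Python as follows. This is only an illustration for a finite universe: `classes_S` and `classes_P` stand for the classes of \({\mathbb{R}}/S\) and \({\mathbb{R}}/P\), and both names, like the toy partitions, are assumptions made for the example.

```python
from typing import List

FuzzyClass = List[float]  # membership of every instance in one fuzzy class


def lower_membership(g: int, classes_S: List[FuzzyClass], mu_G: FuzzyClass) -> float:
    """Eq. (A.5): membership of instance g in the fuzzy lower approximation of G."""
    best = 0.0
    for mu_F in classes_S:
        inner = min(max(1.0 - f, x) for f, x in zip(mu_F, mu_G))
        best = max(best, min(mu_F[g], inner))
    return best


def dependency(classes_S: List[FuzzyClass], classes_P: List[FuzzyClass]) -> float:
    """Eqs. (A.8) and (A.9): average positive-region membership over the universe."""
    n = len(classes_P[0])
    pos = [
        max(lower_membership(g, classes_S, mu_G) for mu_G in classes_P)
        for g in range(n)
    ]
    return sum(pos) / n


# Crisp sanity checks on four instances (hypothetical data).
S = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
P_same = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
P_cross = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]

gamma_same = dependency(S, P_same)    # S fully determines P
gamma_cross = dependency(S, P_cross)  # S carries no information about P
```

In the crisp limit the function reproduces the classical rough-set dependency degree, which is a quick way to check an implementation before feeding it fuzzy memberships.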

The quality of a selected subset is gauged using Eq. (A.9), and the minimal subset is determined accordingly. For a dataset with high dimensionality, FRST becomes less suitable for minimal subset determination [75]. As identifying the minimal reduct for FRST is a classical optimization task, this study considers one swarm intelligence method, called ant colony optimization (ACO), that has proven useful in optimization tasks [28, 76]. ACO simulates the foraging behavior of ants. When food is discovered, the ants secrete a special substance known as a pheromone as they transport the food back and forth. By spreading this odor, ants tell other ants which direction to follow to find the food source. This concept can transform rule extraction into an optimization problem and summarize the knowledge in a form that users can easily understand, strengthening the decision maker’s judgment.

Equation (A.10) represents the probability that an ant moves from node i to node j at time t.

$$l_{ij}^{m} (t) = \frac{{\left[ {\alpha_{ij} (t)} \right]^{\beta } \cdot \left[ {\eta_{ij} } \right]^{\lambda } }}{{\sum\nolimits_{{p \in J_{i}^{m} }} {\left[ {\alpha_{ip} (t)} \right]^{\beta } \cdot \left[ {\eta_{ip} } \right]^{\lambda } } }}$$
(A.10)

Here, \(m\) indexes the ants, \(J_{i}^{m}\) represents the set of nodes not yet visited by ant \(m\), \(\eta_{ij}\) expresses the heuristic desirability of selecting node \(j\) when at node \(i\), and \(\alpha_{ij} (t)\) indicates the amount of pheromone on edge \(\left( {i,j} \right)\). The parameters \(\beta\) and \(\lambda\) are determined manually. The pheromone on each edge can be updated by utilizing the following equation.

$$\alpha_{ij} (t + 1) = (1 - \rho ) \cdot \alpha_{ij} (t) + \Delta \alpha_{ij} (t)$$
(A.11)

Here, \(\Delta \alpha_{ij} (t) = \sum\nolimits_{m = 1}^{n} {\left( {\gamma ^{\prime}\left( {G^{m} } \right)/\left| {G^{m} } \right|} \right)}\).

The term \(\rho\) represents the evaporation coefficient of the pheromone, \(n\) is the number of ants, and \(G^{m}\) is the feature subset determined by ant \(m\). The pheromones are updated based on the goodness of the feature subset (\(\gamma ^{\prime}\)) in FRST. More detailed illustrations of ACO-FRST can be found in [28, 76, 77].
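The two ACO rules, Eqs. (A.10) and (A.11), can be sketched in Python as below. The data layout (dictionaries keyed by edge), the function names, and the toy values are assumptions made for illustration. The sketch also assumes, as is common in ACO but not stated in the text above, that an ant deposits pheromone only on the edges it actually traversed.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int]


def transition_probs(
    i: int,
    unvisited: List[int],
    pheromone: Dict[Edge, float],
    heuristic: Dict[Edge, float],
    beta: float,
    lam: float,
) -> Dict[int, float]:
    """Eq. (A.10): probability of moving from node i to each unvisited node j."""
    weights = {
        j: pheromone[(i, j)] ** beta * heuristic[(i, j)] ** lam for j in unvisited
    }
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}


def update_pheromone(
    pheromone: Dict[Edge, float],
    rho: float,
    ants: List[Tuple[List[Edge], float, int]],
) -> Dict[Edge, float]:
    """Eq. (A.11): evaporate, then add gamma'(G^m) / |G^m| for every ant m.

    Each ant is given as (edges traversed, gamma' of its subset, subset size).
    """
    updated = {edge: (1.0 - rho) * level for edge, level in pheromone.items()}
    for edges, gamma, subset_size in ants:
        for edge in edges:
            updated[edge] += gamma / subset_size
    return updated


# Hypothetical two-edge example starting from node 0.
pheromone = {(0, 1): 1.0, (0, 2): 1.0}
heuristic = {(0, 1): 1.0, (0, 2): 3.0}

probs = transition_probs(0, [1, 2], pheromone, heuristic, beta=1.0, lam=1.0)
new_pheromone = update_pheromone(pheromone, rho=0.5, ants=[([(0, 1)], 0.8, 2)])
```

With equal pheromone on both edges, the choice is driven entirely by the heuristic term. After the update, the traversed edge holds more pheromone than the untouched one, which only evaporates.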

Appendix B: D-ANP

DEMATEL can be summarized briefly in the following procedures.

  • Procedure 1: Creating the average relation matrix Z

The direct influence is evaluated by each expert, and the degree of mutual influence is identified using a pairwise comparison based on the measurement scale of absolutely no influence (0), low influence (1), medium influence (2), high influence (3), and very high influence (4) to form a direct relation matrix. The average relation matrix Z is displayed in Eq. (B.1).

$${\varvec{Z}} = \left[ {\begin{array}{*{20}c} {z_{11} } & \cdots & {z_{1j} } & \cdots & {z_{1n} } \\ \vdots & {} & \vdots & {} & \vdots \\ {z_{i1} } & \cdots & {z_{ij} } & \cdots & {z_{in} } \\ \vdots & {} & \vdots & {} & \vdots \\ {z_{n1} } & \cdots & {z_{nj} } & \cdots & {z_{nn} } \\ \end{array} } \right]$$
(B.1)
  • Procedure 2: Calculating the normalized direct influence matrix H. The normalized direct influence matrix H is calculated using Eqs. (B.2) and (B.3):

    $${\varvec{H}} = \phi \cdot {\varvec{Z}}$$
    (B.2)
    $$\phi = \min \left\{ {\frac{1}{{\max_{1\; \le i \le n} \sum\nolimits_{j = 1}^{n} {z_{ij} } }},\frac{1}{{\max_{1\; \le j \le n} \sum\nolimits_{i = 1}^{n} {z_{ij} } }}} \right\}$$
    (B.3)
  • Procedure 3: Computing the total relationship matrix T. The total relationship matrix, which involves direct and indirect effects, can be obtained by applying Eq. (B.4).

    $${\varvec{T}} = {\varvec{H}} + {\varvec{H}}^{2} + {\varvec{H}}^{3} + \ldots + {\varvec{H}}^{l} = {\varvec{H}}({\varvec{I}} - {\varvec{H}})^{ - 1} \quad {\text{when}}\quad \mathop {\lim }\limits_{l \to \infty } {\varvec{H}}^{l} = \left[ {\mathbf{0}} \right]_{n \times n}$$
    (B.4)

    where I is the identity matrix.

  • Procedure 4: Drawing IINRM. We use the row sums \({\varvec{r}} = (r_{i} )_{n \times 1} = \left[ {\sum\nolimits_{j = 1}^{n} {t_{ij} } } \right]_{n \times 1} = (r_{1} , \ldots ,r_{i} , \ldots ,r_{n} )^{\prime}\) and the column sums \({\varvec{s}} = (s_{j} )_{n \times 1} = \left[ {\sum\nolimits_{i = 1}^{n} {t_{ij} } } \right]^{\prime }_{1 \times n} = (s_{1} , \ldots ,s_{j} , \ldots ,s_{n} )^{\prime}\) of the total relationship matrix T to illustrate the degree of influence among criteria and dimensions; when \(i = j\), IINRM is constructed. The causal graph of the influential network relationship can be obtained by mapping the values of \((r_{i} + s_{i} )\) and \((r_{i} - s_{i} )\), and the results are utilized to improve the favor value of each dimension and criterion.
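The four DEMATEL procedures can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation: it takes the matrix summed in Eq. (B.4) to be the normalized matrix H of Eq. (B.2), uses a small Gauss-Jordan inverse written for the example, and runs on a hypothetical 2 × 2 average relation matrix.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def inverse(a: Matrix) -> Matrix:
    """Gauss-Jordan inverse with partial pivoting (adequate for small matrices)."""
    n = len(a)
    aug = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def dematel(Z: Matrix) -> Tuple[Matrix, List[float], List[float]]:
    """Return the total relationship matrix T with its row sums r and column sums s."""
    n = len(Z)
    row_max = max(sum(row) for row in Z)
    col_max = max(sum(col) for col in zip(*Z))
    phi = min(1.0 / row_max, 1.0 / col_max)      # Eq. (B.3)
    H = [[phi * z for z in row] for row in Z]    # Eq. (B.2)
    i_minus_h = [[float(i == j) - H[i][j] for j in range(n)] for i in range(n)]
    T = matmul(H, inverse(i_minus_h))            # Eq. (B.4), closed form
    r = [sum(row) for row in T]                  # influence given
    s = [sum(col) for col in zip(*T)]            # influence received
    return T, r, s


# Hypothetical average relation matrix on the 0-4 scale.
Z = [[0.0, 2.0], [1.0, 0.0]]
T, r, s = dematel(Z)
prominence = [ri + si for ri, si in zip(r, s)]   # (r_i + s_i)
relation = [ri - si for ri, si in zip(r, s)]     # (r_i - s_i)
```

A positive `relation` value marks a factor that gives more influence than it receives, and a negative value marks the opposite. Plotting `prominence` against `relation` yields the causal graph described in Procedure 4.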

    ANP can be summarized briefly in the following procedures.

  • Procedure 1: Formulating the unweighted super-matrix W. First, the total-influence relation matrix \({\varvec{T}}_{C}\) is obtained from DEMATEL as:

    $${\varvec{T}}_{C} = \begin{array}{cc} {} & \begin{array}{ccccc} D_{1} & \cdots & D_{j} & \cdots & D_{m} \\ c_{11} \cdots c_{1m_{1}} & \cdots & c_{j1} \cdots c_{jm_{j}} & \cdots & c_{m1} \cdots c_{mm_{m}} \end{array} \\ \begin{array}{cc} D_{1} & c_{11} \cdots c_{1m_{1}} \\ \vdots & \vdots \\ D_{i} & c_{i1} \cdots c_{im_{i}} \\ \vdots & \vdots \\ D_{m} & c_{m1} \cdots c_{mm_{m}} \end{array} & \left[ \begin{array}{ccccc} {\varvec{T}}_{C}^{11} & \cdots & {\varvec{T}}_{C}^{1j} & \cdots & {\varvec{T}}_{C}^{1m} \\ \vdots & {} & \vdots & {} & \vdots \\ {\varvec{T}}_{C}^{i1} & \cdots & {\varvec{T}}_{C}^{ij} & \cdots & {\varvec{T}}_{C}^{im} \\ \vdots & {} & \vdots & {} & \vdots \\ {\varvec{T}}_{C}^{m1} & \cdots & {\varvec{T}}_{C}^{mj} & \cdots & {\varvec{T}}_{C}^{mm} \end{array} \right]_{m_{j} \times m_{j} |m < n,\;\sum\nolimits_{j = 1}^{m} m_{j} = n} \end{array}$$
    (B.5)

Normalizing the total-influence relation matrix \({\varvec{T}}_{C}\), we next derive a new matrix \({\varvec{T}}_{C}^{\rho }\):

$${\varvec{T}}_{C}^{\rho } = \begin{array}{cc} {} & \begin{array}{ccccc} D_{1} & \cdots & D_{j} & \cdots & D_{m} \\ c_{11} \cdots c_{1m_{1}} & \cdots & c_{j1} \cdots c_{jm_{j}} & \cdots & c_{m1} \cdots c_{mm_{m}} \end{array} \\ \begin{array}{cc} D_{1} & c_{11} \cdots c_{1m_{1}} \\ \vdots & \vdots \\ D_{i} & c_{i1} \cdots c_{im_{i}} \\ \vdots & \vdots \\ D_{m} & c_{m1} \cdots c_{mm_{m}} \end{array} & \left[ \begin{array}{ccccc} {\varvec{T}}_{C}^{\rho 11} & \cdots & {\varvec{T}}_{C}^{\rho 1j} & \cdots & {\varvec{T}}_{C}^{\rho 1m} \\ \vdots & {} & \vdots & {} & \vdots \\ {\varvec{T}}_{C}^{\rho i1} & \cdots & {\varvec{T}}_{C}^{\rho ij} & \cdots & {\varvec{T}}_{C}^{\rho im} \\ \vdots & {} & \vdots & {} & \vdots \\ {\varvec{T}}_{C}^{\rho m1} & \cdots & {\varvec{T}}_{C}^{\rho mj} & \cdots & {\varvec{T}}_{C}^{\rho mm} \end{array} \right]_{m_{j} \times m_{j} |m < n,\;\sum\nolimits_{j = 1}^{m} m_{j} = n} \end{array}$$
(B.6)

Transposing the normalized total-influence relation matrix \({\varvec{T}}_{C}^{\rho }\), we obtain the unweighted super-matrix \({\varvec{W}} = ({\varvec{T}}_{C}^{\rho } )^{\prime}\) as:

$${\varvec{W}} = \left( {{\varvec{T}}_{C}^{\rho } } \right)^{\prime } = \begin{array}{cc} {} & \begin{array}{ccccc} D_{1} & \cdots & D_{i} & \cdots & D_{m} \\ c_{11} \cdots c_{1m_{1}} & \cdots & c_{i1} \cdots c_{im_{i}} & \cdots & c_{m1} \cdots c_{mm_{m}} \end{array} \\ \begin{array}{cc} D_{1} & c_{11} \cdots c_{1m_{1}} \\ \vdots & \vdots \\ D_{j} & c_{j1} \cdots c_{jm_{j}} \\ \vdots & \vdots \\ D_{m} & c_{m1} \cdots c_{mm_{m}} \end{array} & \left[ \begin{array}{ccccc} {\varvec{W}}^{11} & \cdots & {\varvec{W}}^{i1} & \cdots & {\varvec{W}}^{m1} \\ \vdots & {} & \vdots & {} & \vdots \\ {\varvec{W}}^{1j} & \cdots & {\varvec{W}}^{ij} & \cdots & {\varvec{W}}^{mj} \\ \vdots & {} & \vdots & {} & \vdots \\ {\varvec{W}}^{1m} & \cdots & {\varvec{W}}^{im} & \cdots & {\varvec{W}}^{mm} \end{array} \right]_{m_{j} \times m_{j} |m < n,\;\sum\nolimits_{j = 1}^{m} m_{j} = n} \end{array}$$
(B.7)
  • Procedure 2: Obtaining the weighted super-matrix \({\varvec{W}}^{\rho }\). We derive the total-influence relation matrix for dimensions \({\varvec{T}}_{D}\) using the DEMATEL technique as:

    $${\varvec{T}}_{D} = \left[ {\begin{array}{*{20}c} {t_{D}^{11} } & \cdots & {t_{D}^{1j} } & \cdots & {t_{D}^{1m} } \\ \vdots & {} & \vdots & {} & \vdots \\ {t_{D}^{i1} } & \cdots & {t_{D}^{ij} } & \cdots & {t_{D}^{im} } \\ \vdots & {} & \vdots & {} & \vdots \\ {t_{D}^{m1} } & \cdots & {t_{D}^{mj} } & \cdots & {t_{D}^{mm} } \\ \end{array} } \right]_{m \times m}$$
    (B.8)

The normalized total-influence matrix for dimensions (clusters) \({\varvec{T}}_{D}^{\rho }\) is formed by dividing each element by the sum of its row, \(d_{i} = \sum\nolimits_{j = 1}^{m} {t_{D}^{ij} }\):

$${\varvec{T}}_{D}^{\rho } = \left[ {\begin{array}{*{20}c} {t_{D}^{11} /d_{1} } & \cdots & {t_{D}^{1j} /d_{1} } & \cdots & {t_{D}^{1m} /d_{1} } \\ \vdots & {} & \vdots & {} & \vdots \\ {t_{D}^{i1} /d_{i} } & \cdots & {t_{D}^{ij} /d_{i} } & \cdots & {t_{D}^{im} /d_{i} } \\ \vdots & {} & \vdots & {} & \vdots \\ {t_{D}^{m1} /d_{m} } & \cdots & {t_{D}^{mj} /d_{m} } & \cdots & {t_{D}^{mm} /d_{m} } \\ \end{array} } \right]_{m \times m} = \left[ {\begin{array}{*{20}c} {t_{11}^{\rho D} } & \cdots & {t_{1j}^{\rho D} } & \cdots & {t_{1m}^{\rho D} } \\ \vdots & {} & \vdots & {} & \vdots \\ {t_{i1}^{\rho D} } & \cdots & {t_{ij}^{\rho D} } & \cdots & {t_{im}^{\rho D} } \\ \vdots & {} & \vdots & {} & \vdots \\ {t_{m1}^{\rho D} } & \cdots & {t_{mj}^{\rho D} } & \cdots & {t_{mm}^{\rho D} } \\ \end{array} } \right]_{m \times m}$$
(B.9)
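
The row normalization in Eq. (B.9) can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `row_normalize` and the sample values are assumptions, not taken from the study.

```python
def row_normalize(matrix):
    """Divide every element by the sum of its row, as in Eq. (B.9)."""
    normalized = []
    for row in matrix:
        d_i = sum(row)  # row sum d_i of the dimension-level matrix T_D
        normalized.append([value / d_i for value in row])
    return normalized


# Hypothetical 2 x 2 total-influence matrix for two dimensions.
t_d = [[0.2, 0.3],
       [0.1, 0.4]]
t_d_rho = row_normalize(t_d)
```

After normalization every row of the result sums to one, which is the property the weighted super-matrix relies on.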

Finally, we obtain the weighted super-matrix \({\varvec{W}}^{\rho }\) by scaling each block of the unweighted super-matrix \({\varvec{W}}\) with the corresponding element of \({\varvec{T}}_{D}^{\rho }\):

$${\varvec{W}}^{\rho } = {\varvec{T}}_{D}^{\rho } \times {\varvec{W}} = \begin{array}{c|ccccc} {} & {D_{1} : c_{11} \cdots c_{{1m_{1} }} } & \cdots & {D_{i} : c_{i1} \cdots c_{{im_{i} }} } & \cdots & {D_{m} : c_{m1} \cdots c_{{mm_{m} }} } \\ \hline {D_{1} : c_{11} \cdots c_{{1m_{1} }} } & {t_{11}^{\rho D} \times {\varvec{W}}^{11} } & \cdots & {t_{i1}^{\rho D} \times {\varvec{W}}^{i1} } & \cdots & {t_{m1}^{\rho D} \times {\varvec{W}}^{m1} } \\ \vdots & \vdots & {} & \vdots & {} & \vdots \\ {D_{j} : c_{j1} \cdots c_{{jm_{j} }} } & {t_{1j}^{\rho D} \times {\varvec{W}}^{1j} } & \cdots & {t_{ij}^{\rho D} \times {\varvec{W}}^{ij} } & \cdots & {t_{mj}^{\rho D} \times {\varvec{W}}^{mj} } \\ \vdots & \vdots & {} & \vdots & {} & \vdots \\ {D_{m} : c_{m1} \cdots c_{{mm_{m} }} } & {t_{1m}^{\rho D} \times {\varvec{W}}^{1m} } & \cdots & {t_{im}^{\rho D} \times {\varvec{W}}^{im} } & \cdots & {t_{mm}^{\rho D} \times {\varvec{W}}^{mm} } \\ \end{array}$$
(B.10)

where \(t_{ij}^{\rho D} = t_{D}^{ij} /d_{i}\).
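
As a rough illustration of Eq. (B.10), the sketch below scales block (i, j) of a normalized criteria matrix by \(t_{ij}^{\rho D}\) and transposes the result. The function name `weighted_supermatrix`, the `sizes` argument (number of criteria per dimension), and all numbers are hypothetical choices made for this example.

```python
def weighted_supermatrix(t_c_rho, t_d_rho, sizes):
    """Build W^rho: block (j, i) equals t_d_rho[i][j] times the transpose
    of block (i, j) of the normalized criteria matrix t_c_rho."""
    n = sum(sizes)
    # Map every criterion index to the index of its dimension.
    dim_of = [d for d, size in enumerate(sizes) for _ in range(size)]
    w_rho = [[0.0] * n for _ in range(n)]
    for r in range(n):          # row of t_c_rho (criterion in dimension i)
        for c in range(n):      # column of t_c_rho (criterion in dimension j)
            scale = t_d_rho[dim_of[r]][dim_of[c]]
            w_rho[c][r] = scale * t_c_rho[r][c]  # transposed position
    return w_rho


# Hypothetical data: dimension 1 has two criteria, dimension 2 has one.
sizes = [2, 1]
t_c_rho = [[0.25, 0.75, 1.0],   # each row sums to 1 within every block
           [0.50, 0.50, 1.0],
           [0.40, 0.60, 1.0]]
t_d_rho = [[0.6, 0.4],
           [0.3, 0.7]]
w_rho = weighted_supermatrix(t_c_rho, t_d_rho, sizes)
```

Because each block row of the criteria matrix and each row of the dimension matrix sum to one, every column of the resulting matrix also sums to one.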

  • Procedure 3: Providing the global weights. Following the Markov chain approach, we raise the weighted super-matrix to successive powers until it reaches a long-term stable super-matrix \(\mathop {\lim }\limits_{\varphi \to \infty } ({\varvec{W}}^{\rho } )^{\varphi }\), which gives the D-ANP influence weights \((w_{1} , \ldots ,w_{j} , \ldots ,w_{n} )\).
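
The limiting step of Procedure 3 can be approximated numerically by multiplying the weighted super-matrix by itself until successive powers stop changing. The sketch below is one simple way to do this, assuming a column-stochastic matrix whose powers converge; the helper names, tolerance, and sample matrix are illustrative assumptions.

```python
def mat_mul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def limit_supermatrix(w_rho, tol=1e-12, max_iter=10_000):
    """Approximate lim (W^rho)^phi by repeated multiplication."""
    current = w_rho
    for _ in range(max_iter):
        nxt = mat_mul(current, w_rho)
        change = max(abs(x - y)
                     for row_n, row_c in zip(nxt, current)
                     for x, y in zip(row_n, row_c))
        if change < tol:
            return nxt
        current = nxt
    raise RuntimeError("super-matrix powers did not converge")


def danp_weights(w_rho):
    """Read the weights from the first column of the limit matrix."""
    return [row[0] for row in limit_supermatrix(w_rho)]


# Hypothetical 2 x 2 weighted super-matrix (columns sum to 1).
weights = danp_weights([[0.9, 0.5],
                        [0.1, 0.5]])
```

In the limit all columns coincide, so any column can serve as the weight vector; the first is used here purely for convenience.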

Appendix C: Modified-VIKOR approach

The modified-VIKOR approach can be summarized in the following procedures.

  • Procedure 1: Normalizing the performance gaps. Let \(f_{j}^{aspiration} = 10\) and \(f_{j}^{worst} = 0\) be the positive ideal solution (aspiration level) and the negative ideal solution (worst level) for criteria j = 1,2,…,n, with \({\varvec{f}}^{aspiration} = (f_{1}^{aspiration} ,...,f_{j}^{aspiration} ,...,f_{n}^{aspiration} )\) and \({\varvec{f}}^{worst} = (f_{1}^{worst} ,...,f_{j}^{worst} ,...,f_{n}^{worst} )\). We use the “aspiration-worst” concept to determine the normalized rating matrix of the internal control practices of alternative g for criterion j. Accordingly, we obtain the performance rating ratios \(\left[ {r_{gj} } \right]_{G \times n}\) by normalizing the performance matrix \(\left[ {f_{gj} } \right]_{G \times n}\), as shown in Eq. (C.1).

    $$[r_{gj} ]_{G \times n} = \left[ {\frac{{\left| {f_{j}^{aspiration} - f_{gj} } \right|}}{{\left| {f_{j}^{aspiration} - f_{j}^{worst} } \right|}}} \right]_{G \times n}$$
    (C.1)
  • Procedure 2: Determining the maximum group utility \(S_{g}\) and the minimum individual regret gap \(Q_{g}\). We calculate these gap values [70] using the following two equations:

    $$L_{g}^{\varphi = 1} = S_{g} = \sum\limits_{j = 1}^{n} {w_{j} r_{gj} } = \sum\limits_{j = 1}^{n} {w_{j} \frac{{\left| {f_{j}^{aspiration} - f_{gj} } \right|}}{{\left| {f_{j}^{aspiration} - f_{j}^{worst} } \right|}}} ,\quad g = 1,2, \ldots ,G$$
    (C.2)
    $$L_{g}^{\varphi = \infty } = Q_{g} = \max_{j} \left\{ {r_{gj} \mid j = 1,2, \ldots ,n} \right\}$$
    (C.3)

    where \(w_{j}\) represents the weight of the jth criterion (the D-ANP influential weight), and \(f_{gj}\) indicates the actual performance value of alternative g on criterion j. The reduction needed in the performance gap ratio (\(r_{gj}\)) of each alternative can thus be determined.
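
Equations (C.1) to (C.3) amount to a short calculation, sketched below in Python. The function names, the two-alternative performance matrix, and the weights are hypothetical; only the aspiration level of 10 and the worst level of 0 come from the text.

```python
def gap_ratios(performance, aspiration=10.0, worst=0.0):
    """Eq. (C.1): gap of each score from the aspiration level, scaled to [0, 1]."""
    span = abs(aspiration - worst)
    return [[abs(aspiration - f_gj) / span for f_gj in row] for row in performance]


def group_utility(ratios, weights):
    """Eq. (C.2): weighted sum of gap ratios, S_g, for each alternative."""
    return [sum(w_j * r_gj for w_j, r_gj in zip(weights, row)) for row in ratios]


def max_regret(ratios):
    """Eq. (C.3): largest single gap ratio, Q_g, for each alternative."""
    return [max(row) for row in ratios]


# Hypothetical scores of two alternatives on two criteria (scale 0 to 10).
performance = [[8, 6],
               [5, 9]]
weights = [0.6, 0.4]  # assumed D-ANP weights for illustration

ratios = gap_ratios(performance)
s_values = group_utility(ratios, weights)
q_values = max_regret(ratios)
```

Here the first alternative has the smaller weighted gap and the smaller maximum gap, so it is closer to the aspiration level on both measures.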

  • Procedure 3: Computing the comprehensive index values \(R_{g}\). We calculate the integrated values from the following equation:

    $$R_{g} = v(S_{g} - S^{aspiration} )/(S^{worst} - S^{aspiration} ) + (1 - v)(Q_{g} - Q^{aspiration} )/(Q^{worst} - Q^{aspiration} )$$
    (C.4)

where \(S^{aspiration} = \max_{g} S_{g}\), \(S^{worst} = \min_{g} S_{g}\), \(Q^{aspiration} = \max_{g} Q_{g}\), \(Q^{worst} = \min_{g} Q_{g}\), and \(0 \le v \le 1\).
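
The arithmetic of Eq. (C.4) is illustrated below. To keep the sketch neutral, the four reference values are passed in as arguments rather than derived inside the function; the function name and every number in the usage lines (including reference values of 0 and 1) are assumptions for illustration only.

```python
def comprehensive_index(s_values, q_values, s_asp, s_worst, q_asp, q_worst, v=0.5):
    """Eq. (C.4): blend the normalized S_g and Q_g with weight v in [0, 1]."""
    if not 0.0 <= v <= 1.0:
        raise ValueError("v must lie between 0 and 1")
    return [
        v * (s_g - s_asp) / (s_worst - s_asp)
        + (1.0 - v) * (q_g - q_asp) / (q_worst - q_asp)
        for s_g, q_g in zip(s_values, q_values)
    ]


# Hypothetical S and Q values for two alternatives.
r_values = comprehensive_index(
    s_values=[0.28, 0.34],
    q_values=[0.4, 0.5],
    s_asp=0.0, s_worst=1.0,   # assumed reference values
    q_asp=0.0, q_worst=1.0,
    v=0.5,
)
```

Setting `v` closer to 1 emphasizes the group utility term, while values closer to 0 emphasize the individual regret term.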

Appendix D: K-means approach

Cluster analysis groups a set of instances by some characteristics in a multi-dimensional space so that the instances in the same group are more similar to each other than to those in other groups [78, 79]. Because it is easy to compute and intuitive, the K-means clustering approach is one of the best-known clustering methods. Its fundamental concept is to identify K centroids for K clusters, and these centroids should be as far as possible from each other. The basic K-means procedure runs as follows.

For a dataset \(G = \left\{ {x_{1} , \ldots ,x_{N} } \right\},x_{i} \in R^{g}\), K-means partitions the data into \(k\) disjoint groups \(\left( {B_{1}^{*} , \ldots ,B_{k}^{*} } \right)\). The quality of a clustering outcome is then evaluated by a normalized intra-cluster variance, represented by:

$$\frac{1}{N}\sum\limits_{j = 1}^{k} {\sum\limits_{{x_{i} \in B_{j}^{ * } }} {\left\| {x_{i} - B_{j} } \right\|_{2} } }$$
(D.1)

where \(B_{j}\) denotes the centroid of cluster \(B_{j}^{*}\); the smaller the variance value, the better the clustering quality. The algorithm starts by randomly selecting \(k\) points as initial centroids. The centroids are then adjusted iteratively until they no longer change. In each iteration, the approach traverses all instances, assigns each instance to the nearest cluster, and then updates the centroid of each cluster as:

$$B_{j}^{t} = \frac{{\sum\nolimits_{{x_{i} \in B_{j}^{*} }} {x_{i}^{t} } }}{{\left| {B_{j}^{*} } \right|}},\forall t \in \left\{ {1, \ldots ,g} \right\}$$
(D.2)

where \(B_{j}^{t}\) denotes the \(t\)th dimension of the \(j\)th centroid, and \(x_{i}^{t}\) denotes the \(t\)th dimension of \(x_{i}\). For a more detailed illustration, one can refer to Ni et al. [80] and Saha and Mukherjee [81].
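
The iteration described above (assign each instance to its nearest centroid, then recompute each centroid by Eq. (D.2)) can be sketched in plain Python as follows. This is a minimal illustration, not the implementation used in the study: the function names, the optional `initial_centroids` argument, the fixed random seed, and the sample points are all assumptions.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, initial_centroids=None, seed=0, max_iter=100):
    """Return (centroids, clusters) after iterating until centroids stop changing."""
    rng = random.Random(seed)
    # Random initial centroids unless the caller supplies them.
    centroids = list(initial_centroids) if initial_centroids else rng.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid's cluster.
        clusters = [[] for _ in range(k)]
        for point in points:
            nearest = min(range(k), key=lambda j: squared_distance(point, centroids[j]))
            clusters[nearest].append(point)
        # Update step, Eq. (D.2): per-dimension mean of the cluster members.
        new_centroids = []
        for j, members in enumerate(clusters):
            if members:
                dims = len(members[0])
                new_centroids.append(
                    tuple(sum(m[t] for m in members) / len(members) for t in range(dims))
                )
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters


# Hypothetical two-dimensional instances forming two visible groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, clusters = k_means(points, k=2, initial_centroids=[(0, 0), (10, 10)])
```

Passing `initial_centroids` makes a run reproducible; leaving it out falls back to the random selection described in the text.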

Cite this article

Chen, FH., Hsu, MF. & Hu, KH. Enterprise’s internal control for knowledge discovery in a big data environment by an integrated hybrid model. Inf Technol Manag 23, 213–231 (2022). https://doi.org/10.1007/s10799-021-00342-8
