Abstract
Determining the spreading ability of nodes is a fundamental issue in network science, with numerous applications in controlling system failure, rumor spreading, and product advertising. Many methods have been proposed to identify influential nodes, which, despite their advantages, suffer from high time complexity, low accuracy, or low resolution. This paper presents a feature based on K-Shell and degree, applied to the node and its neighbors, that adjusts the contribution of the individual features. The number of selected neighbors and the influence of each neighbor are chosen according to the structural features of the graph. The actual spreading ability of each node is measured with the Susceptible-Infected-Recovered (SIR) model, and the evaluations include accuracy, precision, resolution, correlation, the Kolmogorov–Smirnov test, and time complexity. An assessment on 14 real-world and 20 artificial networks against 12 recent methods, such as HGSM (Hybrid Global Structure Model), indicates that the proposed method performs best in various aspects.















Data Availability
No datasets were generated or analysed during the current study.
References
Liu JB, Zheng YQ, Lee CC (2024) Statistical analysis of the regional air quality index of yangtze river delta based on complex network theory. Appl Energy 357:122529. https://doi.org/10.1016/j.apenergy.2023.122529
Vitevitch MS, Pisoni DB, Soehlke L, Foster TA (2024) Using complex networks in the hearing sciences. Ear and Hearing 45(1):1–9
Ji P, Ye J, Mu Y, Lin W, Tian Y, Hens C, Perc M, Tang Y, Sun J, Kurths J (2023) Signal propagation in complex networks. Phy reports 1017:1–96
Li Y, Chen Y, Fan Y, Chen Y, Chen Y (2023) Dynamic network modeling of gut microbiota during Alzheimer's disease progression in mice. Gut Microbes 15(1):2172672
Silva DH, Anteneodo C, Ferreira SC (2023) Epidemic outbreaks with adaptive prevention on complex networks. Communications in Nonlinear Sci and Numerical Simulation 116:106877. https://doi.org/10.1016/j.cnsns.2022.106877
Bianchi PA, Causholli M, Minutti-Meza M, Sulcaj V (2023) Social networks analysis in accounting and finance. Contemporary accounting research 40(1):577–623
Logan AP, LaCasse PM, Lunday BJ (2023) Social network analysis of twitter interactions: a directed multilayer network approach. Social Network Analysis and Mining 13(1):65
Aïmeur E, Amri S, Brassard G (2023) Fake news, disinformation and misinformation in social media: a review. Social Network Analysis and Mining 13(1):30
Zhu E, Wang H, Zhang Y, Zhang K, Liu C (2024) Phee: Identifying influential nodes in social networks with a phased evaluation-enhanced search. Neurocomputing 572:127195
Zhu P, Cheng L, Gao C, Wang Z, Li X (2022) Locating multi-sources in social networks with a low infection rate. IEEE Trans Network Sci Eng 9(3):1853–1865
Zhang Y, Ren W, Feng J, Zhao J, Chen Y, Mi Y (2024) A cascading failure propagation model for a network with a node emergency recovery function. Appl Energy 371:123655
Ruj S, Pal A (2024) in A Practical Guide on Security and Privacy in Cyber-Physical Systems: Foundations, Applications and Limitations (World Scientific), pp. 173–211
Aggarwal K, Arora A (2023) Influence maximization in social networks using discrete bat-modified (dbatm) optimization algorithm: a computationally intelligent viral marketing approach. Social Network Analysis and Mining 13(1):146
Ojha RP, Srivastava PK, Awasthi S, Srivastava V, Pandey PS, Dwivedi RS, Singh R, Galletta A (2023) Controlling of fake information dissemination in online social networks: an epidemiological approach. IEEE Access 11:32229–32240
Chen T, Ma J, Zhu Z, Guo X (2023) Evaluation method for node importance of urban rail network considering traffic characteristics. Sustain 15(4):3582
Tong Y, Zhen R, Dong H, Liu J (2023) Identifying influential ships in multi-ship encounter situation complex network based on improved wvoterank approach. Ocean Eng 284:115192
Cooper I, Mondal A, Antonopoulos CG (2020) A sir model assumption for the spread of covid-19 in different communities. Chaos, Solitons & Fractals 139:110057
Chen D, Lü L, Shang MS, Zhang YC, Zhou T (2012) Identifying influential nodes in complex networks. Phy a: Statistical mechanics and its applications 391(4):1777–1787
Miorandi D, De Pellegrini F (2010) K-shell decomposition for dynamic complex networks, in 8th international symposium on modeling and optimization in mobile, Ad Hoc, and wireless networks (IEEE), pp. 488–496
Barthelemy M (2004) Betweenness centrality in large complex networks. The European phy j B 38(2):163–168
Cohen E, Delling D, Pajor T, Werneck RF (2014) Computing classic closeness centrality, at scale, in Proceedings of the second ACM conference on Online social networks, pp. 37–50
Chen D, Su H (2023) Identification of influential nodes in complex networks with degree and average neighbor degree. IEEE J on Emerging and Selected Topics in Circuits and Systems 13(3):734–742
Zhao J, Wang Y, Deng Y (2020) Identifying influential nodes in complex networks from global perspective. Chaos, Solitons & Fractals 133:109637
Ullah A, Wang B, Sheng J, Long J, Khan N, Sun Z (2021) Identification of nodes influence based on global structure model in complex networks. Scientific Reports 11(1):6173
Mukhtar MF, Abal Abas Z, Baharuddin AS, Norizan MN, Fakhruddin WFWW, Minato W, Rasib AHA, Abidin ZZ, Rahman AFNA, Anuar SHH (2023) Integrating local and global information to identify influential nodes in complex networks. Scientific Reports 13(1):11411
Ullah A, Sheng J, Wang B, Din SU, Khan N (2024) Leveraging neighborhood and path information for influential spreaders recognition in complex networks. J Intelligent Info Syst 62(2):377–401
Telesford QK, Joyce KE, Hayasaka S, Burdette JH, Laurienti PJ (2011) The ubiquity of small-world networks
Qiu L, Zhang J, Tian X (2021) Ranking influential nodes in complex networks based on local and global structures. Applied intelligence 51:4394–4407
Esfandiari S, Fakhrahmad M (2024) Predicting Node Influence in Complex Networks by the K-Shell Entropy and Degree Centrality. Companion Proceedings of the ACM on Web Conference 2024:629–632
Esfandiari S, Moosavi MR (2024) Identifying influential nodes in complex networks through the k-shell index and neighborhood information. J Comput Sci 15:102473
Esfandiari S, Fakhrahmad SM (2024) The collaborative role of K-Shell and PageRank for identifying influential nodes in complex networks. Physica A: Statistical Mechanics and its Applications 658:130256
Ma Ll, Ma C, Zhang HF, Wang BH (2016) Identifying influential spreaders in complex networks based on gravity formula. Phy A: Statistical Mechanics and its Applications 451:205–212
Shang Q, Deng Y, Cheong KH (2021) Identifying influential nodes in complex networks: Effective distance gravity model. Info Sci 577:162–179
Li H, Shang Q, Deng Y (2021) A generalized gravity model for influential spreaders identification in complex networks. Chaos, Solitons & Fractals 143:110456
Li Z, Ren T, Ma X, Liu S, Zhang Y, Zhou T (2019) Identifying influential spreaders by gravity model. Scientific reports 9(1):8387
Ganguly M, Dey P, Roy S (2024) Influence maximization in community-structured social networks: a centrality-based approach. The Journal of Supercomputing pp. 1–44
Khomami MMD, Rezvanian A, Meybodi MR, Bagheri A (2021) Cfin: A community-based algorithm for finding influential nodes in complex social networks. The J Supercomputing 77(3):2207–2236
HamaKarim BR, Mohammadiani RP, Sheikhahmadi A, Hamakarim BR, Bahrami M (2023) A method based on k-shell decomposition to identify influential nodes in complex networks. The J Supercomputing 79(14):15597–15622
Li Q, Cheng L, Wang W, Li X, Li S, Zhu P (2023) Influence maximization through exploring structural information. Applied Mathematics and Computation 442:127721
Zheng H, Zhao H, Ahmadi G (2024) Towards improving community detection in complex networks using influential nodes. J Complex Networks 12(1):cnae001
Sheykhzadeh J, Zarei B, Gharehchopogh FS (2024) Community detection in social networks using a local approach based on node ranking. IEEE Access
Ishfaq U, Khan HU, Iqbal S (2022) Identifying the influential nodes in complex social networks using centrality-based approach. J King Saud University-Computer and Info Sci 34(10):9376–9392
Sheikhahmadi A, Veisi F, Sheikhahmadi A, Mohammadimajd S (2022) A multi-attribute method for ranking influential nodes in complex networks. Plos one 17(11):e0278129
Zhang Y, Lu Y, Yang G, Hang Z (2022) Multi-attribute decision making method for node importance metric in complex network. Appl Sci 12(4):1944
Katz L (1953) A new status index derived from sociometric analysis. Psychometrika 18(1):39–43
Bonacich P (1972) Factoring and weighting approaches to status scores and clique identification. J mathematical sociology 2(1):113–120
Bae J, Kim S (2014) Identifying and ranking influential spreaders in complex networks by neighborhood coreness. Phy A: Statistical Mechanics and its Applications 395:549–559
Zareie A, Sheikhahmadi A, Fatemi A (2017) Influential nodes ranking in complex networks: An entropy-based approach. Chaos, Solitons & Fractals 104:485–494
Hajarathaiah K, Enduri MK, Anamalamudi S, Abdul A, Chen J (2024) Node significance analysis in complex networks using machine learning and centrality measures. IEEE Access
Rezaei AA, Munoz J, Jalili M, Khayyam H (2023) A machine learning-based approach for vital node identification in complex networks. Expert Systems with Applications 214:119086
Zhao G, Jia P, Zhou A, Zhang B (2020) Infgcn: Identifying influential nodes in complex networks with graph convolutional networks. Neurocomputing 414:18–26
Rossi R, Ahmed N (2015) The network data repository with interactive graph analytics and visualization, in Proceedings of the AAAI conference on artificial intelligence, vol. 29
Kunegis J (2013) Konect: the koblenz network collection, in Proceedings of the 22nd international conference on world wide web, pp. 1343–1350
Knight WR (1966) A computer method for calculating kendall’s tau with ungrouped data. J the American Statistical Association 61(314):436–439
Costa GS, Ferreira SC (2020) Nonmassive immunization to contain spreading on complex networks. Phy Review E 101(2):022311
Wang J, Hou X, Li K, Ding Y (2017) A novel weight neighborhood centrality algorithm for identifying influential spreaders in complex networks. Phy A: Statistical Mechanics and its Applications 475:88–105
Zhao N, Yang S, Wang H, Zhou X, Luo T, Wang J (2024) A novel method to identify key nodes in complex networks based on degree and neighborhood information. Appl Sci 14(2):521
Massey FJ Jr (1951) The kolmogorov-smirnov test for goodness of fit. J the American statistical Association 46(253):68–78
Author information
Contributions
Shima Esfandiari: Methodology, Software, Validation, Writing - Original Draft, Writing - Review & Editing. Seyed Mostafa Fakhrahmad: Supervision, Writing - Review & Editing.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
Appendix
1.1 Preliminaries
The methods presented here will be implemented in different networks and compared with the proposed method from various aspects.
-
DC [18]: It is a local and simple feature, counting the number of neighbors for each node without considering the importance of each neighbor.
-
KS [19]: It is a global metric with linear time complexity. It is computed by a hierarchical pruning algorithm that operates on node degree. First, all nodes with degree 1 are removed and assigned a K-Shell index of 1; because each removal can lower the degree of the remaining nodes, this step repeats until no node with degree 1 is left. The same process is then applied to nodes with degree 2, which receive a K-Shell index of 2, and it continues iteratively with increasing degree until every node has been assigned a KS value.
-
GP [23]: It is a global metric based on degree that incorporates two features, SI and GI. SI measures the spreading capability of a node as e raised to a power proportional to its degree. GI sums the degrees of all other nodes, each weighted by the inverse of its distance to the node. The spreading capability of each node is then obtained by combining SI and GI, where the parameter \(\vartheta\) scales the exponent of SI and \(\sigma\) weights GI. It is defined by Equations 7 and 8. The symbols \(k_{i}\), \(n\), and \(d_{ij}\) represent the degree of node i, the number of nodes, and the shortest distance between two nodes i and j, respectively.
$$\begin{aligned} {SI}_{i} = e^{\frac{1}{n} \cdot k_{i} \cdot \vartheta }, \qquad {GI}_{i} = \sum _{i \ne j} \frac{k_{j} \cdot \sigma }{d_{ij}} \end{aligned} \tag{7}$$
$$\begin{aligned} {GP}_{i} = \begin{cases} {SI}_{i} \times {GI}_{i}, & \text{connected node} \\ 0, & \text{unconnected node} \end{cases} \end{aligned} \tag{8}$$
-
GSM [24]: It is a global metric based on KS, similar to GP, and consists of two features SI (Self-Influence) and GI (Global-Influence). The Self-Influence considers the spreading capability of a node by ‘e’ raised to the power of K-Shell divided by the size of the network. GI is a global metric that considers all nodes and computes the influence of each node based on the K-Shell and the inverse of the distance to the target node. Ultimately, the GSM method’s spreading capability is equal to the product of SI and GI. It is defined by equation 9 where \({ks}_{i}\), \(n\), \(d_{ij}\) represent the K-Shell of node i, the number of nodes, and the shortest distance between two nodes i and j, respectively.
$$\begin{aligned} {GSM}_{i} = {SI}_{i} \times {GI}_{i} = e^{\frac{{ks}_{i}}{n}} \times \sum _{i \ne j} \frac{{ks}_{j}}{d_{ij}} \end{aligned} \tag{9}$$
-
HGSM [25]: It is a hybrid method that integrates local and global features. It uses degree as the local metric and K-Shell as the global metric. Like GP and GSM, it defines two features, Self-Influence (SI) and Global Influence (GI). In SI, the spreading capability of each node is calculated as e raised to the power of the product of its degree and K-Shell, divided by the number of nodes. GI calculates the spreading capability of each node from the distance and a metric derived from the degree and K-Shell. It is defined by Equation 10, where \({ks}_{i}\), \(n\), and \(d_{ij}\) represent the K-Shell of node i, the number of nodes, and the shortest distance between nodes i and j, respectively. The symbols \(k_{i}\), \(ks\), and \(k\) represent the degree of node i, the K-Shell value in the network, and the average degree in the network, respectively.
$$\begin{aligned} {HGSM}_{i} = {SI}_{i} \times {GI}_{i} = e^{\frac{{ks}_{i} \cdot k_{i}}{n}} \times \sum _{i \ne j} \frac{e^{\frac{{ks}_{j} \cdot k_{j}}{n}}}{{d_{ij}}^{\lceil \log _{2} \langle e^{\frac{ks \cdot k}{n}} \rangle \rceil }} \end{aligned} \tag{10}$$
-
Degree and Neighborhood information Centrality (DNC) [57]: It takes the sum of the local clustering coefficients of the first-level neighbors and combines it, in a weighted manner, with the node degree. It is defined by Equations 11 and 12. The symbols \(C_{j}\), \(k_{i}\), and \(N_{1}(i)\) denote the local clustering coefficient of node j, the degree of node i, and the first-level neighbors of node i, respectively.
$$\begin{aligned} {slc}_{i} = \sum _{j \in N_{1}(i)} C_{j} \end{aligned} \tag{11}$$
$$\begin{aligned} DNC(i) = k_{i} + \alpha \sum _{j \in N_{1}(i)} C_{j}, \quad \alpha > 0 \end{aligned} \tag{12}$$
-
NPIC [26]: It is a combined metric that integrates global and local features. It uses a combination of node degree and KS. The calculated value for each node is divided by the total number of nodes in the network, and for calculating the influence received from other nodes their distance is important. Additionally, two parameters, \(\alpha\), and \(\beta\), are used to adjust the impact of nodes and their neighbors. It is defined by equation 13 where \({ks}_{i}\), \(n\), \(d_{ij}\) represent the K-Shell of node i, the number of nodes, and the shortest distance between nodes i and j, respectively. The \(k_{i}\) denotes the degree of node i.
$$\begin{aligned} NPIC(i) = \frac{({ks}_{i} \cdot k_{i}) + \alpha }{n} \cdot \sum _{i \ne j} \frac{({ks}_{j} \cdot k_{j}) + \beta }{d_{ij}} \end{aligned} \tag{13}$$
-
LGI [28]: In the LGI calculation, the number of common neighbors between each node and each of its neighbors is counted and divided by the node's degree to create the LC feature. Then, this metric and the degree are each normalized by their sum over the node's neighbors and combined in a weighted manner to derive the local feature LI. The global feature, GI, is constructed from the KS of the node and the sum of its neighbors' KS values. Finally, GI is combined with LI to compute LGI. Its definition is presented in Equations 14–18, where \(N_{1}(j)\), \(k_{i}\), and \({ks}_{j}\) represent the first-level neighbors of node j, the degree of node i, and the K-Shell of node j, respectively. The \(\alpha\) and \(\beta\) are tuning parameters.
$$\begin{aligned} R(i,j) = \frac{\left| N_{1}(j) \cap N_{1}(i) \right| }{k_{i}} \end{aligned} \tag{14}$$
$$\begin{aligned} LC(i) = \sum _{j \in N_{1}(i)} R(i,j) \end{aligned} \tag{15}$$
$$\begin{aligned} LI(i) = \alpha \frac{k_{i}}{\sum _{j \in N_{1}(i)} k_{j}} + \beta \frac{LC(i)}{\sum _{j \in N_{1}(i)} LC(j)} \end{aligned} \tag{16}$$
$$\begin{aligned} GI(i) = {ks}_{i} + \sum _{j \in N_{1}(i)} {ks}_{j} \end{aligned} \tag{17}$$
$$\begin{aligned} LGI(i) = \sum _{j \in N_{1}(i)} LI(j) \cdot GI(j) \end{aligned} \tag{18}$$
-
GGC [34]: Similar to the gravity formula, it uses degree centrality together with a weight built from the local clustering coefficient. It is defined by Equation 19, where \(k_{i}\), \(C_{j}\), and \(d_{ij}\) denote the degree of node i, the local clustering coefficient of node j, and the distance between two nodes i and j, respectively. The \(R\) and \(\alpha\) represent half of the average shortest path length and a tuning parameter, respectively.
$$\begin{aligned} GGC(i) = \sum _{d_{ij} \le R,\ j \ne i} \frac{e^{- \alpha \left( C_{i} + C_{j} \right) } k_{i} k_{j}}{d_{ij}^{2}} \end{aligned} \tag{19}$$
-
DC+, DCGM+, DCGM++: Three metrics are introduced in [22]. First, the DC+ feature is defined as the sum of the degrees of the first-level neighbors divided by the node's degree. It is then used as a weight in the gravity formula, forming DCGM+. Finally, DC+ is combined with the gravity formula to introduce DCGM++, which integrates both metrics. They are defined by Equations 20–23, where \(k_{j}\), \(d_{ij}\), and \(N_{1}(i)\) represent the degree of node j, the distance between two nodes i and j, and the first-level neighbors of node i, respectively.
$$\begin{aligned} {DC}^{+}(i) = \frac{\sum _{j \in N_{1}(i)} k_{j}}{k_{i}} \end{aligned} \tag{20}$$
$$\begin{aligned} {DCGM}^{+}(i) = \sum _{d_{ij} \le R,\ j \ne i} \frac{{DC}^{+}(i){DC}^{+}(j)}{{d_{ij}}^{2}} \end{aligned} \tag{21}$$
$$\begin{aligned} {DCGM}^{++}(i) = {DC}^{+}(i)\,DCGM(i) \end{aligned} \tag{22}$$
$$\begin{aligned} DCGM(i) = \sum _{d_{ij} \le R,\ j \ne i} \frac{{DC}^{*}(i){DC}^{*}(j)}{{d_{ij}}^{2}}, \qquad {DC}^{*}(i) = \frac{k_{i}}{n-1} \end{aligned} \tag{23}$$
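The pruning procedure described for KS above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the paper: the function name `k_shell` and the adjacency-dictionary input format are assumptions made for this example, isolated nodes are given index 0 by convention, and the simple rescan per level is not the linear-time bucket variant.

```python
def k_shell(adj):
    """Return {node: K-Shell index} for an undirected graph.

    `adj` maps each node to the set of its neighbors (hypothetical input
    format chosen for this sketch). Isolated nodes receive index 0.
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    shell = {}
    k = 0
    while remaining:
        # Nodes whose current degree is at most k belong to shell k.
        peel = [v for v in remaining if degree[v] <= k]
        if not peel:
            k += 1  # nothing left to prune at this level; move to the next
            continue
        while peel:
            v = peel.pop()
            if v not in remaining:
                continue  # already pruned via a duplicate stack entry
            shell[v] = k
            remaining.remove(v)
            # Removing v lowers the degree of its remaining neighbors,
            # which may pull them into the same shell.
            for u in adj[v]:
                if u in remaining:
                    degree[u] -= 1
                    if degree[u] <= k:
                        peel.append(u)
    return shell


# Small example: a triangle (a, b, c) with a pendant node d attached to a.
example = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
shells = k_shell(example)
```

In the example, the pendant node is pruned at level 1, after which the triangle survives until level 2.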
Despite the benefits of the methods above, they also have limitations. Some have high time complexity; others have poor accuracy or low resolution. For instance, the computational complexity of GP, GSM, and HGSM is quadratic. GGC and DNC combine degree with local clustering coefficients but do not adequately consider the position of nodes in the network. We propose a method that maintains high accuracy, precision, and resolution with low time complexity.
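To make the cost of the distance-based metrics concrete, the following sketch computes GSM as written in Equation 9 using one breadth-first search per node. It is an illustration under stated assumptions rather than the reference implementation: the function names, the adjacency-dictionary format, and the precomputed `ks` dictionary are hypothetical, and unreachable nodes are simply skipped. Because every node needs the distances to all other nodes, the work grows at least quadratically with the number of nodes.

```python
import math
from collections import deque


def bfs_distances(adj, source):
    """Shortest hop distances from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist


def gsm_scores(adj, ks):
    """GSM_i = exp(ks_i / n) * sum_{j != i} ks_j / d_ij (Equation 9).

    `adj` maps node -> set of neighbors and `ks` maps node -> K-Shell index;
    both input formats are assumptions made for this sketch.
    """
    n = len(adj)
    scores = {}
    for i in adj:  # one BFS per node: this loop is the quadratic part
        dist = bfs_distances(adj, i)
        global_influence = sum(ks[j] / d for j, d in dist.items() if j != i)
        self_influence = math.exp(ks[i] / n)
        scores[i] = self_influence * global_influence
    return scores


# Path graph a - b - c, where every node has K-Shell index 1.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
path_scores = gsm_scores(path, {"a": 1, "b": 1, "c": 1})
```

On the path graph, the middle node is closest to both others and therefore obtains the largest score.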
1.2 Other evaluation metrics
-
SIR Model: The SIR model [17] is employed to determine the actual dissemination capability of each node [14,15,16,17,18,19]. Each node is in one of three states: susceptible, infected, or removed. Initially, all nodes are susceptible except for one infected node. In each step, every infected node infects each of its susceptible neighbors with probability \(\beta\) and moves to the removed state with probability \(\alpha\). Once a node enters the removed (R) state, it is immune forever. This process continues until there are no infected nodes in the network. The average percentage of nodes that were infected (and thus end in the removed state), taken over 100 runs, is considered the real influence ability of the initially infected node.
-
Kolmogorov–Smirnov (K-S) Test [58]: It is a nonparametric statistical test used to compare two distributions. In this paper, the two-sample version is employed: the accuracy curve of one of the recent methods on a given network is compared with that of the proposed method. The test determines the maximum difference between the empirical distribution functions of the two accuracy curves and whether this difference is statistically significant. Additionally, it identifies which method is superior in this test.
-
Omega [27]: Real-world networks represent real-world activities that can be modeled as complex networks. Complex networks fall into several categories, such as small-world, random, and scale-free networks. Scale-free networks follow a power-law degree distribution, featuring a few high-degree hub nodes connected to many low-degree nodes. Connections in random networks are created randomly with a constant probability. Small-world networks have a high clustering coefficient and a short average shortest path, in line with the concept of six degrees of separation. They resemble regular networks in their high clustering and random graphs in their short path lengths between nodes. The omega metric quantifies these features and indicates how similar a network is to a regular network and to a random network. In other words, omega is the small-world coefficient, as shown in Equation 24.
$$\begin{aligned} \omega = \frac{L_{rand}}{L} - \frac{C}{C_{latt}} \end{aligned} \tag{24}$$
In Equation 24, L represents the average shortest path, and C indicates the average clustering coefficient. \(L_{rand}\) denotes the value of L in a random network with the same number of nodes, and \(C_{latt}\) denotes C in a lattice network with the same number of nodes as the network under investigation. Figure 15 illustrates these points. The value of \(\omega\) lies in (-1, 1). If \(\omega\) is close to 0, the network is small-world. Positive values indicate randomness, while negative values indicate a regular network, such as a lattice-like network.
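The SIR procedure described earlier in this subsection can be sketched as a short Monte Carlo simulation. This is a minimal sketch under assumptions, not the code used in the experiments: the function names, the adjacency-dictionary format, and the synchronous update order are choices made for this example, and `alpha` must be greater than 0 for the loop to terminate.

```python
import random


def sir_spread(adj, seed_node, beta, alpha, rng):
    """Run one SIR outbreak from `seed_node`; return the fraction of removed nodes.

    beta  : probability that an infected node infects a susceptible neighbor
    alpha : probability that an infected node moves to the removed state
            (must be > 0, otherwise the outbreak never ends)
    """
    infected = {seed_node}
    removed = set()
    while infected:
        newly_infected = set()
        for v in infected:
            for u in adj[v]:
                susceptible = u not in infected and u not in removed
                if susceptible and rng.random() < beta:
                    newly_infected.add(u)
        # Each currently infected node is removed with probability alpha.
        recovered = {v for v in infected if rng.random() < alpha}
        removed |= recovered
        infected = (infected - recovered) | newly_infected
    return len(removed) / len(adj)


def sir_influence(adj, seed_node, beta, alpha, runs=100, seed=0):
    """Average outbreak size over `runs` independent simulations."""
    rng = random.Random(seed)
    total = sum(sir_spread(adj, seed_node, beta, alpha, rng) for _ in range(runs))
    return total / runs


# Path graph a - b - c, with the outbreak started at node a.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
influence_a = sir_influence(path, "a", beta=0.5, alpha=1.0)
```

Ranking all nodes by `sir_influence` yields the ground-truth ordering against which a centrality measure can be compared.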
About this article
Cite this article
Esfandiari, S., Fakhrahmad, S.M. Identifying influential nodes in complex networks by adjusted feature contributions and neighborhood impact. J Supercomput 81, 503 (2025). https://doi.org/10.1007/s11227-024-06645-1
DOI: https://doi.org/10.1007/s11227-024-06645-1