Clustering uncertain graphs using ant colony optimization (ACO)

  • Original Article
Neural Computing and Applications

Abstract

In deterministic graphs, an edge between two vertices denotes a certain link. In a probabilistic graph, by contrast, an edge only indicates that the link exists with some probability. Probabilistic data arises from uncertainty introduced during data collection or preprocessing, or from the inherent nature of a problem whose outcomes are uncertain. Such graphs are common in real-world applications such as protein–protein interaction networks and link identification in social media. Clustering probabilistic graphs is challenging because traditional measures (such as distances and paths) themselves become probabilistic. Determining a valid clustering, or making the data deterministic, is therefore an important research problem. We propose a new clustering algorithm for probabilistic graphs based on ant colony optimization (ACO). The algorithm uses multiple versions of the probabilistic graph and employs a modified ACO to optimize the objective function. In addition, heuristics are proposed to guide the algorithm toward better accuracy and faster convergence. The proposed approach is tested on two real-world probabilistic graphs and five synthetic datasets using multiple cluster validity indices. Results show that ACO with heuristic guidance can produce solutions that are comparable to or better than those of other traditional approaches.
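
As an illustration of what "multiple versions of the probabilistic graph" could look like in practice, the following minimal Python sketch samples deterministic instances of an uncertain graph by keeping each edge independently with its probability. This is an assumption about one common way to derive such versions, not the procedure described in the article; the function names (sample_world, sample_worlds, edge_frequency) and the toy edge list are hypothetical.

```python
import random


def sample_world(edges, rng):
    """Return one deterministic instance of the graph.

    Each edge (u, v, p) is kept independently with probability p.
    """
    return [(u, v) for u, v, p in edges if rng.random() < p]


def sample_worlds(edges, n_worlds, seed=0):
    """Draw n_worlds deterministic instances using a seeded generator."""
    rng = random.Random(seed)
    return [sample_world(edges, rng) for _ in range(n_worlds)]


def edge_frequency(worlds):
    """Fraction of sampled instances in which each edge appears."""
    counts = {}
    for world in worlds:
        for edge in world:
            counts[edge] = counts.get(edge, 0) + 1
    return {edge: count / len(worlds) for edge, count in counts.items()}


# Hypothetical toy graph: (vertex, vertex, existence probability).
edges = [("a", "b", 0.9), ("b", "c", 0.5), ("c", "d", 0.0), ("a", "d", 1.0)]
worlds = sample_worlds(edges, n_worlds=1000, seed=42)
freq = edge_frequency(worlds)
```

With enough samples, the observed frequency of each edge approaches its stated probability, which is why a clustering evaluated across many such instances can stand in for one evaluated on the uncertain graph itself.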

Acknowledgements

This work was done as part of an MS thesis by Ifra Arif Butt. The author wishes to acknowledge the Ghulam Ishaq Khan Institute of Engineering Sciences and Technology for providing a funded scholarship for her MS studies.

Funding

None.

Author information

Contributions

Syed Fawad Hussain proposed the main idea of this work and wrote the major part of the manuscript, including the related work, the proposed work, and the discussion of results. Ifra Arif coded the methods (proposed and comparative analysis) and generated the results section. She is also responsible for the initial draft and parts of the related work. Muhammad Hanif took part in discussions during the work and contributed to writing the Introduction as well as the artwork. Sajid Anwar provided valuable input and critical analysis of the results section. He also contributed to the manuscript, including parts of the results and discussion section, and to its overall revision and improvement.

Corresponding author

Correspondence to Syed Fawad Hussain.

Ethics declarations

Conflict of interest

All authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Hussain, S.F., Butt, I.A., Hanif, M. et al. Clustering uncertain graphs using ant colony optimization (ACO). Neural Comput & Applic 34, 11721–11738 (2022). https://doi.org/10.1007/s00521-022-07063-1

  • DOI: https://doi.org/10.1007/s00521-022-07063-1
