Improving network systems performance by clustering distributed database sites

The Journal of Supercomputing

Abstract

Clustering network sites is a vital issue in parallel and distributed database systems (DDBS). Grouping distributed database network sites into clusters is considered an efficient way to minimize the communication time required for query processing. However, clustering network sites remains an open research problem because finding an optimal solution is NP-complete. The main goal in this field is therefore to find a near-optimal solution that groups distributed database network sites into disjoint clusters so as to minimize the communication time required for data allocation. Grouping a large number of network sites into a small number of clusters effectively reduces the transaction response time, results in better data distribution, and improves distributed database system performance. We present a novel algorithm for clustering distributed database network sites based on communication time, since database query processing is time dependent. Extensive experimental tests and simulations are conducted on this clustering algorithm. The experimental and simulation results show that a better network distribution is achieved, with significant improvements in network server load balance and network delay, a low communication time between network sites, and higher distributed database system performance.
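
To make the idea of grouping sites into disjoint clusters by communication time concrete, the following is a minimal illustrative sketch. It is not the algorithm presented in the article, whose details are not given in this abstract. It assumes a symmetric matrix of pairwise communication times and a user-chosen threshold; the function name `cluster_sites`, the `threshold` parameter, and the sample values are all hypothetical.

```python
from typing import List


def cluster_sites(comm_time: List[List[float]], threshold: float) -> List[List[int]]:
    """Greedily group sites into disjoint clusters (illustrative heuristic only).

    A site joins the first existing cluster in which its communication time
    to every current member is at most `threshold`; otherwise it starts a
    new cluster. This is a simple greedy rule, not an optimal clustering.
    """
    clusters: List[List[int]] = []
    for site in range(len(comm_time)):
        for members in clusters:
            if all(comm_time[site][m] <= threshold for m in members):
                members.append(site)
                break
        else:
            # No existing cluster is close enough: open a new one.
            clusters.append([site])
    return clusters


# Hypothetical communication times (in ms) between four sites.
comm_time = [
    [0, 2, 9, 8],
    [2, 0, 7, 9],
    [9, 7, 0, 3],
    [8, 9, 3, 0],
]
clusters = cluster_sites(comm_time, threshold=5)
print(clusters)  # [[0, 1], [2, 3]]
```

With these sample values, sites 0 and 1 end up in one cluster and sites 2 and 3 in another, because only those pairs communicate within the 5 ms threshold. A greedy rule like this depends on the order in which sites are visited, which is one reason near-optimal rather than optimal solutions are the practical target for this NP-complete problem.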

Author information

Corresponding author

Correspondence to Ismail Hababeh.

About this article

Cite this article

Hababeh, I. Improving network systems performance by clustering distributed database sites. J Supercomput 59, 249–267 (2012). https://doi.org/10.1007/s11227-010-0436-9

  • DOI: https://doi.org/10.1007/s11227-010-0436-9
