Logically Clustered Architectures for Networked Databases

Distributed and Parallel Databases

Abstract

By effectively harnessing networked computing resources, the two-tier client-server model has been used to support shared data access. In systems based on this approach, the database servers often become performance bottlenecks when the number of concurrent users is large. Client data caching techniques have been proposed in order to ease resource contention at the servers. The key theme of these techniques is the exploitation of user data access locality. In this paper, we propose a three-tiered model that takes advantage of such data access locality to furnish a much more scalable system. Groups of clients that demonstrate similarities in their data access behavior are logically clustered together. Each such group of clients is handled by an Intermediate Cluster Manager (ICM) that acts as a cluster-wide directory service and cache manager. Clients within the same cluster are now capable of sharing data among themselves without interacting with the server(s). This results in reduced server load and allows the support of a much larger number of clients. Through prototyping and experimentation, we show that the logical clustering of clients, and the introduction of the ICM layer, significantly improve system scalability as well as transaction response times. Logical clusters, consisting of clients with similar data access patterns, are identified with the help of both a greedy algorithm and a genetic algorithm. For the latter, we have developed an encoding scheme and its corresponding operators.
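The role of the ICM as a cluster-wide directory and cache manager can be illustrated with a minimal sketch. This is not the authors' implementation: the class names (`Server`, `ICM`, `Client`), the page-level granularity, and the in-memory dictionaries are all assumptions made for illustration, and the sketch ignores updates, cache consistency, and eviction. It only shows how a directory kept at the ICM lets one client's request be served from a peer's cache instead of the server.

```python
from collections import defaultdict


class Server:
    """Stand-in for the database server; counts how often it is contacted."""

    def __init__(self, pages):
        self.pages = dict(pages)
        self.requests = 0

    def read(self, page_id):
        self.requests += 1
        return self.pages[page_id]


class ICM:
    """Hypothetical Intermediate Cluster Manager: records which clients in
    its cluster cache which pages and routes requests to a peer if possible."""

    def __init__(self, server):
        self.server = server
        self.directory = defaultdict(set)  # page_id -> ids of clients caching it
        self.clients = {}                  # client_id -> Client

    def register(self, client):
        self.clients[client.client_id] = client

    def fetch(self, requester_id, page_id):
        holders = self.directory[page_id] - {requester_id}
        if holders:
            # Another client in the cluster holds the page: no server contact.
            peer = self.clients[next(iter(holders))]
            data = peer.cache[page_id]
        else:
            # No copy in the cluster: fall back to the server.
            data = self.server.read(page_id)
        self.directory[page_id].add(requester_id)
        return data


class Client:
    """Client with a local cache; misses go to the ICM, never to the server."""

    def __init__(self, client_id, icm):
        self.client_id = client_id
        self.icm = icm
        self.cache = {}
        icm.register(self)

    def read(self, page_id):
        if page_id not in self.cache:
            self.cache[page_id] = self.icm.fetch(self.client_id, page_id)
        return self.cache[page_id]


server = Server({"p1": "row data 1", "p2": "row data 2"})
icm = ICM(server)
a, b = Client("a", icm), Client("b", icm)
a.read("p1")            # not cached anywhere in the cluster: served by the server
b.read("p1")            # served from a's cache through the ICM directory
print(server.requests)  # prints 1
```

In this toy run, the second read of `p1` never reaches the server, which is the load reduction the abstract attributes to sharing within a cluster. How much is saved in practice would depend on how similar the access patterns of the clustered clients actually are.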




Cite this article

Park, JH., Kanitkar, V. & Delis, A. Logically Clustered Architectures for Networked Databases. Distributed and Parallel Databases 10, 161–198 (2001). https://doi.org/10.1023/A:1019284429578


  • DOI: https://doi.org/10.1023/A:1019284429578
