Abstract
By effectively harnessing networked computing resources, the two-tier client-server model has been used to support shared data access. In systems based on this approach, the database servers often become performance bottlenecks when the number of concurrent users is large. Client data caching techniques have been proposed in order to ease resource contention at the servers. The key theme of these techniques is the exploitation of user data access locality. In this paper, we propose a three-tiered model that takes advantage of such data access locality to furnish a much more scalable system. Groups of clients that demonstrate similarities in their data access behavior are logically clustered together. Each such group of clients is handled by an Intermediate Cluster Manager (ICM) that acts as a cluster-wide directory service and cache manager. Clients within the same cluster are now capable of sharing data among themselves without interacting with the server(s). This results in reduced server load and allows the support of a much larger number of clients. Through prototyping and experimentation, we show that the logical clustering of clients, and the introduction of the ICM layer, significantly improve system scalability as well as transaction response times. Logical clusters, consisting of clients with similar data access patterns, are identified with the help of both a greedy algorithm and a genetic algorithm. For the latter, we have developed an encoding scheme and its corresponding operators.
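To make the idea of grouping clients by similar data access behavior concrete, here is a minimal sketch of one possible greedy grouping. It is not the algorithm developed in the paper, which the abstract does not specify. The Jaccard similarity measure, the `threshold` parameter, the page identifiers, and the function names are all assumptions chosen for illustration.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap of two access sets: 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def greedy_cluster(access: Dict[str, Set[str]], threshold: float) -> List[List[str]]:
    """Assign each client to the most similar existing cluster, or open a new one.

    `access` maps a (hypothetical) client id to the set of data items it touches.
    A client joins a cluster only if its similarity to the cluster's combined
    access set is at least `threshold` (an assumed tuning knob).
    """
    clusters: List[List[str]] = []
    footprints: List[Set[str]] = []  # union of items accessed per cluster

    for client in sorted(access):  # fixed order keeps the result deterministic
        items = access[client]
        best, best_score = -1, -1.0
        for i, footprint in enumerate(footprints):
            score = jaccard(items, footprint)
            if score >= threshold and score > best_score:
                best, best_score = i, score
        if best < 0:
            clusters.append([client])
            footprints.append(set(items))
        else:
            clusters[best].append(client)
            footprints[best] |= items
    return clusters


if __name__ == "__main__":
    demo = {
        "c1": {"p1", "p2", "p3"},
        "c2": {"p2", "p3", "p4"},
        "c3": {"p9", "p10"},
        "c4": {"p9", "p10", "p11"},
    }
    print(greedy_cluster(demo, threshold=0.4))
```

In this sketch, each resulting group would correspond to the set of clients served by one ICM. A greedy pass like this depends on the order in which clients are visited, which is one reason a search-based method such as a genetic algorithm may be worth comparing against it.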
Cite this article
Park, JH., Kanitkar, V. & Delis, A. Logically Clustered Architectures for Networked Databases. Distributed and Parallel Databases 10, 161–198 (2001). https://doi.org/10.1023/A:1019284429578