Abstract
This paper addresses the problem of efficient query execution in cluster-based systems. It presents an original approach to data placement and replication across the nodes of a cluster system. Building on this approach, a load balancing method for parallel query processing is developed, and a method for parallel query execution in cluster systems is proposed on top of it. Results of computational experiments are reported, and the efficiency of the proposed approaches is analyzed.
Additional information
Original Russian Text © A.V. Lepikhov, L.B. Sokolinsky, 2010, published in Programmirovanie, 2010, Vol. 36, No. 4.
About this article
Cite this article
Lepikhov, A.V., Sokolinsky, L.B. Query processing in a DBMS for cluster systems. Program Comput Soft 36, 205–215 (2010). https://doi.org/10.1134/S0361768810040031
DOI: https://doi.org/10.1134/S0361768810040031