Abstract
In this paper we address the solution of large instances of the distribution design problem. Traditional approaches rely on a solution process that involves only a model of the problem and a solution algorithm, and they do not consider that the size of the instances can significantly reduce the efficiency of that process. We propose a new approach that incorporates multiple models and algorithms, together with mechanisms for instance compression, to increase the scalability of the solution process. To validate the approach, we tested it on a new model of the replicated version of the distribution design problem, which incorporates generalized database objects, and on a method for instance compression that uses clustering techniques. The experimental results, obtained with typical Internet usage loads, show that our approach reduces the computational resources needed to solve large instances by at least 65%, without significantly reducing the quality of the solution.
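The idea of compressing an instance by clustering can be illustrated with a minimal sketch. The abstract does not specify the clustering method, so everything below is an assumption made for illustration: the `DataObject` fields, the normalized access profile, the L1 distance, the `threshold` parameter, and the simple leader-style clustering are hypothetical and are not the authors' method. Objects whose access profiles are close are merged into one aggregate object, so the solver sees fewer objects.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DataObject:
    # Hypothetical instance element: a database object with a size and
    # the access frequency it receives from each site.
    name: str
    size: float
    access: Tuple[float, ...]


def profile(access: Tuple[float, ...]) -> Tuple[float, ...]:
    """Normalize access frequencies so objects are compared by usage pattern."""
    total = sum(access)
    if not total:
        return tuple(0.0 for _ in access)
    return tuple(a / total for a in access)


def distance(p: Tuple[float, ...], q: Tuple[float, ...]) -> float:
    """L1 distance between two access profiles (an illustrative choice)."""
    return sum(abs(a - b) for a, b in zip(p, q))


def compress(objects: List[DataObject], threshold: float):
    """Leader clustering: an object joins the first cluster whose leader
    (first member) has a profile within `threshold`; otherwise it starts
    a new cluster. Each cluster becomes one aggregate object."""
    clusters: List[List[DataObject]] = []
    for obj in objects:
        p = profile(obj.access)
        for members in clusters:
            if distance(p, profile(members[0].access)) <= threshold:
                members.append(obj)
                break
        else:
            clusters.append([obj])

    compressed = [
        DataObject(
            name=f"cluster{i}",
            size=sum(m.size for m in members),
            # Sum access frequencies site by site across the cluster members.
            access=tuple(sum(col) for col in zip(*(m.access for m in members))),
        )
        for i, members in enumerate(clusters)
    ]
    return compressed, clusters


if __name__ == "__main__":
    instance = [
        DataObject("a", 10, (8, 2)),
        DataObject("b", 5, (4, 1)),
        DataObject("c", 7, (0, 6)),
    ]
    small, groups = compress(instance, threshold=0.1)
    for agg, members in zip(small, groups):
        print(agg.name, agg.size, agg.access, [m.name for m in members])
```

A solution found for the compressed instance would then be expanded by giving every member of a cluster the placement chosen for its aggregate object. The threshold trades instance size against solution quality, which is the kind of trade-off the abstract reports.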
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ortega, J.P., Rangel, R.A.P., Florez, J.A.M., Barbosa, J.J.G., Diaz, E.A.M., Villanueva, J.D.T. (2005). Distribution Design in Distributed Databases Using Clustering to Solve Large Instances. In: Pan, Y., Chen, D., Guo, M., Cao, J., Dongarra, J. (eds) Parallel and Distributed Processing and Applications. ISPA 2005. Lecture Notes in Computer Science, vol 3758. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11576235_69
DOI: https://doi.org/10.1007/11576235_69
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29769-7
Online ISBN: 978-3-540-32100-2
eBook Packages: Computer Science, Computer Science (R0)