Skip to main content
Log in

Algorithms for Data Migration

  • Published:
Algorithmica Aims and scope Submit manuscript

Abstract

The data migration problem is the problem of computing a plan for moving data objects stored on devices in a network from one configuration to another. Load balancing or changing usage patterns might necessitate such a rearrangement of data. In this paper, we consider the case where the objects are fixed-size and the network is complete. Our results are both theoretical and empirical. Our main theoretical results are (1) a polynomial time algorithm for finding a near-optimal migration plan in the presence of space constraints when a certain number of additional nodes is available as temporary storage, and (2) a 3/2-approximation algorithm for the case where data must be migrated directly to its destination. We also run extensive experiments on several algorithms for various data migration problems and show that empirically, many algorithms perform better in practice than their theoretical bounds suggest. We conclude that many of the algorithms we present are both practical and effective for data migration.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

References

  1. Anderson, E., Hall, J., Hartline, J., Hobbes, M., Karlin, A., Saia, J., Swaminithan, R., Wilkes, J.: An experimental study of data migration algorithms. In: Proceedings of the Workshop on Algorithm Engineering (WAE), pp. 145–158 (2001)

  2. Anderson, E., Hobbs, M., Keeton, K., Spence, S., Uysal, M., Veitch, A.: Hippodrome: running circles around storage administration. In: Proceedings of the USENIX Conference on File and Storage Technologies (FAST), pp. 175–188 (2002)

  3. Borowsky, E., Golding, R., Merchant, A., Schreier, L., Shriver, E., Spasojevic, M., Wilkes, J.: Using attribute-managed storage to achieve QoS. In: Proceedings of the International Workshop on Quality of Service (IWQoS), pp. 199–202. Columbia University, New York, June 1997

  4. Coffman, E., Garey, M., Johnson, D., Lapaugh, A.: Scheduling file transfers. SIAM J. Comput. 14(3), 744–780 (1985)

    Article  MATH  MathSciNet  Google Scholar 

  5. Gavish, B., Sheng, O.: Dynamic file migration in distributed computer systems. Commun. ACM 33(2), 177–189 (1990)

    Article  Google Scholar 

  6. Golubchik, I., Khuller, S., Khanna, S., Thurimella, R., Zhu, A.: Approximation algorithms for data placement on parallel disks. In: Proceedings of the Eleventh Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 223–232 (2000)

  7. Golubchik, L., Khuller, S., Kim, Y., Shargorodskaya, S., Wan, Y.: Data migration on parallel disks. Algorithmica 45(1), 137–158 (2006)

    Article  MATH  MathSciNet  Google Scholar 

  8. Grammatikakis, M., Hsu, D., Kraetzl, M., Sibeyn, J.: Packet routing in fixed-connection networks: A survey. J. Parallel Distrib. Process. 54(2), 77–132 (1998)

    MATH  Google Scholar 

  9. Hall, J., Hartline, J., Karlin, A., Saia, J., Wilkes, J.: On algorithms for efficient data migration. In: Proceedings of the ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 620–629 (2001)

  10. Kashyap, S., Khuller, S., Wan, Y., Golubehik, L.: Fast reconfiguration of data placement in parallel disks. In: Proceedings of the Workshop on Algorithm Engineering and Experiments (ALENEX), pp. 95–107 (2006)

  11. Khuller, S., Kim, Y., Wan, Y.: Algorithms for data migration with cloning. SIAM J. Comput. 33(2), 448–461 (2004)

    Article  MATH  MathSciNet  Google Scholar 

  12. Khuller, S., Kim, Y., Malekian, A.: Improved algorithms for data migration. In: Proceedings of the International Workshop on Approximation Algorithms for Combinatorial Optimization Problems (APPROX), pp. 164–175 (2006)

  13. Kim, Y.: Data migration to minimize the average completion time. In: Proceedings of the ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 97–98 (2003)

  14. Kim, Y.: Data migration to minimize the average completion time. J. Algorithms 55(1), 42–57 (2005)

    Article  MATH  MathSciNet  Google Scholar 

  15. Kubale, M.: Preemptive versus nonpreemptive scheduling of biprocessor tasks on dedicated processors. Eur. J. Oper. Res. 94(2), 242–251 (1996)

    Article  MATH  Google Scholar 

  16. Micali, S., Vazirani, V.: An \(o(\sqrt{|V|}|e|)\) algorithm for finding a maximum matching in general graphs. In: Proceedings of the Symposium on Foundations of Computer Science (FOCS), pp. 17–27 (1980)

  17. Nishizeki, T., Kashiwagi, K.: On the 1.1 edge-coloring of multigraphs. SIAM J. Discrete Math. 3(3), 391–410 (1990)

    Article  MATH  MathSciNet  Google Scholar 

  18. Sanders, P., Solis-Oba, R.: How helpers hasten h-relations. In: Proceedings of the European Symposium on Algorithms (ESA), pp. 392–402 (2000)

  19. Shannon, C.: A theorem on colouring lines of a network. J. Math. Phys. 28, 148–151 (1949)

    MATH  MathSciNet  Google Scholar 

  20. Transaction Processing Performance Council. TPC Benchmark D (Decision Support) Standard Specification Revision 2.1. Transaction Processing Performance Council (1996)

  21. Vizing, V.: On an estimate of the chromatic class of a p-graph. Discrete Anal. 3, 23–30 (1964) (in Russian)

    MathSciNet  Google Scholar 

  22. Wolf, J.: The placement optimization problem: a practical solution to the disk file assignment problem. In: Proceedings of the ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, pp. 1–10 (1989)

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to J. Saia.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Anderson, E., Hall, J., Hartline, J. et al. Algorithms for Data Migration. Algorithmica 57, 349–380 (2010). https://doi.org/10.1007/s00453-008-9214-y

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00453-008-9214-y

Keywords

Navigation