Efficient allocation and composition of distributed storage

The Journal of Supercomputing

Abstract

In this paper, we investigate the composition of cheap network storage resources to meet specific availability and capacity requirements. We show that the problem of finding the optimal composition for availability and price requirements can be reduced to the knapsack problem, and propose three techniques for efficiently finding approximate solutions. The first algorithm uses a dynamic programming approach to find mirrored storage resources for high availability requirements, and runs in pseudo-polynomial O(n^2 c) time, where n is the number of sellers’ resources to choose from and c is a capacity function of the requested and minimum availability. The second technique is a heuristic that finds resources to be agglomerated into a larger coherent resource, with complexity O(n log n). The third technique finds a compromise between capacity and availability (which in our phrasing is a complex integer programming problem) using a genetic algorithm. The algorithms can be implemented on a broker that intermediates between buyers and sellers of storage resources. Finally, we show that a broker in an open storage market using the combination of the three algorithms can meet user requests more frequently, and at lower cost for the requests that are met, than a broker that simply matches single resources to requests.
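To make the knapsack reduction for mirrored resources concrete, the following is a minimal sketch and not the paper's algorithm. It assumes that resources fail independently, so a mirrored set with availabilities a_i has availability 1 - prod(1 - a_i). Taking logarithms turns each resource into an additive weight -log(1 - a_i), and the availability requirement becomes a threshold on the sum of weights, which a knapsack-style dynamic program can handle. The function name `cheapest_mirror_set`, the `(price, availability)` tuple layout, and the `steps` discretization parameter are all hypothetical, and the running time here is O(n * steps) rather than the O(n^2 c) bound stated above.

```python
import math


def cheapest_mirror_set(resources, target_availability, steps=1000):
    """Pick the cheapest set of mirrored resources meeting an availability target.

    resources: list of (price, availability) pairs, 0 < availability < 1.
    target_availability: required availability, 0 < target < 1.
    steps: resolution of the weight discretization (an assumption of this sketch).
    Returns (total_price, chosen_indices), or None if no set is found.
    """
    # Independence assumption: a mirror fails only if every copy fails,
    # so A = 1 - prod(1 - a_i), and -log(1 - a_i) is additive.
    need = -math.log(1.0 - target_availability)
    scale = steps / need
    # Round weights down so the discretized sum never overstates availability.
    # This keeps every returned set valid but may miss borderline sets.
    weights = [min(steps, int(-math.log(1.0 - a) * scale)) for _, a in resources]

    inf = float("inf")
    # best[w] = (cheapest price, chosen indices) reaching weight w, capped at steps.
    best = [(inf, ())] * (steps + 1)
    best[0] = (0.0, ())

    for i, (price, _) in enumerate(resources):
        # Descending order ensures each resource is used at most once.
        for w in range(steps, -1, -1):
            cost, chosen = best[w]
            if cost == inf:
                continue
            nw = min(steps, w + weights[i])
            if cost + price < best[nw][0]:
                best[nw] = (cost + price, chosen + (i,))

    total, chosen = best[steps]
    return None if total == inf else (total, chosen)


def mirrored_availability(resources, chosen):
    """Availability of the chosen mirrored set under the independence assumption."""
    failure = 1.0
    for i in chosen:
        failure *= 1.0 - resources[i][1]
    return 1.0 - failure
```

A broker could call `cheapest_mirror_set(offers, 0.95)` on the current sellers' offers and fall back to other strategies when it returns `None`. Because the weights are rounded down, any set it returns satisfies the target under the stated assumption, at the cost of possibly skipping sets that meet the target only exactly.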

Author information

Corresponding author

Correspondence to Jimmy Secretan.

About this article

Cite this article

Secretan, J., Lawson, M. & Bölöni, L. Efficient allocation and composition of distributed storage. J Supercomput 47, 286–310 (2009). https://doi.org/10.1007/s11227-008-0193-1

  • DOI: https://doi.org/10.1007/s11227-008-0193-1
