Abstract
In this paper, the problem of determining an optimal location strategy for an individual program execution is considered. In addition, we propose a heuristic approach for the dynamic file allocation problem. In order to reduce the complexity of the optimization problems, a cluster-based approach is used.

To access the data files of a distributed file system, a user initiates a program execution. Based on the current allocation of the program and data files as well as the knowledge about the characteristics of the programs, a first optimization calculates the optimal cluster for each individual program execution. The objective of this optimization is the minimization of the intercluster traffic of an individual program execution. Within the optimal cluster, a simple load-balancing strategy is used to determine the corresponding executing node.

A second optimization looks for file allocations where the global intercluster traffic is minimized subject to the following constraints: minimal number of file copies, availability, and storage capacity.

Experimental results showing the efficiency of the proposed algorithms are examined, and the implications of the model for the design of very large distributed file systems are discussed.
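The two-step placement of a single program execution described above (pick the cluster with the least intercluster traffic, then pick a node inside it) can be illustrated with a minimal sketch. This is not the paper's algorithm: the cost model (a file or program contributes its full volume whenever no copy resides in the candidate cluster), the least-loaded-node rule, and all names such as `Execution`, `choose_cluster`, and `choose_node` are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Execution:
    """Hypothetical description of one program execution."""
    program: str
    program_size: int    # volume moved if the program must be fetched from another cluster
    file_traffic: dict   # file name -> expected data volume accessed by this execution


def intercluster_traffic(execution, cluster, allocation):
    """Assumed cost model: volume that crosses cluster boundaries
    if `execution` runs in `cluster`. `allocation` maps each program
    or file name to the set of clusters holding a copy."""
    traffic = 0
    if cluster not in allocation[execution.program]:
        traffic += execution.program_size
    for name, volume in execution.file_traffic.items():
        if cluster not in allocation[name]:
            traffic += volume
    return traffic


def choose_cluster(execution, clusters, allocation):
    """First step: cluster with minimal intercluster traffic (ties broken by name)."""
    return min(clusters, key=lambda c: (intercluster_traffic(execution, c, allocation), c))


def choose_node(cluster, node_load):
    """Second step: a simple load-balancing rule, here the least-loaded node."""
    loads = node_load[cluster]
    return min(loads, key=lambda n: (loads[n], n))


# Illustrative data, entirely made up.
allocation = {"prog": {"A"}, "f1": {"A", "B"}, "f2": {"B"}, "f3": {"C"}}
execution = Execution("prog", 5, {"f1": 10, "f2": 40, "f3": 8})
clusters = ["A", "B", "C"]
node_load = {"A": {"a1": 0}, "B": {"b1": 3, "b2": 1}, "C": {"c1": 2}}

best_cluster = choose_cluster(execution, clusters, allocation)
best_node = choose_node(best_cluster, node_load)
```

With this made-up data, running in cluster B requires fetching only the program and `f3`, so B is selected even though the program itself is stored in A. The second optimization (global file allocation under copy-count, availability, and storage constraints) is not covered by this sketch.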
Index Terms
- Design of optimal distributed file systems: a framework for research