Abstract
Whole-file transfer is a basic primitive for Internet content dissemination. Content servers are increasingly limited by disk arm movement, given the rapid growth in disk density, disk transfer rates, server network bandwidth, and content size. Individual file transfers are sequential, but the block access sequence on a content server is effectively random when many slow clients access large files concurrently. Although larger blocks can help improve disk throughput, buffering requirements increase linearly with block size.
This article explores a novel block reordering technique that can significantly reduce server disk traffic when large content files are shared. The idea is to transfer blocks to each client in whatever order is convenient for the server. The server sends blocks to each client opportunistically, so that each client benefits as much as possible from the disk reads the server issues for other clients accessing the same file. We first illustrate the motivation and potential impact of aggressive block reordering using simple analytical models. Then we describe a file transfer system, called Circus, that uses a simple block reordering algorithm. Experimental results with the Circus prototype show that it can improve server throughput by a factor of two or more in workloads with strong file access locality.
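The effect of reordering on disk traffic can be illustrated with a toy model. The sketch below is not the Circus algorithm or the article's analytical model. It assumes, purely for illustration, that all clients download the same file at the same rate (one block per time tick), that the server has no buffer cache, and that a disk read is shared only by clients that need the same block in the same tick. The function names and the arrival times are hypothetical.

```python
def sequential_reads(arrivals, num_blocks):
    """Disk reads when every client receives blocks 0..num_blocks-1 in order.

    A read is identified by (tick, block); clients share a read only if
    they need the same block during the same tick.
    """
    reads = set()
    for start in arrivals:
        for block in range(num_blocks):
            reads.add((start + block, block))
    return len(reads)


def blocks_received(start, num_blocks):
    """Blocks a client gets under reordering (assumed cyclic sweep).

    The server reads block (tick mod num_blocks) at each tick, and a client
    arriving at tick `start` listens for num_blocks consecutive ticks.
    """
    return [tick % num_blocks for tick in range(start, start + num_blocks)]


def reordered_reads(arrivals, num_blocks):
    """Disk reads under the cyclic sweep: one read per tick with any active client."""
    busy_ticks = set()
    for start in arrivals:
        busy_ticks.update(range(start, start + num_blocks))
    return len(busy_ticks)


# Hypothetical workload: three clients arrive at ticks 0, 10, and 25
# and each downloads the same 100-block file.
arrivals = [0, 10, 25]
num_blocks = 100

# Every client still receives the whole file, just in a rotated order.
assert sorted(blocks_received(10, num_blocks)) == list(range(num_blocks))

print(sequential_reads(arrivals, num_blocks))  # 300
print(reordered_reads(arrivals, num_blocks))   # 125
```

In this model, clients that arrive at different times never share a read under in-order delivery, so the server issues one read per client per block. With reordering, a late client joins the sweep already in progress and picks up the blocks it missed on the wrap-around, so the number of reads depends on how long the server is busy rather than on the number of clients.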
Index Terms
- Rethinking FTP: Aggressive block reordering for large file transfers