Interposed request routing for scalable network storage

Published: 01 February 2002
Abstract

This paper explores interposed request routing in Slice, a new storage system architecture for high-speed networks incorporating network-attached block storage. Slice interposes a request switching filter, called a μproxy, along each client's network path to the storage service (e.g., in a network adapter or switch). The μproxy intercepts request traffic and distributes it across a server ensemble. We propose request routing schemes for I/O and file service traffic, and explore their effect on service structure. The Slice prototype uses a packet filter μproxy to virtualize the standard Network File System (NFS) protocol, presenting to NFS clients a unified shared file volume with scalable bandwidth and capacity. Experimental results from the industry-standard SPECsfs97 workload demonstrate that the architecture enables construction of powerful network-attached storage services by aggregating cost-effective components on a switched Gigabit Ethernet LAN.
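
As a rough illustration of the idea in the abstract, the sketch below shows how an interposed filter might pick a target server for each request. It is not the Slice implementation: the class and field names, the size threshold, the stripe unit, and the use of an MD5-based hash are all assumptions made for this example. Bulk I/O is striped across storage nodes, I/O below the threshold goes to a small-file server, and name-space operations are hashed across directory servers.

```python
import hashlib
from dataclasses import dataclass

# All names and constants here are hypothetical, chosen only for illustration.
SMALL_IO_THRESHOLD = 64 * 1024  # assumed cutoff (bytes) between small-file and bulk I/O
STRIPE_UNIT = 32 * 1024         # assumed striping granularity (bytes)


@dataclass(frozen=True)
class Request:
    op: str          # e.g. "read", "write", "lookup", "create"
    file_id: int     # stand-in for an NFS file handle
    offset: int = 0  # byte offset for read/write requests
    name: str = ""   # entry name for name-space operations


def _pick(servers, key):
    """Map a string key onto one server with a stable hash."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


class MicroProxy:
    """Toy request router standing in for an interposed filter."""

    def __init__(self, storage_nodes, small_file_servers, directory_servers):
        self.storage_nodes = list(storage_nodes)
        self.small_file_servers = list(small_file_servers)
        self.directory_servers = list(directory_servers)

    def route(self, req):
        if req.op in ("read", "write"):
            if req.offset < SMALL_IO_THRESHOLD:
                # I/O below the threshold: one small-file server per file.
                return _pick(self.small_file_servers, str(req.file_id))
            # Bulk I/O: stripe successive units round-robin across storage nodes.
            stripe = req.offset // STRIPE_UNIT
            index = (req.file_id + stripe) % len(self.storage_nodes)
            return self.storage_nodes[index]
        # Everything else is treated as a name-space operation and hashed
        # on (parent file id, name) to a directory server.
        return _pick(self.directory_servers, f"{req.file_id}/{req.name}")


proxy = MicroProxy(["s0", "s1", "s2", "s3"], ["f0", "f1"], ["d0", "d1"])
bulk_target = proxy.route(Request("read", file_id=7, offset=64 * 1024))
next_target = proxy.route(Request("read", file_id=7, offset=96 * 1024))
```

Because the routing decision depends only on fields carried in the request, a filter of this kind needs no per-client state, which is one reason such logic could plausibly sit in a network adapter or switch as the abstract suggests.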

References

  1. AMIRI, K., GIBSON, G., AND GOLDING, R. 2000. Highly concurrent shared storage. In Proceedings of the IEEE International Conference on Distributed Computing Systems (ICDCS, April 2000).
  2. ANDERSON, D. 1999. Object based storage devices: a command set proposal. Technical report (Oct.), National Storage Industry Consortium.
  3. ANDERSON, D. C. AND CHASE, J. S. 2000. Failure-atomic file access in an interposed network storage system. In Proceedings of the Ninth IEEE International Symposium on High Performance Distributed Computing (HPDC, Aug. 2000).
  4. ANDERSON, T., DAHLIN, M., NEEFE, J., PATTERSON, D., ROSELLI, D., AND WANG, R. 1995. Serverless network file systems. In Proceedings of the ACM Symposium on Operating Systems Principles (Dec. 1995). 109-126.
  5. ARPACI-DUSSEAU, R. H., ANDERSON, E., TREUHAFT, N., CULLER, D. E., HELLERSTEIN, J. M., PATTERSON, D. A., AND YELICK, K. 1999. Cluster I/O with River: Making the fast case common. In I/O in Parallel and Distributed Systems (IOPADS, May 1999).
  6. BIRRELL, A. D. AND NEEDHAM, R. M. 1980. A universal file server. IEEE Trans. Softw. Eng. SE-6, 5 (Sept.), 450-453.
  7. CABRERA, L.-F. AND LONG, D. D. E. 1991. Swift: Using distributed disk striping to provide high I/O data rates. Comput. Syst. 4, 4 (Fall), 405-436.
  8. GIBSON, G. A., NAGLE, D. F., AMIRI, K., CHANG, F. W., FEINBERG, E. M., GOBIOFF, H., LEE, C., OZCERI, B., RIEDEL, E., ROCHBERG, D., AND ZELENKA, J. 1997. File server scaling with network-attached secure disks. In Proceedings of the 1997 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems (New York, NY, June 15-18 1997). ACM Press, New York, NY, 272-284. Also published in Perf. Eval. Rev. 25, 1.
  9. GIBSON, G. A., NAGLE, D. F., AMIRI, K., CHANG, F. W., GOBIOFF, H., HARDIN, C., RIEDEL, E., ROCHBERG, D., AND ZELENKA, J. 1998. A cost-effective, high-bandwidth storage architecture. In Proceedings of the Eighth International Conference on Architectural Support for Programming Languages and Operating Systems (Oct. 1998).
  10. HAGMANN, R. 1987. Reimplementing the Cedar file system using logging and group commit. In Proceedings of the 11th ACM Symposium on Operating Systems Principles (SOSP, Nov. 1987). 155-162.
  11. HAMILTON, G., POWELL, M. L., AND MITCHELL, J. J. 1993. Subcontract: A flexible base for distributed programming. In Proceedings of the Fourteenth ACM Symposium on Operating Systems Principles (Dec. 1993). 69-79.
  12. HARTMAN, J. H. AND OUSTERHOUT, J. K. 1995. The Zebra striped network file system. ACM Trans. Comput. Syst. 13, 3 (Aug.), 274-310.
  13. HEIDEMANN, J. S. AND POPEK, G. J. 1994. File-system development with stackable layers. ACM Trans. Comput. Syst. 12, 1 (Feb.), 58-89.
  14. JONES, M. B. 1993. Interposition agents: Transparently interposing user code at the system interface. In Proceedings of the Fourteenth Symposium on Operating Systems Principles (Dec. 1993). 80-93.
  15. KARGER, D., LEHMAN, E., LEIGHTON, T., LEVINE, M., LEWIN, D., AND PANIGRAHY, R. 1997. Consistent hashing and random trees: Distributed caching protocols for relieving hot spots on the World Wide Web. In Proceedings of the Twenty-Ninth ACM Symposium on Theory of Computing (El Paso, TX, May 1997). 654-663.
  16. LEE, E. K. AND THEKKATH, C. A. 1996. Petal: Distributed virtual disks. In Proceedings of the Seventh Conference on Architectural Support for Programming Languages and Operating Systems (Cambridge, MA, Oct. 1996). 84-92.
  17. MACKLEM, R. 1994. Not quite NFS, soft cache consistency for NFS. In USENIX Association Conference Proceedings (Jan. 1994). 261-278.
  18. MALTZAHN, C., RICHARDSON, K., AND GRUNWALD, D. 1999. Reducing the disk I/O of Web proxy server caches. In USENIX Annual Technical Conference (June 1999).
  19. MCKUSICK, M. K., JOY, W., LEFFLER, S., AND FABRY, R. 1984. A fast file system for UNIX. ACM Trans. Comput. Syst. 2, 3 (Aug.), 181-197.
  20. MILLS, D. 1985. Network Time Protocol (NTP). RFC 958, Internet Engineering Task Force.
  21. MOGUL, J., RASHID, R., AND ACCETTA, M. 1987. The packet filter: An efficient mechanism for user-level network code. In Proceedings of the 11th ACM Symposium on Operating Systems Principles (SOSP, Nov. 1987). 39-51.
  22. PAI, V. S., ARON, M., BANGA, G., SVENDSEN, M., DRUSCHEL, P., ZWAENOPOEL, W., AND NAHUM, E. 1998. Locality-aware request distribution in cluster-based network servers. In Proceedings of the Eighth International Conference on Architectural Support for Programming Languages and Operating Systems (Oct. 1998).
  23. PAWLOWSKI, B., SHEPLER, S., BEAME, C., CALLAGHAN, B., EISLER, M., NOVECK, D., ROBINSON, D., AND THURLOW, R. 2000. The NFS version 4 protocol. In Second International Systems and Networking (SANE) Conference (May 2000).
  24. PAXSON, V. 1997. End-to-end routing behavior in the Internet. IEEE/ACM Trans. Network. 5, 5 (Oct.), 601-615.
  25. PRESLAN, K., BARRY, A., BRASSOW, J., ERICKSON, G., NYGAARD, E., SABOL, C., SOLTIS, S., TEIGLAND, D., AND O'KEEFE, M. 1999. A 64-bit, shared disk file system for Linux. In Sixteenth IEEE Mass Storage Systems Symposium (March 1999).
  26. RIVEST, R. L. 1992. The MD5 Message-Digest Algorithm. RFC 1321, Internet Engineering Task Force.
  27. SHAPIRO, M. 1986. Structure and encapsulation in distributed systems: The proxy principle. In Proceedings of the Sixth International Conference on Distributed Computing Systems (May 1986).
  28. THEKKATH, C., MANN, T., AND LEE, E. 1997. Frangipani: A scalable distributed file system. In Ninth International Conference on Architectural Support for Programming Languages and Operating Systems (Oct. 1997). 224-237.
  29. VAN RENESSE, R., TANENBAUM, A., AND WILSCHUT, A. 1989. The design of a high-performance file server. In The 9th International Conference on Distributed Computing Systems (Newport Beach, CA, June 1989). IEEE Press, Piscataway, NJ, 22-27.
  30. VOELKER, G. M., ANDERSON, E. J., KIMBREL, T., FEELEY, M. J., CHASE, J. S., KARLIN, A. R., AND LEVY, H. M. 1998. Implementing cooperative prefetching and caching in a globally-managed memory system. In Proceedings of the ACM Conference on Measurement and Modeling of Computer Systems (SIGMETRICS '98, June 1998).

    Reviews

    Vijay Shivshanker Gupta

    Slice is an architecture for a large-scale network-attached storage system in a LAN environment. This paper provides design information, implementation details, and an evaluation of Slice. The main theme of the paper is that, by adding a mu-proxy (an indirection layer), it is possible to facilitate scalable decentralized storage. The mu-proxy acts as a content-based router: it sees the content of requests and routes them to an appropriate server, based on the size of the request and other factors. The performance results provided by the authors seem to indicate that mu-proxies are not a bottleneck.

    Overall, this is a well-written paper, and the details included give good insight into how the design was translated into an implementation. It would have been nice if the authors had provided more comparisons with alternatives, such as Alteon switches, especially since Web applications are an important part of their motivation. The paper’s introduction also mentions that system administration cost is an important design factor in modern computer systems, but it is not made clear how Slice reduces the complexity of system administration. Admittedly, neither of these issues is easy to address completely in a research paper.

    As mentioned by the authors, support for recovery and reconfiguration is incomplete in the Slice prototype, but this is really the tougher part of the problem. A follow-up paper by Anderson and Chase [1] provides more details about this. The authors could have cautioned the reader more about security issues, since mu-proxies become a source of potential security holes. Also, cryptographic protection helps in making data tamperproof, but does not solve all security problems.

    Online Computing Reviews Service
