Abstract
With the surge of AI services, prediction serving systems (PSSes) have been widely deployed. PSSes typically run on many workers and are thus prone to stragglers (slowdowns or failures), so designing straggler-resilient PSSes is critical for low prediction latency. The traditional approach is replication, which assigns the same prediction job to multiple workers but incurs significant resource overheads due to its redundant replicated jobs. Recently, coded distributed computation (CDC) has emerged as a more resource-efficient alternative: it encodes the prediction job into parity units so that predictions can be reconstructed via decoding. However, we find that state-of-the-art CDC methods either trade accuracy for low latency, using both a simple encoder and a simple decoder, or trade latency for high accuracy, using both a complicated encoder and a complicated decoder; this symmetry between encoder and decoder leads to an imbalance between accuracy and latency.
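To make the CDC idea concrete, here is a minimal numpy sketch (not code from the paper) for the special case of a linear model, where a simple sum-parity encoder admits exact decoding because the model commutes with addition. For non-linear models such as neural networks this identity no longer holds, which is exactly where the accuracy/latency trade-off between simple and complicated encoders/decoders arises.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3))        # a toy linear "model": f(x) = W @ x
f = lambda x: W @ x

# k = 2 data inputs, plus one parity input encoded as their sum.
x1, x2 = rng.standard_normal(3), rng.standard_normal(3)
p = x1 + x2                            # encoder: simple addition

y1, y2, yp = f(x1), f(x2), f(p)        # each computed by a separate worker

# Suppose the worker holding y2 straggles. For a linear f,
# decoding is exact since f(x1 + x2) = f(x1) + f(x2).
y2_reconstructed = yp - y1
assert np.allclose(y2_reconstructed, y2)
```

The query completes from any 2 of the 3 workers' outputs, at a redundancy cost of one extra worker instead of one full replica per job.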
Our insight is that the encoder is always used in CDC-based prediction, whereas the decoder is used only when stragglers occur. Based on this insight, we propose a new asymmetric CDC framework, called AsymCDC, which pairs a simple encoder with a complicated decoder: the encoder's simplicity keeps encoding time low, which largely reduces latency, while the decoder's complexity helps preserve accuracy. We further design the decoder in two steps: i) an exact decoding method that leverages the invertibility of an invertible neural network (INN) so that decoding incurs no accuracy loss, and ii) a decoder-compacting method that reshapes INN outputs so that knowledge distillation can be applied effectively, compacting the decoder for low decoding time. We prototype AsymCDC atop Clipper; experiments show that AsymCDC achieves approximately the same prediction accuracy as state-of-the-art methods with both a complicated encoder and decoder, while its latency exceeds that of state-of-the-art methods with both a simple encoder and decoder by no more than \(2.6\%\).
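The exact-decoding step rests on the invertibility of INNs. The following is a generic additive-coupling sketch in the style of NICE/i-RevNet, not AsymCDC's actual architecture: it shows why an INN can be inverted with no information loss even though its internal subnetwork `m` is arbitrarily non-linear and non-invertible.

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.standard_normal((2, 2))

def m(x):
    # An arbitrary (non-invertible) subnetwork; any function works here.
    return np.tanh(W @ x)

def coupling_forward(x1, x2):
    # Additive coupling: y1 = x1, y2 = x2 + m(x1).
    return x1, x2 + m(x1)

def coupling_inverse(y1, y2):
    # Exact algebraic inverse, regardless of how complicated m is.
    return y1, y2 - m(y1)

x1, x2 = rng.standard_normal(2), rng.standard_normal(2)
y1, y2 = coupling_forward(x1, x2)
x1_rec, x2_rec = coupling_inverse(y1, y2)
assert np.allclose(x1_rec, x1) and np.allclose(x2_rec, x2)
```

Because inversion is exact rather than learned, such a decoder introduces no reconstruction error; the remaining concern is its size and decoding time, which motivates the distillation-based compacting step.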
References
Alibaba cloud (2024). https://www.aliyun.com/
Behrmann, J., Grathwohl, W., Chen, R.T., Duvenaud, D., Jacobsen, J.H.: Invertible residual networks. In: Proceedings of ICML (2019)
Crankshaw, D., Wang, X., Zhou, G., Franklin, M.J., Gonzalez, J.E., Stoica, I.: Clipper: a low-latency online prediction serving system. In: Proceedings of USENIX NSDI (2017)
Dean, J., Barroso, L.A.: The tail at scale. Commun. ACM 56(2), 74–80 (2013)
Finzi, M., Izmailov, P., Maddox, W., Kirichenko, P., Wilson, A.G.: Invertible convolutional networks. In: Proceedings of ICML (2019)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. arXiv preprint arXiv:1512.03385 (2015)
Hinton, G., Vinyals, O., Dean, J.: Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531 (2015)
Huang, C., et al.: Erasure coding in Windows Azure storage. In: Proceedings of USENIX ATC (2012)
Huang, K.H., Abraham, J.A.: Algorithm-based fault tolerance for matrix operations. IEEE Trans. Comput. 100(6), 518–528 (1984)
Jacobsen, J.H., Smeulders, A.W., Oyallon, E.: i-RevNet: deep invertible networks. In: Proceedings of ICLR (2018)
Jahani-Nezhad, T., Maddah-Ali, M.A.: Berrut approximated coded computing: straggler resistance beyond polynomial computing. IEEE Trans. Pattern Anal. Mach. Intell. 45(1), 111–122 (2022)
Kosaian, J., Rashmi, K., Venkataraman, S.: Parity models: erasure-coded resilience for prediction serving systems. In: Proceedings of ACM SOSP (2019)
Kosaian, J., Rashmi, K., Venkataraman, S.: Learning-based coded computation. IEEE J. Sel. Areas Inf. Theory (2020)
Lee, K., Lam, M., Pedarsani, R., Papailiopoulos, D., Ramchandran, K.: Speeding up distributed machine learning using codes. IEEE Trans. Inf. Theory 64(3), 1514–1529 (2017)
Phan, T.-D., Ibrahim, S., Zhou, A.C., Aupy, G., Antoniu, G.: Energy-driven straggler mitigation in MapReduce. In: Rivera, F.F., Pena, T.F., Cabaleiro, J.C. (eds.) Euro-Par 2017. LNCS, vol. 10417, pp. 385–398. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-64203-1_28
Radev, S.T., Mertens, U.K., Voss, A., Ardizzone, L., Köthe, U.: BayesFlow: learning complex stochastic models with invertible neural networks. IEEE Trans. Neural Netw. Learn. Syst. 33(4), 1452–1466 (2020)
Rashmi, K.V., Shah, N.B., Gu, D., Kuang, H., Borthakur, D., Ramchandran, K.: A solution to the network challenges of data recovery in erasure-coded distributed storage systems: a study on the Facebook warehouse cluster. In: USENIX Workshop on HotStorage (2013)
Reed, I.S., Solomon, G.: Polynomial codes over certain finite fields. J. Soc. Ind. Appl. Math. 8(2), 300–304 (1960)
Ren, K., Kwon, Y., Balazinska, M., Howe, B.: Hadoop’s adolescence: an analysis of Hadoop usage in scientific workloads. Proc. VLDB Endow. 6(10), 853–864 (2013)
Rizzo, L.: Effective erasure codes for reliable computer communication protocols. ACM SIGCOMM Comput. Commun. Rev. 27(2), 24–36 (1997)
Soleymani, M., Ali, R.E., Mahdavifar, H., Avestimehr, A.S.: ApproxIFER: a model-agnostic approach to resilient and robust prediction serving systems. In: Proceedings of AAAI (2022)
Acknowledgments.
This work was supported by the Development Program of China for Young Scholars (No. 2021YFB0301400) and the Key Laboratory of Information Storage System, Ministry of Education of China.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Wang, L., Hu, Y., Liu, Y., Xiao, R., Feng, D. (2024). Asymmetric Coded Distributed Computation for Resilient Prediction Serving Systems. In: Carretero, J., Shende, S., Garcia-Blas, J., Brandic, I., Olcoz, K., Schreiber, M. (eds) Euro-Par 2024: Parallel Processing. Euro-Par 2024. Lecture Notes in Computer Science, vol 14802. Springer, Cham. https://doi.org/10.1007/978-3-031-69766-1_33
Print ISBN: 978-3-031-69765-4
Online ISBN: 978-3-031-69766-1