
Labeled Network Stack: A High-Concurrency and Low-Tail Latency Cloud Server Framework for Massive IoT Devices

  • Regular Paper
  • Journal of Computer Science and Technology

Abstract

Internet of Things (IoT) applications connect massive numbers of clients to cloud servers, and the number of networked IoT devices continues to grow rapidly. IoT services therefore require both low tail latency and high concurrency in datacenters. This study asks whether an order-of-magnitude improvement over mainstream systems is possible in tail latency and concurrency, and proposes a hardware–software codesigned labeled network stack (LNS) for future datacenters. The key innovation is a cross-layer payload labeling mechanism that distinguishes requests by payload across the full network stack, including the application, TCP/IP, and Ethernet layers. This design enables prioritized packet processing and forwarding along the full datapath, so that latency-insensitive requests cannot significantly interfere with high-priority requests. We built a prototype datacenter server to evaluate the LNS design against a commercial x86 server and the mTCP research stack, using a cloud-supported IoT application scenario. Experimental results show that the LNS design provides an order-of-magnitude improvement in tail latency and concurrency: a single datacenter server node can support over 2 million concurrent long-lived connections from IoT devices while maintaining a 99th-percentile tail latency of 50 ms. In addition, the hardware–software codesign approach markedly reduces the labeling and prioritization overhead and limits the interference between high-priority and low-priority requests.
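The core idea of the abstract — requests carry a priority label end-to-end, and the server always drains higher-priority labels first so latency-insensitive traffic cannot delay latency-critical requests — can be illustrated with a minimal sketch. This is not the paper's implementation (the LNS labels packets in hardware across the TCP/IP and Ethernet layers); the queue, the labels, and the request names below are all hypothetical, chosen only to show the scheduling discipline.

```python
import heapq

HIGH, LOW = 0, 1  # smaller value = higher priority label

class LabeledQueue:
    """Priority queue keyed by a per-request label, FIFO within a label."""
    def __init__(self):
        self._heap = []
        self._seq = 0  # tie-breaker preserving arrival order within a label

    def push(self, label, request):
        heapq.heappush(self._heap, (label, self._seq, request))
        self._seq += 1

    def pop(self):
        label, _, request = heapq.heappop(self._heap)
        return request

q = LabeledQueue()
q.push(LOW, "telemetry-upload")   # latency-insensitive
q.push(HIGH, "alarm-ack")         # latency-critical
q.push(LOW, "firmware-poll")
q.push(HIGH, "heartbeat")

order = [q.pop() for _ in range(4)]
print(order)
# → ['alarm-ack', 'heartbeat', 'telemetry-upload', 'firmware-poll']
```

All high-priority requests drain before any low-priority one, regardless of arrival order; within a label, arrival order is preserved. The LNS applies this discipline not just at one queue but along the full datapath, which is what bounds the tail latency of high-priority requests.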



Author information

Corresponding author: Wen-Li Zhang.

Electronic supplementary material

ESM 1 (PDF 761 kb)


About this article


Cite this article

Zhang, WL., Liu, K., Shen, YF. et al. Labeled Network Stack: A High-Concurrency and Low-Tail Latency Cloud Server Framework for Massive IoT Devices. J. Comput. Sci. Technol. 35, 179–193 (2020). https://doi.org/10.1007/s11390-020-9651-x
