ABSTRACT
In intelligent image recognition applications, tail latency is a core factor affecting user experience. Although tail latency covers only the high percentiles of the overall latency distribution, reducing it can significantly improve the experience of users who wait on the slowest server responses. To study this problem, we chose the IMG-DNN image recognition application from the TailBench benchmark suite. The aim of this study is to find which parts of the program cause tail latency. We analyzed the source code of IMG-DNN and profiled its performance distribution at runtime. We built models that relate the source code to the configuration and, through these models, analyzed the source code to find out which models exhibit tail latency. With the help of program analysis, we found that most of the tail latency is caused by cache misses that occur under the configuration used to produce requests.
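As a minimal illustration of what "high-percentile latency" means, the sketch below computes nearest-rank percentiles over a list of request latencies. The function name `percentile` and the sample values are hypothetical and chosen only for illustration; they are not taken from IMG-DNN or TailBench.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds: most requests are fast,
# a few are moderately slow, and two are very slow.
latencies_ms = [10.0] * 90 + [20.0] * 8 + [250.0] * 2

mean_ms = sum(latencies_ms) / len(latencies_ms)
p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
p99 = percentile(latencies_ms, 99)

print(f"mean={mean_ms:.1f} ms, p50={p50} ms, p95={p95} ms, p99={p99} ms")
```

In this made-up data set the median and the mean stay low while the 99th percentile is dominated by the two slow requests, which is why tail latency is reported separately from average latency.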
- Harshad Kasture, Daniel Sanchez. 2016. TailBench: A Benchmark Suite and Evaluation Methodology for Latency-Critical Applications. In 2016 IEEE International Symposium on Workload Characterization (IISWC). IEEE, Providence, RI, USA. https://doi.org/10.1109/IISWC.2016.7581261
- Yunqi Zhang, David Meisner, Jason Mars, Lingjia Tang. 2016. Treadmill: Attributing the Source of Tail Latency through Precise Load Testing and Statistical Inference. In 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA). IEEE, Seoul, South Korea. https://ieeexplore.ieee.org/document/7551414
- Brendan Gregg. “Linux Performance”. https://www.brendangregg.com/linuxperf.html
- Yann LeCun, Corinna Cortes, Christopher J.C. Burges. “The MNIST Database of Handwritten Digits”. http://yann.lecun.com/exdb/mnist/
- “TailBench source code”. http://tailbench.csail.mit.edu