Abstract
Ceph, an object-based distributed storage system, has a communication subsystem called the Async messenger. In the Async messenger, each connection is assigned one worker thread from a thread pool in a round-robin fashion, and that thread processes all incoming and outgoing messages of the connection. Although this thread-per-connection strategy is easy to implement, it has an inherent weakness: when a single connection is overloaded, the workload becomes imbalanced across worker threads. To mitigate this problem, multiple worker threads can be assigned to a single connection to share its traffic. However, this mapping introduces another overhead, lock contention, because the threads compete for the shared resources of the connection. In this paper, we propose the lock contention aware messenger (Async-LCAM), a messenger that assigns multiple worker threads per connection and is aware of the lock contention those threads generate. By tracking the lock contention of each connection at every interval, Async-LCAM dynamically adds or deletes threads assigned to the connection to balance the workload among worker threads. Experimental results show that Async-LCAM improves the throughput and latency of Ceph storage by up to 184% and 65%, respectively, compared to the original Async messenger.
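To make the mechanism concrete, below is a minimal C++ sketch of the interval-based policy the abstract describes. It is illustrative only: the types, counters, thresholds, and the exact add/delete rule are assumptions, since the abstract states only that worker threads are added to or deleted from a connection each interval based on its observed lock contention; none of these names come from Ceph's actual Async messenger code.

```cpp
// Minimal sketch of an interval-based, lock-contention-aware balancer.
// All names, counters, and thresholds here are illustrative assumptions;
// they are not taken from Ceph's Async messenger source.
#include <atomic>
#include <chrono>
#include <mutex>
#include <vector>

using Clock = std::chrono::steady_clock;
using Ns    = std::chrono::nanoseconds;

struct Connection {
    std::mutex lock;                           // guards shared per-connection state
    std::atomic<long long> lock_wait_ns{0};    // lock-wait time accrued this interval
    std::atomic<int>       pending_msgs{0};    // backlog observed this interval
    int workers = 1;                           // worker threads currently assigned

    // Every assigned worker dispatches messages through here, so the time
    // spent waiting for the connection lock is measured where it occurs.
    template <class Fn>
    void process(Fn&& handle_message) {
        auto t0 = Clock::now();
        std::lock_guard<std::mutex> g(lock);   // possibly contended acquire
        lock_wait_ns += std::chrono::duration_cast<Ns>(Clock::now() - t0).count();
        handle_message();
    }
};

// Invoked once per interval. The policy shown (shrink on high contention,
// grow on high backlog) is one plausible reading of "adds or deletes
// assigned threads"; the paper's actual rule may differ.
void rebalance(std::vector<Connection*>& conns,
               long long contention_limit_ns, int busy_backlog, int max_workers) {
    for (Connection* c : conns) {
        long long waited  = c->lock_wait_ns.exchange(0);   // read and reset
        int       backlog = c->pending_msgs.exchange(0);
        if (waited > contention_limit_ns && c->workers > 1)
            --c->workers;        // threads mostly fight over the lock: shrink
        else if (backlog > busy_backlog && c->workers < max_workers)
            ++c->workers;        // loaded but uncontended: add a worker
    }
}
```

Measuring lock-wait time at the dispatch point keeps the instrumentation on the same code path where contention arises, so the balancer reacts to what the worker threads actually experience rather than to an indirect load estimate.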
(Figures 1–15 of the article appear at this point; the source page carries only the images, with no captions to recover.)
Acknowledgements
This research was supported by the Ministry of Science and ICT (MSIT), Korea, under the Information Technology Research Center (ITRC) support program (IITP-2018-2016-0-00465) supervised by the Institute for Information & communications Technology Promotion (IITP).
About this article
Cite this article
Jeong, B., Khan, A. & Park, S. Async-LCAM: a lock contention aware messenger for Ceph distributed storage system. Cluster Comput 22, 373–384 (2019). https://doi.org/10.1007/s10586-018-2832-5