DOI: 10.1145/3412841.3441992
research-article

Concurrent file metadata structure using readers-writer lock

Published: 22 April 2021

Abstract

Linux file systems serialize threads that write to a shared file. Recent studies have adopted range locks on shared files to remove this serialization and let file I/O execute concurrently. However, we have found that even with a range lock, I/O throughput stops increasing beyond a certain number of cores and then decreases rapidly on a manycore server. Through extensive performance profiling, we identified a cascading tree lock problem that serializes concurrent accesses to the file metadata structure: in modern Linux file systems such as F2FS, each file metadata structure is protected by a mutex-based locking mechanism, which serializes I/O requests. In this paper, we present nCache, a novel file metadata cache framework that uses a readers-writer lock to allow concurrent I/O operations on a shared file. nCache solves the I/O scalability problem on manycore servers while ensuring consistent updates. We implemented nCache in F2FS and evaluated it using FxMark on a 120-core server with high-performance NVMe SSDs. Our extensive evaluations show that nCache achieves maximum device throughput in FxMark's shared file I/O workload. For realistic workloads, it also shows 4.1x higher throughput than the baseline F2FS with range locks.
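To make the locking idea concrete, the following is a minimal user-space sketch of a metadata cache guarded by a readers-writer lock. It is not the nCache implementation, which lives inside F2FS and is not described in detail in this abstract. The names `RWLock`, `MetadataCache`, `lookup`, and `run_demo` are hypothetical, and the re-check under the exclusive lock is a generic double-checked pattern assumed here for illustration.

```python
import threading


class RWLock:
    """Simple readers-writer lock: many readers or one writer at a time.

    Illustrative only; this version can starve writers under heavy reads.
    """

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0      # threads currently holding the lock in shared mode
        self._writer = False   # True while one thread holds it exclusively

    def acquire_read(self):
        with self._cond:
            while self._writer:
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            while self._writer or self._readers > 0:
                self._cond.wait()
            self._writer = True

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()


class MetadataCache:
    """Hypothetical block-index -> block-address cache for one shared file."""

    def __init__(self):
        self._lock = RWLock()
        self._entries = {}
        self.loads = 0  # number of times the slow (exclusive) path filled an entry

    def lookup(self, index, load):
        # Fast path: concurrent lookups share the lock and do not serialize.
        self._lock.acquire_read()
        try:
            if index in self._entries:
                return self._entries[index]
        finally:
            self._lock.release_read()

        # Slow path: take the lock exclusively and re-check, because another
        # thread may have filled the entry between the two acquisitions.
        self._lock.acquire_write()
        try:
            if index not in self._entries:
                self._entries[index] = load(index)
                self.loads += 1
            return self._entries[index]
        finally:
            self._lock.release_write()


def run_demo(num_threads=8, num_blocks=4):
    """Have several threads look up the same block indices concurrently."""
    cache = MetadataCache()
    results = []
    results_lock = threading.Lock()

    def worker():
        for i in range(num_blocks):
            addr = cache.lookup(i, lambda idx: 1000 + idx)
            with results_lock:
                results.append((i, addr))

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return cache, results
```

With a plain mutex, every lookup would queue behind every other lookup. Here only the cache misses take the exclusive path, so lookups that hit the cache can proceed in parallel.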

Cited By

  • (2022) Future-Based Persistent Spatial Data Structure for NVM-Based Manycore Machines. IEEE Access 10, 114711-114724. DOI: 10.1109/ACCESS.2022.3216410. Online publication date: 2022
  • (2022) Scalable NUMA-aware persistent B+-tree for non-volatile memory devices. Cluster Computing 26:5, 2865-2881. DOI: 10.1007/s10586-022-03766-1. Online publication date: 17-Nov-2022
  • Design and Implementation of Deduplication on F2FS. ACM Transactions on Storage. DOI: 10.1145/3662735

Published In

SAC '21: Proceedings of the 36th Annual ACM Symposium on Applied Computing
March 2021
2075 pages
ISBN: 9781450381048
DOI: 10.1145/3412841
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. concurrency
  2. file system
  3. operating system

Qualifiers

  • Research-article

Funding Sources

  • Korea government (MSIT)

Conference

SAC '21
SAC '21: The 36th ACM/SIGAPP Symposium on Applied Computing
March 22 - 26, 2021
Virtual Event, Republic of Korea

Acceptance Rates

Overall Acceptance Rate 1,650 of 6,669 submissions, 25%
