Practical Quick File Server Migration

Published: 23 May 2020

Abstract

Regular file server upgrades are indispensable to improve performance, robustness, and power consumption. In upgrading file servers, it is crucial to quickly migrate file-sharing services between heterogeneous servers with little downtime while minimizing performance interference. We present a practical quick file server migration scheme based on the postcopy approach that defers file copy until after switching servers. This scheme can (1) reduce downtime with on-demand file migration, (2) avoid performance interference using background migration, and (3) support heterogeneous servers with stub-based file management. We discuss several practical issues, such as intermittent crawling and traversal strategy, and present the solutions in our scheme. We also address several protocol-specific issues to achieve a smooth migration. This scheme is good enough to be adopted in production systems, as it has been demonstrated for several years in real operational environments. The performance evaluation demonstrates that the downtime is less than 3 seconds, and the first file access after switching servers does not cause a timeout in the default timeout settings; it takes less than 10 seconds in most cases and up to 84.55 seconds even in a large directory tree with a depth of 16 and a width of 1,000. Although the total migration time is approximately 3 times longer than the traditional precopy approach that copies all files in advance, our scheme allows the clients to keep accessing files with acceptable overhead. We also show that appropriate selection of traversal strategy reduces tail latency by 88%, and the overhead after the migration is negligible.
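The postcopy idea summarized above can be illustrated with a minimal Python sketch. It is a toy model under stated assumptions, not the authors' implementation: the class `PostcopyServer`, its method names, and the in-memory dictionaries that stand in for the source and destination servers are all hypothetical. It shows only the three ingredients the abstract names: stubs created at switch time, on-demand copy on first access, and a budgeted background crawl.

```python
from typing import Dict, Optional


class PostcopyServer:
    """Toy destination server that starts with stubs instead of file data.

    Hypothetical model: `source` stands in for the old file server.
    """

    def __init__(self, source: Dict[str, bytes]) -> None:
        self.source = source
        # Right after the switch, every file is only a stub (None = not copied yet).
        self.files: Dict[str, Optional[bytes]] = {path: None for path in source}
        self.on_demand_copies = 0

    def read(self, path: str) -> bytes:
        """Serve a client read, copying the file from the source if it is a stub."""
        data = self.files[path]
        if data is None:
            data = self.source[path]  # on-demand migration
            self.files[path] = data
            self.on_demand_copies += 1
        return data

    def crawl(self, budget: int) -> int:
        """Background migration: copy at most `budget` stubs, return the count copied."""
        copied = 0
        for path in sorted(self.files):
            if copied >= budget:
                break
            if self.files[path] is None:
                self.files[path] = self.source[path]
                copied += 1
        return copied

    def remaining(self) -> int:
        """Number of files that are still stubs."""
        return sum(1 for data in self.files.values() if data is None)
```

In this sketch, clients can call `read` immediately after the switch, while repeated `crawl` calls with a small `budget` drain the remaining stubs in the background. A real system would also have to handle directories, metadata, writes, and protocol details that this model omits.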

Published In

ACM Transactions on Storage  Volume 16, Issue 2
SOSP 2019 Special Section and Regular Papers
May 2020
194 pages
ISSN:1553-3077
EISSN:1553-3093
DOI:10.1145/3399155
  • Editor: Sam H. Noh
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 23 May 2020
Online AM: 07 May 2020
Accepted: 01 December 2019
Revised: 01 September 2019
Received: 01 April 2019
Published in TOS Volume 16, Issue 2


Author Tags

  1. File server
  2. migration
  3. postcopy

Qualifiers

  • Research-article
  • Research
  • Refereed
