Abstract
The Burrows-Wheeler transform (BWT) and the suffix array (SA) of an input string are important data structures widely used in modern bioinformatics research, for example in full-text search and sequence alignment. In this paper, we present a lightweight external-memory algorithm for computing the BWT from a given suffix array and the input string. The algorithm has linear I/O complexity O(n) and uses a workspace of at most n/2 integers. An experimental study is conducted to evaluate the time and space performance of the proposed algorithm on a number of realistic datasets. The experimental results are consistent with the theoretical complexities of the algorithm.
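For readers unfamiliar with how the BWT follows from the suffix array, the relationship can be sketched in memory as below. This is only an illustration of the standard definition BWT[i] = T[SA[i] - 1] (wrapping to the last character when SA[i] = 0); it is not the external-memory algorithm proposed in the paper, and the function names `suffix_array_naive` and `bwt_from_sa` are hypothetical. The sketch assumes the input ends with a unique sentinel `$` that is smaller than every other character.

```python
def suffix_array_naive(text: str) -> list[int]:
    """Build the suffix array by sorting all suffixes directly.

    Quadratic-or-worse in time and memory; for illustration only.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def bwt_from_sa(text: str, sa: list[int]) -> str:
    """Compute the BWT from the text and its suffix array.

    For each suffix start position p in SA order, emit the character
    that precedes it in the text; position 0 wraps to the last character.
    """
    n = len(text)
    out = []
    for p in sa:
        prev = p - 1 if p > 0 else n - 1
        out.append(text[prev])
    return "".join(out)


if __name__ == "__main__":
    t = "banana$"  # '$' is the assumed sentinel
    sa = suffix_array_naive(t)
    print(sa)
    print(bwt_from_sa(t, sa))
```

Note that each output character requires a random access into the text at position SA[i] - 1. In memory this is cheap, but when the text and the suffix array reside on disk such accesses are expensive, which is the setting the paper addresses.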
Acknowledgments
The work of G. Nong was supported by the Guangzhou Science and Technology Program grant 201707010165 and the Project of DEGP grant 2014KTSCX007.
Copyright information
© 2017 Springer Nature Singapore Pte Ltd
About this paper
Cite this paper
Xie, J.Y., Lao, B., Nong, G. (2017). A Lightweight Algorithm for Computing BWT from Suffix Array in Disk. In: Chen, G., Shen, H., Chen, M. (eds) Parallel Architecture, Algorithm and Programming. PAAP 2017. Communications in Computer and Information Science, vol 729. Springer, Singapore. https://doi.org/10.1007/978-981-10-6442-5_25
DOI: https://doi.org/10.1007/978-981-10-6442-5_25
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-6441-8
Online ISBN: 978-981-10-6442-5
eBook Packages: Computer Science, Computer Science (R0)