A Space and Time Efficient Algorithm for Constructing Compressed Suffix Arrays

Hon, Wing-Kai; Lam, Tak-Wah; Sadakane, Kunihiko; Sung, Wing-Kin; Yiu, Siu-Ming

doi:10.1007/s00453-006-1228-8

Published: 22 March 2007

Algorithmica, Volume 48, pages 23–36 (2007)

Abstract

With the first human DNA decoded into a sequence of about 2.8 billion characters, much biological research has centered on analyzing this sequence. In theory, it is now feasible to hold an index for human DNA in main memory so that any pattern can be located efficiently. This is due to the recent breakthrough on compressed suffix arrays, which reduce the space requirement from O(n log n) bits to O(n) bits. However, constructing compressed suffix arrays remains difficult, because existing approaches first compute the suffix array and therefore need O(n log n) bits of working memory (i.e., more than 13 gigabytes for human DNA). This paper initiates the study of constructing compressed suffix arrays directly from the text. The main contribution is a construction algorithm that uses only O(n) bits of working memory and runs in O(n log n) time. The algorithm is also time and space efficient for texts with large alphabets, such as Chinese or Japanese. Precisely, when the alphabet size is |Σ|, the working space is O(n log |Σ|) bits, and the time complexity remains O(n log n), independent of |Σ|.
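To give a rough sense of the gap between O(n log n) bits and O(n log |Σ|) bits, the following sketch compares the two quantities for n = 2.8 billion. It is a back-of-the-envelope illustration, not the paper's accounting: the function names are hypothetical, and the code assumes each suffix array entry takes exactly ⌈log₂ n⌉ bits and each text symbol exactly ⌈log₂ |Σ|⌉ bits, with all hidden constants set to 1.

```python
def bits_per_value(count: int) -> int:
    """Smallest number of bits that can distinguish `count` values (ceil(log2 count))."""
    return (count - 1).bit_length()


def suffix_array_bits(n: int) -> int:
    """Assumed size of a plain suffix array: n entries of ceil(log2 n) bits each."""
    return n * bits_per_value(n)


def text_bits(n: int, alphabet_size: int) -> int:
    """Assumed size of the text: n symbols of ceil(log2 |Sigma|) bits each."""
    return n * bits_per_value(alphabet_size)


if __name__ == "__main__":
    n = 2_800_000_000          # approximate length of human DNA, per the abstract
    sa = suffix_array_bits(n)  # the n log n term
    dna = text_bits(n, 4)      # the n log |Sigma| term for the alphabet {A, C, G, T}

    print(f"suffix array: {sa / 8 / 1e9:.1f} GB")
    print(f"DNA text:     {dna / 8 / 1e9:.1f} GB")
    print(f"ratio:        {sa // dna}x")
```

Under these assumptions the suffix array alone needs 32 bits per entry against 2 bits per DNA symbol, a factor of 16. The abstract's figure of more than 13 gigabytes presumably counts working data beyond the bare array; the exact accounting is not given here.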

Author information

Authors and Affiliations

Department of Computer Science, The University of Hong Kong, Pokfulam, Hong Kong
Wing-Kai Hon, Tak-Wah Lam & Siu-Ming Yiu
Department of Computer Science and Communication Engineering, Kyushu University, Kyushu, Japan
Kunihiko Sadakane
School of Computing, National University of Singapore, Singapore, Singapore
Wing-Kin Sung

Corresponding authors

Correspondence to Wing-Kai Hon, Tak-Wah Lam, Kunihiko Sadakane, Wing-Kin Sung or Siu-Ming Yiu.

About this article

Cite this article

Hon, WK., Lam, TW., Sadakane, K. et al. A Space and Time Efficient Algorithm for Constructing Compressed Suffix Arrays. Algorithmica 48, 23–36 (2007). https://doi.org/10.1007/s00453-006-1228-8

Published: 22 March 2007
Issue Date: May 2007
DOI: https://doi.org/10.1007/s00453-006-1228-8
