Loading [a11y]/accessibility-menu.js
Checking Big Suffix and LCP Arrays by Probabilistic Methods | IEEE Journals & Magazine | IEEE Xplore

Checking Big Suffix and LCP Arrays by Probabilistic Methods


Abstract:

For full-text indexing of massive data, the suffix and LCP (longest common prefix) arrays have been recognized as fundamental data structures, and there are at least two ...Show More

Abstract:

For full-text indexing of massive data, the suffix and LCP (longest common prefix) arrays have been recognized as fundamental data structures, and there are at least two needs in practice for checking their correctness, i.e., program debugging and verifying the arrays constructed by probabilistic algorithms. Two probabilistic methods are proposed to check the suffix and LCP arrays of constant or integer alphabets in external memory using a Karp-Rabin fingerprinting technique, where the checking is wrong only with a negligible error probability. The first method checks the lexicographical order and the LCP-value of two suffixes by computing and comparing the fingerprints of their LCPs. This method is general in terms of that it can verify any full or sparse suffix/LCP array of any order. The second method uses less space, it first employs the fingerprinting technique to verify a subset of the given suffix and LCP arrays, from which two new suffix and LCP arrays are induced and compared with the given arrays for verification, where the induced suffix and LCP arrays can be removed for constant alphabets to save space.
Published in: IEEE Transactions on Computers ( Volume: 66, Issue: 10, 01 October 2017)
Page(s): 1667 - 1675
Date of Publication: 09 May 2017

ISSN Information:

Funding Agency:


Contact IEEE to Subscribe

References

References is not available for this document.