Efficient techniques on retrieving bio-information for active U-healthcare

Park, Young-Ho

doi:10.1007/s00779-012-0569-3

Original Article
Published: 04 July 2012

Volume 17, pages 1349–1356, (2013)
Personal and Ubiquitous Computing

Abstract

Active preventive healthcare is increasingly needed for potential patients who are predicted to suffer in the future from diseases inherited from their ancestors. We call this active U-healthcare: the provision of active, periodic, and continuous medical treatment according to the heterogeneous inherited conditions recorded in a patient's DNA, such as diabetes, heart disease, and diseases specific to women. The bottleneck of active U-healthcare, however, is the memory overhead of analyzing each patient's DNA sequence, because DNA sequences are massive in volume. Efficiently retrieving the many disease patterns recorded in the DNA of potential patients is therefore a major problem. This paper presents a novel method for efficiently retrieving disease patterns using an in-memory suffix tree. The suffix tree is widely used for similarity search over sequences drawn from a small alphabet, and it is efficient when common prefixes occur frequently. Since in-memory suffix tree construction algorithms do not scale up, a large-scale disk-based suffix tree construction algorithm, TRELLIS, has been proposed recently. However, TRELLIS requires a large amount of memory, disk space, and disk I/O in order to merge sub-trees that share a common prefix. In this paper, we propose a new non-merging method, called NST. Experimental results show that NST constructs an index using less memory than TRELLIS.
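To make the idea of pattern retrieval over a suffix index concrete, the following is a minimal sketch of an uncompressed suffix trie over a short DNA string. It is not NST or TRELLIS, whose construction details are not given in this abstract; the names `Node`, `build_suffix_trie`, and `find_occurrences`, as well as the sample sequence, are assumptions made for illustration. The sketch uses quadratic memory, which is precisely the kind of overhead that motivates scalable construction methods for genome-sized inputs.

```python
class Node:
    """One trie node: outgoing edges plus the suffix start positions passing through it."""
    __slots__ = ("children", "starts")

    def __init__(self):
        self.children = {}   # maps a single character to a child Node
        self.starts = []     # start indices of all suffixes that reach this node


def build_suffix_trie(text):
    """Insert every suffix of `text` character by character (naive, O(n^2) space)."""
    root = Node()
    for start in range(len(text)):
        node = root
        for ch in text[start:]:
            child = node.children.get(ch)
            if child is None:
                child = Node()
                node.children[ch] = child
            node = child
            # Suffixes sharing a prefix share this node, so record each start here.
            node.starts.append(start)
    return root


def find_occurrences(root, pattern):
    """Return the sorted start positions of `pattern`, or [] if it never occurs."""
    node = root
    for ch in pattern:
        node = node.children.get(ch)
        if node is None:
            return []
    return sorted(node.starts)


if __name__ == "__main__":
    sequence = "GATTACATTAG"          # hypothetical sample sequence
    trie = build_suffix_trie(sequence)
    print(find_occurrences(trie, "TTA"))
    print(find_occurrences(trie, "CCC"))
```

Once the index is built, a lookup costs time proportional to the pattern length rather than the sequence length, and suffixes that begin with the same characters share the same path from the root. This sharing is why the structure pays off when common prefixes are frequent.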

Acknowledgments

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education, Science and Technology (No. 20110002707).

Author information

Authors and Affiliations

Department of Multimedia Science, Sookmyung Women’s University, Seoul, Korea
Young-Ho Park

Corresponding author

Correspondence to Young-Ho Park.

About this article

Cite this article

Park, YH. Efficient techniques on retrieving bio-information for active U-healthcare. Pers Ubiquit Comput 17, 1349–1356 (2013). https://doi.org/10.1007/s00779-012-0569-3

Received: 24 August 2011
Accepted: 30 December 2011
Published: 04 July 2012
Issue Date: October 2013
DOI: https://doi.org/10.1007/s00779-012-0569-3
