On the visualization of the DNA sequence and its nucleotide content

Published: 01 December 2005

Abstract

Visual inspection can help reveal patterns that would be rather difficult to detect computationally. We consider three algorithms for visualizing a DNA sequence and its nucleotide content: random walk, fractal, and a visualization based on entropy-like parameters calculated using a sliding window. We present a program that implements these three methods and visualizes either the whole of a given sequence or specified fragments of it. The program also provides facilities for comparing the visualizations obtained for different sequences or fragments. Random walk visualization considers the sequence symbol by symbol; the other two methods also take into account how well the nucleotides are "mixed" in the sequence. The program allows easy visualization of repeated patterns and of segments with a high or low content of particular nucleotides, such as CG-islands. It also helps to identify regions of interest for further study.
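
To make the three kinds of visualization concrete, the following is a minimal Python sketch of the coordinates each one could plot. It is not the authors' program. The step directions in `STEPS`, the corner layout in `CORNERS` (a common chaos game convention, used here as a stand-in for the "fractal" method), and the use of plain Shannon entropy for the "entropy-like parameters" are all assumptions made for illustration, as are the function names.

```python
import math
from collections import Counter

# Hypothetical step directions for the walk; the abstract does not give a mapping.
STEPS = {"A": (-1, 0), "T": (1, 0), "C": (0, 1), "G": (0, -1)}

# Assumed corner layout for a chaos-game style fractal plot on the unit square.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def random_walk(seq):
    """Return the (x, y) positions visited, starting at the origin."""
    x, y = 0, 0
    path = [(x, y)]
    for base in seq.upper():
        dx, dy = STEPS.get(base, (0, 0))  # unknown symbols (e.g. N) do not move
        x, y = x + dx, y + dy
        path.append((x, y))
    return path


def chaos_game(seq):
    """Return chaos-game points: each point is halfway to the base's corner."""
    x, y = 0.5, 0.5  # start at the centre of the unit square
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def window_entropy(seq, window, step=1):
    """Shannon entropy (bits) of symbol frequencies in each sliding window."""
    seq = seq.upper()
    values = []
    for start in range(0, len(seq) - window + 1, step):
        counts = Counter(seq[start:start + window])
        h = sum((c / window) * math.log2(window / c) for c in counts.values())
        values.append(h)
    return values
```

For example, `window_entropy("AAAAACGT", 4, step=4)` yields a low value for the homogeneous first window and a high value for the second, where all four nucleotides appear equally often. Note that composition entropy alone ignores the order of symbols inside a window, so it is only a rough proxy for how well nucleotides are "mixed". The returned coordinate lists can be passed to any plotting library.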

Published In

ACM SIGSAM Bulletin  Volume 39, Issue 4
December 2005
41 pages
ISSN:0163-5824
DOI:10.1145/1140378

Publisher

Association for Computing Machinery

New York, NY, United States
