Biological Complexity and Biochemical Information

Adami, Christoph

doi:10.1007/978-0-387-30440-3_33

Reference work entry

Definition of the Subject

Biological complexity refers to a measure of the intricateness, or complication, of a biological organism that is directly related to that organism's ability to successfully function in a complex environment. Because organismal complexity is difficult to define, several different measures of complexity are often used as proxies for biological complexity, such as structural, functional, or sequence complexity. While the complexity of single proteins can be estimated using tools from information theory, a whole organism's biological complexity is reflected in its set of expressed proteins and its interactions, whereas the complexity of an ecosystem is summarized by the network of interacting species and their interaction with the environment.

Introduction

Mankind's need to classify the world around him is perhaps nowhere more apparent than in our zeal to attach to each and every living organism a tag that reveals its relationship to ourselves. The idea that all forms...

Glossary

C-value:: The haploid genome size of an organism, measured either in picograms (pg) or base pairs (bp).
Degree distribution:: The probability distribution P(d) to find a node with d edges in a network.
Entropic profile:: A graph of the per-site entropy along the sites of a biomolecular sequence, such as a DNA, RNA, or protein sequence.
Epistasis:: Generally, an interaction between genes, where the fitness effect of the modification of one gene influences the fitness effect of the modification of another gene. More specifically, an interaction between mutations that can be either positive (reinforcing or synergistic), or negative (mitigating or antagonistic).
Erdős–Rényi network:: A random graph with a binomial degree distribution.
Fitness:: A numerical measure predicting the long-term success of a lineage.
Jensen–Shannon divergence:: In probability and statistics, a measure for the similarity of probability distributions, given by the symmetrized relative entropy of the distributions.
Module:: In network theory, a group of nodes that is closely associated in connections or function, but only weakly associated to other such groups.
Motif:: In network theory, a subgraph of small size.
Network diameter:: For networks, the average geodesic distance between nodes, defined as \( D = \frac{1}{m} \sum_{i=1}^n \sum_{j=1}^n d(i,j) \), where m is the number of edges of the graph, n is the number of nodes, and \( d(i,j) \) is the shortest path distance between nodes i and j.
Phylogenetic depth:: A measure of the genetic distance between a genome and its ancestor on the same line of descent, given by the number of genetically different genomes on the line between the genomes plus one.
Random variable:: In probability and statistics, a mathematical object with discrete or continuous states that the object takes on with probabilities drawn from a probability distribution associated to the random variable.
Source entropy:: The entropy of a sequence generated by a process that generates symbols with a given probability distribution.
Wright–Fisher process:: In population genetics, a stochastic process that describes how genes are transmitted from one generation to the next.
Turing machine:: In mathematics, an abstract automaton that manipulates symbols on a tape directed by a finite set of rules.
Watson-Crick pairing:: In biochemistry, the pairing between nucleotides adenine and thymine (A-T), and guanine and cytosine (G-C).
Zipf's law:: A relationship between the frequency f and the rank k of words in a text, of the form \( f(k) \sim k^{-s} \), where s is the exponent of the distribution.
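
Two of the information-theoretic entries above, the entropic profile and the Jensen–Shannon divergence, can be illustrated with a short Python sketch. It is not taken from the entry itself. It assumes that sequences are supplied as equal-length aligned strings, that probabilities are estimated from raw column frequencies with no finite-sample bias correction, and that the divergence takes the commonly used form in which each distribution is compared with the midpoint of the two. The function names are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(probs):
    """Shannon entropy in bits; zero-probability states contribute nothing."""
    return sum(-p * math.log2(p) for p in probs if p > 0)


def entropic_profile(alignment):
    """Per-site entropy (bits) for each column of equal-length aligned sequences.

    Assumption: probabilities are raw column frequencies, so small
    alignments will tend to underestimate the true per-site entropy.
    """
    if not alignment:
        raise ValueError("alignment must contain at least one sequence")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same length")
    profile = []
    for site in range(length):
        counts = Counter(seq[site] for seq in alignment)
        total = sum(counts.values())
        profile.append(shannon_entropy(c / total for c in counts.values()))
    return profile


def kl_divergence(p, q):
    """Relative entropy D(p||q) in bits; requires q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p, q):
    """Symmetrized relative entropy of p and q, each measured against their midpoint."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of states")
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


if __name__ == "__main__":
    # Toy alignment (hypothetical data): the first site is fully conserved.
    toy_alignment = ["ACGT", "ACGA", "ACCT", "AGGT"]
    print(entropic_profile(toy_alignment))
    print(jensen_shannon_divergence([0.5, 0.5], [0.9, 0.1]))
```

A fully conserved site has zero entropy in this sketch, while a site at which all four nucleotides are equally frequent reaches the maximum of 2 bits. The divergence is zero for identical distributions and 1 bit for distributions with disjoint support.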

Acknowledgments

I am grateful to Arend Hintze for the collaborative work in Sect. “Network Complexity”, as well as numerous discussions. I am also indebted to Matthew Rupp for the analysis shown in Figs. 5 and 6. This work was supported in part by the National Science Foundation's Frontiers in Integrative Biological Research grant FIBR-0527023, a Templeton Foundation research grant, and DARPA's FunBio initiative.

Author information

Authors and Affiliations

Keck Graduate Institute of Applied Life Sciences, State University of New York, Claremont, USA
Christoph Adami

Editor information

Editors and Affiliations

RAMTECH LIMITED, 122 Escalle Lane, Larkspur, CA, 94939, USA
Robert A. Meyers Ph.D. (Editor-in-Chief)

Cite this entry

Adami, C. (2009). Biological Complexity and Biochemical Information. In: Meyers, R. (ed.) Encyclopedia of Complexity and Systems Science. Springer, New York, NY. https://doi.org/10.1007/978-0-387-30440-3_33

DOI: https://doi.org/10.1007/978-0-387-30440-3_33
Publisher Name: Springer, New York, NY
Print ISBN: 978-0-387-75888-6
Online ISBN: 978-0-387-30440-3
eBook Packages: Physics and Astronomy; Reference Module Physical and Materials Science; Reference Module Chemistry, Materials and Physics
