Biological Complexity and Biochemical Information

  • Reference work entry

Definition of the Subject

Biological complexity refers to a measure of the intricateness, or complication, of a biological organism that is directly related to that organism's ability to successfully function in a complex environment. Because organismal complexity is difficult to define, several different measures of complexity are often used as proxies for biological complexity, such as structural, functional, or sequence complexity. While the complexity of single proteins can be estimated using tools from information theory, a whole organism's biological complexity is reflected in its set of expressed proteins and its interactions, whereas the complexity of an ecosystem is summarized by the network of interacting species and their interaction with the environment.

Introduction

Mankind's need to classify the world around him is perhaps nowhere more apparent than in our zeal to attach to each and every living organism a tag that reveals its relationship to ourselves. The idea that all forms...

Abbreviations

C-value:

The haploid genome size of an organism, measured either in picograms (pg) or base pairs (bp).

Degree distribution:

The probability distribution P(d) to find a node with d edges in a network.

Entropic profile:

A graph of the per-site entropy along the sites of a biomolecular sequence, such as a DNA, RNA, or protein sequence.
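As an illustration of the definition above, the following minimal sketch computes an entropic profile from a small set of aligned sequences. The function name `entropic_profile` and the toy alignment are hypothetical, and the plug-in (maximum likelihood) entropy estimate used here is known to be biased for small samples, so this is a sketch under those assumptions rather than a recommended estimator.

```python
import math
from collections import Counter


def entropic_profile(alignment, base=2):
    """Per-site Shannon entropy of equal-length aligned sequences.

    Uses the plug-in estimate: symbol frequencies at each site are taken
    directly from the alignment columns (assumption: no bias correction).
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same length")

    profile = []
    for site in range(length):
        counts = Counter(seq[site] for seq in alignment)
        total = sum(counts.values())
        entropy = sum(
            -(c / total) * math.log(c / total, base) for c in counts.values()
        )
        profile.append(entropy)
    return profile


# Hypothetical toy alignment of four DNA sequences.
toy_alignment = ["ACGT", "ACGA", "ACCA", "ATCA"]
profile = entropic_profile(toy_alignment)
for site, h in enumerate(profile, start=1):
    print(f"site {site}: {h:.3f} bits")
```

A fully conserved site has zero entropy, while a site split evenly between two symbols has one bit.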

Epistasis:

Generally, an interaction between genes, where the fitness effect of the modification of one gene influences the fitness effect of the modification of another gene. More specifically, an interaction between mutations that can be either positive (reinforcing or synergistic), or negative (mitigating or antagonistic).

Erdős–Rényi network:

A random graph with a binomial degree distribution.
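The two network definitions above (degree distribution and the random graph with binomial degrees) can be connected with a short sketch. It assumes the G(n, p) construction, in which each of the possible edges is included independently with probability p; the function names are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter
from math import comb


def erdos_renyi(n, p, seed=0):
    """Random graph on n nodes; each pair is linked with probability p."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def degree_distribution(adj):
    """Empirical P(d): fraction of nodes that have exactly d edges."""
    counts = Counter(len(neighbors) for neighbors in adj.values())
    n = len(adj)
    return {d: c / n for d, c in sorted(counts.items())}


def binomial_pmf(d, n, p):
    """Expected P(d) in G(n, p): each node has n - 1 potential neighbors."""
    return comb(n - 1, d) * p**d * (1 - p) ** (n - 1 - d)


graph = erdos_renyi(n=200, p=0.05, seed=1)
for d, prob in degree_distribution(graph).items():
    print(d, round(prob, 3), round(binomial_pmf(d, 200, 0.05), 3))
```

For a single sampled graph the empirical values scatter around the binomial expectation; the agreement improves as n grows or as several graphs are averaged.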

Fitness:

A numerical measure predicting the long-term success of a lineage.

Jensen–Shannon divergence:

In probability and statistics, a measure for the similarity of probability distributions, given by the symmetrized relative entropy of the distributions.
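A minimal sketch of this quantity, assuming the common form in which each distribution is compared with the equal-weight mixture M = (P + Q)/2 and the two relative entropies are averaged. The function names and the choice of base-2 logarithms are assumptions made for illustration.

```python
import math


def relative_entropy(p, q, base=2):
    """Kullback-Leibler divergence D(p || q); terms with p_i = 0 contribute 0."""
    return sum(
        pi * math.log(pi / qi, base) for pi, qi in zip(p, q) if pi > 0
    )


def jensen_shannon(p, q, base=2):
    """Symmetrized relative entropy of p and q with respect to their mixture."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of states")
    mixture = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * relative_entropy(p, mixture, base) + 0.5 * relative_entropy(
        q, mixture, base
    )


# Hypothetical nucleotide frequencies (A, C, G, T) in two sequence segments.
segment_a = [0.4, 0.1, 0.1, 0.4]
segment_b = [0.1, 0.4, 0.4, 0.1]
print(jensen_shannon(segment_a, segment_b))
```

Because the mixture is nonzero wherever either distribution is, the divergence stays finite even when the two distributions have disjoint support, in which case it reaches its maximum of one bit.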

Module:

In network theory, a group of nodes that is closely associated in connections or function, but only weakly associated to other such groups.

Motif:

In network theory, a subgraph of small size.

Network diameter:

For networks, the average geodesic distance between nodes, defined as \( D = \frac{1}{m} \sum_{i=1}^{n} \sum_{j=1}^{n} d(i,j) \), where m is the number of edges of the graph, n is the number of nodes, and \( d(i,j) \) is the shortest path distance between nodes i and j.
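The formula above can be evaluated with a breadth-first search from every node. The sketch below assumes an undirected, unweighted, connected graph stored as an adjacency dictionary, and it applies the 1/m normalization literally as stated in the definition; other texts normalize by the number of node pairs instead. The function names are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest path lengths d(source, j) for every node j reachable from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def network_diameter(adj):
    """Sum of d(i, j) over all ordered node pairs, divided by the edge count m."""
    m = sum(len(neighbors) for neighbors in adj.values()) // 2
    if m == 0:
        raise ValueError("graph has no edges")
    total = 0
    for i in adj:
        dist = bfs_distances(adj, i)
        if len(dist) != len(adj):
            raise ValueError("graph must be connected")
        total += sum(dist.values())
    return total / m


# Path graph 0 - 1 - 2: ordered-pair distances sum to 8, and m = 2.
path = {0: {1}, 1: {0, 2}, 2: {1}}
print(network_diameter(path))  # 4.0
```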

Phylogenetic depth:

A measure of the genetic distance between a genome and its ancestor on the same line of descent, given by the number of genetically different genomes on the line between the genomes plus one.

Random variable:

In probability and statistics, a mathematical object with discrete or continuous states that the object takes on with probabilities drawn from a probability distribution associated to the random variable.

Source entropy:

The entropy of a sequence generated by a process that generates symbols with a given probability distribution.

Wright–Fisher process:

In population genetics, a stochastic process that describes how genes are transmitted from one generation to the next.
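As a rough illustration, the simplest version of such a process can be simulated in a few lines. The sketch assumes a haploid population of constant size with two alleles, no selection, and no mutation, so that each individual in the next generation copies a parent drawn uniformly at random. The function name and parameters are hypothetical.

```python
import random


def wright_fisher(pop_size, initial_count, generations, seed=0):
    """Neutral two-allele Wright-Fisher trajectory of allele counts.

    Each generation, the new count is a binomial sample: pop_size draws,
    each succeeding with probability equal to the current allele frequency.
    """
    rng = random.Random(seed)
    count = initial_count
    trajectory = [count]
    for _ in range(generations):
        freq = count / pop_size
        count = sum(1 for _ in range(pop_size) if rng.random() < freq)
        trajectory.append(count)
    return trajectory


# Hypothetical run: 100 individuals, allele initially at 50% frequency.
print(wright_fisher(pop_size=100, initial_count=50, generations=20, seed=42))
```

Counts of 0 and `pop_size` are absorbing states in this sketch: once an allele is lost or fixed, the trajectory no longer changes, which is how genetic drift removes variation in the absence of mutation.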

Turing machine:

In mathematics, an abstract automaton that manipulates symbols on a tape directed by a finite set of rules.

Watson-Crick pairing:

In biochemistry, the pairing between nucleotides adenine and thymine (A-T), and guanine and cytosine (G-C).

Zipf's law:

A relationship between the frequency f and the rank k of words in a text, of the form \( { f(k)\sim k^s } \), where s is the exponent of the distribution.
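The exponent s can be estimated from data by fitting a straight line to log f against log k. The sketch below uses ordinary least squares on the log-log rank-frequency values, which is a crude estimator chosen here for brevity and is an assumption of this example; the function names are hypothetical. With the sign convention of the definition above, a frequency that decays with rank gives a negative s.

```python
import math
from collections import Counter


def rank_frequency(tokens):
    """Word frequencies sorted from most to least frequent (rank 1 first)."""
    return sorted(Counter(tokens).values(), reverse=True)


def fit_zipf_exponent(frequencies):
    """Least-squares slope s of log f(k) versus log k, for f(k) ~ k**s."""
    if len(frequencies) < 2:
        raise ValueError("need at least two ranks to fit a slope")
    xs = [math.log(k) for k in range(1, len(frequencies) + 1)]
    ys = [math.log(f) for f in frequencies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic frequencies that follow f(k) = 1000 / k, so s should be close to -1.
synthetic = [1000 / k for k in range(1, 11)]
print(fit_zipf_exponent(synthetic))
```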

Acknowledgments

I am grateful to Arend Hintze for the collaborative work in Sect. “Network Complexity”, as well as numerous discussions. I am also indebted to Matthew Rupp for the analysis shown in Figs. 5 and 6. This work was supported in part by the National Science Foundation's Frontiers in Integrative Biological Research grant FIBR-0527023, a Templeton Foundation research grant, and DARPA's FunBio initiative.

Copyright information

© 2009 Springer-Verlag

About this entry

Cite this entry

Adami, C. (2009). Biological Complexity and Biochemical Information. In: Meyers, R. (eds) Encyclopedia of Complexity and Systems Science. Springer, New York, NY. https://doi.org/10.1007/978-0-387-30440-3_33
