Simulations of SELEX against complex receptors with a condensed statistical model

https://doi.org/10.1016/j.compchemeng.2006.08.015

Abstract

Systematic evolution of ligands by exponential enrichment (SELEX) is an in vitro combinatorial engineering approach that enriches aptamers from a library of nucleic acid ligands by iterative extraction and amplification of receptor-bound ligands. Aptamers are the selected nucleic acid ligands with high receptor-binding affinity. Typically, they are obtained in single-receptor SELEX experiments, where the ligand library is incubated with receptor molecules of a single identity, e.g., a purified protein. For years, aptamers have been shown to be valuable for biomedical applications and research. To further explore the power of SELEX technology, complex SELEX was proposed to obtain multiple aptamers by incubating the ligand library with multiple species of receptors. However, reports on complex SELEX have been few, possibly because the effects of experimental variables are poorly understood. Computer simulations should help address this problem. A major task in simulating complex SELEX is to solve the interdependent binding equilibrium equations for binding events among heterogeneous ligands and receptors. Although a detailed subpooling model was developed previously, it could simulate complex SELEX against at most four species of receptors, because its demand for computer memory grew exponentially with the number of receptor species. Here we develop a novel, condensed subpooling model in which ligands of similar characteristic affinity are first pooled together regardless of receptor specificity and then divided into partial subpools receptor-specifically. With this model, the need for computer memory grows only linearly with the number of receptors. In the simulation of SELEX against four receptors, our results are the same as or very similar to those of earlier work. We have further simulated SELEX against 100 heterogeneous receptors. We suggest that our computation method can be applied to other research fields where binding events between heterogeneous ligands and receptors are involved.

Introduction

Engineering through iterative selection of specific individuals from a population of variants is the basis of the combinatorial engineering approach. It has been applied by nature and, more recently, in various fields of science and technology. A simple example is the evolution of biological traits and functionalities of organisms through natural selection. Another is the purification of hydrocarbon compounds from crude oil using fractional distillation columns in oil refineries. Although the phenomena had been understood for some time, the widespread and advanced implementation of combinatorial approaches did not occur until the invention of SELEX (systematic evolution of ligands by exponential enrichment; Tuerk & Gold, 1990) or in vitro selection (Ellington & Szostak, 1990) technology in 1990.

SELEX is an in vitro experimental protocol to screen for and engineer desired ligands from a library of single-stranded nucleic acids, either RNA or single-stranded DNA (ssDNA) (Fig. 1). Without going into the historical background, it is fair to say that SELEX technology combines the discovery of the dual roles of single-stranded nucleic acids with the idea of phage-display experiments. In the 1980s it was found that, in addition to serving as genetic material, single-stranded nucleic acids could also form unique tertiary structures to catalyze chemical reactions (Guerrier-Takada, Gardiner, Marsh, Pace, & Altman, 1983; Zaug & Cech, 1980). That is, single-stranded nucleic acids could carry both genotypes and phenotypes at the same time. The concept of phage-display technology (Smith, 1985) is the same as that of SELEX, except that the experiments are not done completely in vitro (in test tubes) and the genotypes and phenotypes are carried by phage genes and proteins, respectively.

To perform traditional SELEX experiments (Conrad, Giver, Tian, & Ellington, 1996), a library of nucleic acid ligands with more than $10^{13}$ different sequences is chemically synthesized. The nucleic acid ligands are incubated in test tubes with receptor molecules of a single species (such as a purified protein). The ligands that bind receptors are separated from free ligands by electrophoresis, affinity chromatography, or other means. The free ligands are discarded, and the receptor-binders are amplified by DNA polymerase in test tubes to form a new pool of ligands for the next round of selection. To make ligands compete for receptors, the molar ratio of receptors and total ligands is usually 1–10 (or higher). The processes of incubation, selection, and amplification are repeated until the desired receptor-binding ligands are enriched. It has been estimated that, in the initial library of ligands, the probability that a randomly picked nucleic acid binds the designated receptor tightly and specifically could be lower than $10^{-9}$ (Ellington & Szostak, 1990; Rosenwald, Kafri, & Lancet, 2002). However, as the incompetent nucleic acids are removed and the receptor-binders are amplified, the chance of finding receptor-binders increases exponentially in successive rounds of screening. Thus, the rare receptor-binding ligands in the first nucleic acid library evolve to become the prominent species in the later pools. Many studies have shown that after 5–12 rounds of selection and amplification the majority of ligands could bind receptors specifically and tightly (affinity constant $K$ equal to or higher than $10^{9}$ M$^{-1}$). The selected nucleic acid ligands, which are called aptamers, have many potential applications in biomedical sciences and biotechnology.

The significance of SELEX technology is that the simplicity and power of this extracellular Darwinian approach have inspired the development and application of combinatorial engineering approaches. In biological and biomedical sciences (for recent reviews, see Becker & Becker, 2006; Blank & Blind, 2005; Breaker, 2002; Burgstaller, Jenne, & Blind, 2002; Clark & Remcho, 2002; Famulok, Blind, & Mayer, 2001; James, 2001; Mandal & Breaker, 2004; Mayer & Jenne, 2004; Sen, 2002; Wilson & Szostak, 1999), SELEX experiments have been performed to look for aptamers that bind divalent metal cations, amino acids, antibiotics, peptides, organic dyes, bacterial spores, etc. Aptamers have the potential to replace antibodies in diagnosis using protein chips. In 2004, the US FDA approved the first aptamer drug to treat age-related macular degeneration (Doggrell, 2005). Other aptamer-based drugs are in clinical trials. SELEX has also been used to screen for novel nucleic acids that can catalyze chemical reactions (Breaker, 1997; Griffiths & Tawfik, 2000; Jaschke, 2001; Jaschke & Seelig, 2000; Joyce, 2004; Wilson & Szostak, 1999). In addition, similar techniques such as mRNA display and ribosome display were developed to screen for protein or peptide ligands (Amstutz, Forrer, Zahnd, & Pluckthun, 2001). Like aptamers, the peptide ligands have various applications.

In chemical sciences, combinatorial chemistry (Szostak, 1997) is becoming a popular approach to developing and engineering desired chemicals without prior knowledge. A library of chemicals is prepared by introducing functional groups at various positions on the scaffold of bare molecules. These chemicals are subjected to functional tests, and those with interesting properties are selected for further modification and testing. The ways of preparing the libraries of chemicals, performing the screening, and identifying the selected chemicals are being actively studied. The main goal is to rapidly discover potential drug leads for the pharmaceutical industry.

The concepts behind SELEX have also spread into mathematical and computer sciences. DNA and RNA have been used as the hardware of molecular computers (Gibbons, Amos, & Hodgson, 1997; Parker, 2003; Tabor & Ellington, 2003) to solve mathematical problems, such as the Hamiltonian path problem (Adleman, 1994), a chess problem (Faulhammer, Cukras, Lipton, & Landweber, 2000), the SAT problem (Yang & Yang, 2005), etc. The idea of SELEX has also been used to generate computer software. For example, the genetic algorithm (GA) and evolutionary computation were developed to solve optimization problems in which solutions are sought to optimize (maximize or minimize) given objective (fitness) functions. The GA, as described in the documentation of the Matlab software (MathWorks Inc., MA, USA), can be briefly restated as follows:

  • 1. The GA first creates a random initial library (generation) of potential solutions for the optimization problem.

  • 2. The GA then sequentially generates new libraries (generations) of potential solutions by performing the following steps:

    • i. Scores each member of the current library by computing its fitness value.

    • ii. Selects individuals (called parents) based on their fitness.

    • iii. Produces children from the parents: children are produced either by making random changes to a single parent (mutation) or by combining two parents by certain rules (crossover).

    • iv. Replaces the current population with the children to form the next generation.

    The above steps repeat and terminate when a stopping criterion is met. Members of the final library are (approximate) solutions for the optimization problem.

The GA has been applied to a variety of problems, such as synthesizing electric circuits and analyzing protein databases (Banzhaf, Nordin, Keller, & Francone, 1997; Koza et al., 2005).
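The GA loop restated above can be illustrated with a minimal Python sketch. It is not the Matlab implementation: the bit-string encoding, binary tournament selection, one-point crossover, bit-flip mutation, the fixed number of generations used as the stopping criterion, and all parameter values are assumptions chosen only to keep the example short.

```python
import random


def run_ga(fitness, n_bits=16, pop_size=40, generations=60,
           mutation_rate=0.02, seed=0):
    """Toy GA over fixed-length bit strings (illustrative assumptions only)."""
    rng = random.Random(seed)
    # Step 1: random initial library (generation) of candidate solutions.
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    history = [fitness(best)]
    # Stopping criterion (assumed): a fixed number of generations.
    for _ in range(generations):
        # Step 2(i): score each member of the current library.
        scores = [fitness(ind) for ind in population]

        # Step 2(ii): select a parent by binary tournament (assumed rule).
        def pick_parent():
            a, b = rng.randrange(pop_size), rng.randrange(pop_size)
            return population[a] if scores[a] >= scores[b] else population[b]

        # Step 2(iii): produce children by crossover followed by mutation.
        children = []
        for _ in range(pop_size):
            mother, father = pick_parent(), pick_parent()
            cut = rng.randrange(1, n_bits)          # one-point crossover
            child = mother[:cut] + father[cut:]
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]              # bit-flip mutation
            children.append(child)
        # Step 2(iv): the children replace the current population.
        population = children
        # Bookkeeping only: remember the best solution seen so far.
        best = max(population + [best], key=fitness)
        history.append(fitness(best))
    return best, history


# Usage: maximize the number of 1-bits (the "one-max" toy problem).
best, history = run_ga(sum)
```

Here `history` records the best fitness seen after each generation, so it can never decrease; the population itself is fully replaced each generation, as in step iv.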

We are interested in using complex SELEX technology to screen for aptamers of multiple receptors simultaneously. The idea of complex SELEX was first postulated by Larry Gold, and a preliminary study showed that aptamers of several proteins of red blood cells can be obtained by using complex SELEX (Morris, Jensen, Julin, Weil, & Gold, 1998). The same research group also used numerical studies (Vant-Hull, Payano-Baez, Davis, & Gold, 1998) to show that complex SELEX should be a feasible approach to simultaneously enriching aptamers of multiple receptors. However, there have been only limited reports (Daniels, Chen, Hicke, Swiderek, & Gold, 2003; Hicke et al., 2001) on complex SELEX since then, possibly because the effects of experimental variables are poorly understood. To understand the factors that may affect the results of complex SELEX experiments, we develop a new computer model with which we simulate the experiment against up to several hundred receptors. The goal is to provide practical guidance for conducting complex SELEX experiments in laboratories and to stimulate further advancement of combinatorial engineering technologies in other fields.

As the selection of aptamers is carried out through the binding of ligands ($\omega$) and receptors ($P$), one of the main components of developing SELEX simulations is formulating a computer model of the formation of ligand–receptor complexes ($\omega P$) in each round of screening. Since the experiment is done in a test tube (a closed system), the binding equilibrium ($\omega + P \rightleftharpoons \omega P$) is quantitatively characterized by the equilibrium equation $k(\omega)[\omega][P] = [\omega P]$, where $k(\omega)$ is the affinity constant of $\omega$ for $P$, and the square brackets denote molar concentrations. In building the computer model to simulate the single-receptor SELEX experiment, Irvine, Tuerk, and Gold (1991) subdivided the library into $N$ disjoint subpools $l_i^{tot}$, where $i = 1, \dots, N$, of ligands of similar affinities for the receptor. By representing the affinities of ligands from $l_i^{tot}$ with a constant $K_i$, Irvine et al. (1991) approximated the binding equilibrium of ligands and the receptor by $N$ equilibrium equations $K_i[l_i][P] = [l_iP]$. By setting $N = 10$, Irvine et al. (1991) showed that their simulations could reproduce ideal experiments (Fig. 1b), where all receptor-binding ligands, and only those, were amplified. They further studied by simulation the effects of contamination by false receptor-binders and imperfect recovery of ligand–receptor complexes in real experiments. In a real experiment, a certain amount of false receptor-binding ligands, e.g., nucleic acids that bind to the surface of the test tube or labware but not to receptor molecules, can be co-purified and amplified with true receptor-binders. Moreover, not all of the nucleic acids in ligand–receptor complexes are recovered and amplified, depending on the chosen methods and the skill with which the experiments are handled. Irvine et al. (1991) showed that, as long as the amount of false receptor-binders does not exceed that of true binders, aptamers can eventually be enriched in single-receptor SELEX experiments, regardless of the yield of recovered ligand–receptor complexes.
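A minimal Python sketch of the single-receptor subpool equilibrium follows. It combines the $N$ equations $K_i[l_i][P] = [l_iP]$ with mass conservation for each subpool and for the receptor, which gives $[l_iP] = K_i[P]\,l_i^{tot}/(1 + K_i[P])$ and a single monotone equation for the free receptor concentration. The bisection solver, the ideal amplification step, and every numerical value below (concentrations, initial distribution) are assumptions made for illustration, not values or code from Irvine et al. (1991).

```python
def free_receptor(K, L_tot, P_tot, iters=200):
    """Free receptor [P] solving [P] + sum_i [l_i P] = P_tot by bisection."""
    def excess(P):
        bound = sum(k * P * l / (1.0 + k * P) for k, l in zip(K, L_tot))
        return P + bound - P_tot

    lo, hi = 0.0, P_tot  # excess(0) < 0 <= excess(P_tot); excess is increasing
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def selex_round(K, L_tot, P_tot):
    """One ideal round: keep only bound ligands, amplify back to the old total."""
    P = free_receptor(K, L_tot, P_tot)
    bound = [k * P * l / (1.0 + k * P) for k, l in zip(K, L_tot)]
    scale = sum(L_tot) / sum(bound)
    return [b * scale for b in bound]


# Hypothetical setup: N = 10 subpools with K_i = 10**i per molar, an initial
# library in which each tighter subpool is ten times rarer, 1 uM total ligand
# and 0.1 uM receptor. All of these numbers are assumptions.
K = [10.0 ** i for i in range(1, 11)]
raw = [10.0 ** (-(i - 1)) for i in range(1, 11)]
L0 = [1e-6 * r / sum(raw) for r in raw]
L1 = selex_round(K, L0, P_tot=1e-7)
print("top-subpool fraction before:", L0[-1] / sum(L0))
print("top-subpool fraction after: ", L1[-1] / sum(L1))
```

Because the bound fraction $K_i[P]/(1 + K_i[P])$ increases with $K_i$, each ideal round raises the share of the tightest subpool. Contamination by false binders and imperfect recovery are deliberately left out of this sketch.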

To date, the simulation of complex SELEX has been attempted only by Vant-Hull et al. (1998). Generalizing the strategy of Irvine et al. (1991), Vant-Hull et al. (1998) approximated the binding equilibrium of ligands and the receptor mixture by dividing the library into $N$ affinity subpools for each of the $M$ individual receptors. This method yielded $N^M$ affinity subpools involved in the binding equilibrium equations (see Section 2). As the number of receptors increases, the total number of subpools, and with it the demand for computer memory, grows exponentially. Vant-Hull et al. (1998) could only simulate complex SELEX experiments against at most $M = 4$ receptors with $N = 10$. It is unclear whether their results (Vant-Hull et al., 1998) are applicable to real complex SELEX experiments, since mixtures of tens or hundreds of receptors are commonplace in biological and engineering sciences.
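The difference in scaling can be made concrete with a few lines of Python. The counts below rest on two assumptions: that the detailed model needs one subpool per combination of affinity classes across the $M$ receptors (that is, $N^M$), and that the condensed model needs $N$ characteristic-affinity subpools, each split into $M$ receptor-specific partial subpools (that is, $N \times M$). The function names are hypothetical.

```python
def detailed_subpools(n_classes, n_receptors):
    """Assumed count for the detailed model: one subpool per combination."""
    return n_classes ** n_receptors


def condensed_subpools(n_classes, n_receptors):
    """Assumed count for the condensed model: N subpools x M partial subpools."""
    return n_classes * n_receptors


for m in (1, 2, 4, 10, 100):
    print(f"M={m:>3}  detailed={detailed_subpools(10, m):.3e}  "
          f"condensed={condensed_subpools(10, m)}")
```

With $N = 10$, four receptors already require 10,000 subpools under the first count but only 40 under the second, and 100 receptors require 1,000 under the second.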

In the present study, we develop a novel statistical model suitable for simulating complex SELEX against up to hundreds of receptors. We examine the effects of receptor numbers, receptor concentrations, the recovery of receptor-binders, and contamination by false receptor-binders on the results of complex SELEX experiments. The paper is organized as follows. Section 2 presents the development of our new computer model of complex SELEX. Section 3 presents the computational methods. The results of simulations are in Section 4. This is followed by remarks in Section 5. Note also that in the following sections, we use the term “aptamers” to mean the nucleic acids that bind receptors tightly ($K \ge 10^{9}$ M$^{-1}$). Symbols used in this paper are summarized in the following table.

Preliminary

We first describe the framework of modeling complex SELEX by giving the underlying hypothesis. Let $P_1, \dots, P_M$ represent $M$ different receptors. Assume that the binding stoichiometry between ligands and receptors is 1-to-1. For a ligand $\omega$ from the ligand library, the affinity $k_j(\omega)$ of $\omega$ for $P_j$ is defined by the relationship $k_j(\omega)[\omega][P_j] = [\omega P_j]$, where $[\omega]$, $[P_j]$ and $[\omega P_j]$ are the molar concentrations of unbound ligand $\omega$, unbound receptor $P_j$ and ligand–receptor complex $\omega P_j$ at the binding equilibrium.
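As a worked illustration of this definition, the mass-balance relations implied by 1-to-1 binding in a closed system can be written out, together with the closed-form solution for the simplest case of one ligand species and one receptor. The subscript "tot" for total concentrations and the shorthand $x$, $b$ are notation introduced here for the sketch, not symbols taken from the paper.

```latex
\begin{align}
  % Mass balance for a ligand and for receptor P_j (1:1 stoichiometry)
  [\omega]_{tot} &= [\omega] + \sum_{j=1}^{M} [\omega P_j]
                  = [\omega]\Bigl(1 + \sum_{j=1}^{M} k_j(\omega)[P_j]\Bigr), \\
  [P_j]_{tot}    &= [P_j] + \sum_{\omega} [\omega P_j]
                  = [P_j]\Bigl(1 + \sum_{\omega} k_j(\omega)[\omega]\Bigr). \\
  % Special case: one ligand species, one receptor, x = [\omega P]
  k\,\bigl([\omega]_{tot} - x\bigr)\bigl([P]_{tot} - x\bigr) &= x, \\
  x &= \frac{b - \sqrt{b^{2} - 4[\omega]_{tot}[P]_{tot}}}{2},
  \qquad b = [\omega]_{tot} + [P]_{tot} + \frac{1}{k}.
\end{align}
```

The smaller root is taken because $x$ cannot exceed either total concentration. For example, with $[\omega]_{tot} = [P]_{tot} = 10^{-6}$ M and $k = 10^{9}$ M$^{-1}$, $x \approx 9.69 \times 10^{-7}$ M, so about 97% of the ligand is bound. With many ligands and receptors the first two relations couple every unknown to every other, which is why they must be solved numerically.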

Simulation of initial ligand library

The numerical simulation of complex SELEX begins with computing $p_i$, $p_{ij}$ of the initial ligand library under the assumed probability distributions of $k_j$. In the present work, by using gridpoints $0 = \xi_0 < \xi_1 = 10^{1.5} < \dots < \xi_9 = 10^{9.5} < \xi_{10} = \infty$, we discretize the library into $N = 10$ subpools with constant characteristic affinities $10^{1}, \dots, 10^{10}$ M$^{-1}$ for the receptor mixture ($K_i = 10^{i}$ M$^{-1}$ with $i = 1, \dots, 10$). Ligands from the subpool $L_{10}^{tot} = \{\omega : K(\omega) > 10^{9.5}\}$ are aptamers. According to Vant-Hull et al. (1998) and Rosenwald et al.
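The discretization by gridpoints can be sketched in Python as follows. The function names are hypothetical, and the boundary convention (a ligand whose affinity falls exactly on a gridpoint goes to the lower subpool) is an assumption chosen to agree with the strict inequality $K(\omega) > 10^{9.5}$ that defines the aptamer subpool.

```python
import bisect
import math

# Interior gridpoints xi_1 .. xi_9 on a log10 scale: 1.5, 2.5, ..., 9.5.
LOG_GRID = [i + 0.5 for i in range(1, 10)]


def subpool_index(affinity):
    """Return i in 1..10 with xi_(i-1) < affinity <= xi_i (assumed convention)."""
    if affinity <= 0.0:
        raise ValueError("affinity must be positive")
    return bisect.bisect_left(LOG_GRID, math.log10(affinity)) + 1


def characteristic_affinity(i):
    """Characteristic affinity K_i = 10**i (per molar) of subpool i."""
    return 10.0 ** i


for k in (20.0, 1e5, 3e9, 4e9):
    i = subpool_index(k)
    print(f"K = {k:.1e} -> subpool {i}, K_i = {characteristic_affinity(i):.0e}")
```

Under this convention only affinities above $10^{9.5} \approx 3.16 \times 10^{9}$ M$^{-1}$ land in subpool 10, so a ligand with $K = 4 \times 10^{9}$ M$^{-1}$ counts as an aptamer while one with $K = 3 \times 10^{9}$ M$^{-1}$ does not.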

The qualitative soundness of the condensed subpooling method—simulating complex SELEX against two receptors

We begin by simulating a complex SELEX experiment against two receptors $P_1$ and $P_2$, where the frequency of $P_1$ aptamers in the initial ligand library is 1000 times that of $P_2$ aptamers. To evaluate the quality of our simulation, we focus on the progress of ligand evolution. We plot for $j = 1, 2$ the simulated concentrations of binders to $P_j$ in $L_i^{tot}$, i.e., $[L_{ij}^{tot}]$, where $i = 1, \dots, 10$, at the 0th, 10th, and 20th rounds of screening in the upper panels of Fig. 3. For the purpose of comparison, we

Discussion

Studies of complex SELEX experiments are interesting because complex SELEX recapitulates the basics of the evolution of interacting individuals (e.g., molecules) from heterogeneous populations in nature and engineering sciences. In spite of the appeal, such studies are challenging, mainly owing to limited knowledge of the effects of experimental variables and the difficulty of determining optimal experimental conditions. To solve the above problems, we developed a condensed statistical model

Acknowledgements

We are thankful to Drs. Larry Gold and Dom Zichi of SomaLogic Inc. and the anonymous reviewers for valuable comments on our work. This research was supported partly by grants from the National Science Council, Taiwan (NSC 93-2115-M-005-007 to CKC, NSC 94-2115-M-005-004 to CKC, NSC 95-2118-M-005-004 to CKC, and NSC 94-2218-E038-001 to TCK), Taipei Medical University (TMU94-AE1-B05 to TCK), and the Industrial Technology Research Institute (A1110VDD20 to TCK).

References (41)

  • W. James. Nucleic acid and polypeptide aptamers: A powerful approach to ligand discovery. Current Opinion in Pharmacology (2001)
  • A. Jaschke. Artificial ribozymes and deoxyribozymes. Current Opinion in Structural Biology (2001)
  • A. Jaschke et al. Evolution of DNA and RNA as catalysts for chemical reactions. Current Opinion in Chemical Biology (2000)
  • S. Rosenwald et al. Test of a statistical model for molecular recognition in biological repertoires. Journal of Theoretical Biology (2002)
  • D. Sen. Aptamer rivalry. Chemistry and Biology (2002)
  • B. Vant-Hull et al. The mathematics of SELEX against complex targets. Journal of Molecular Biology (1998)
  • C.N. Yang et al. A DNA solution of SAT problem by a modified sticker model. Biosystems (2005)
  • A.J. Zaug et al. In vitro splicing of the ribosomal RNA precursor in nuclei of Tetrahymena. Cell (1980)
  • L.M. Adleman. Molecular computation of solutions to combinatorial problems. Science (1994)
  • W. Banzhaf et al. Genetic programming: An introduction on the automatic evolution of computer programs and its applications (1997)
    1

    These two authors contributed equally to this work.
