Simulations of SELEX against complex receptors with a condensed statistical model

doi:10.1016/j.compchemeng.2006.08.015

Computers & Chemical Engineering

Volume 31, Issue 9, September 2007, Pages 1007-1019

https://doi.org/10.1016/j.compchemeng.2006.08.015

Abstract

Systematic evolution of ligands by exponential enrichment (SELEX) is an in vitro combinatorial engineering approach that enriches aptamers from a library of nucleic acid ligands by iterative extraction and amplification of receptor-bound ligands. Aptamers are the selected nucleic acid ligands with high receptor-binding affinity. Typically, they are obtained in single-receptor SELEX experiments where the ligand library is incubated with receptor molecules of a single identity, e.g., a purified protein. For years, aptamers have been shown to be valuable for biomedical applications and research. To further explore the power of SELEX technology, the idea of complex SELEX was proposed to obtain multiple aptamers by incubating the ligand library with multiple species of receptors. However, reports on complex SELEX have been few, possibly because the effects of experimental variables are poorly understood. Computer simulations should be useful for addressing this problem. A major task in simulating complex SELEX is to solve the interdependent binding equilibrium equations for binding events among heterogeneous ligands and receptors. Although a detailed subpooling model has been developed, it can simulate complex SELEX against at most four receptor species, because its demand for computer memory grows exponentially with the number of receptor species. Here we develop a novel, condensed subpooling model where ligands of similar characteristic affinity are first pooled together regardless of receptor specificity, and then divided receptor-specifically into partial subpools. With this model, the memory requirement grows only linearly with the number of receptors. In the simulation of SELEX against four receptors, our results are the same as or very similar to those of earlier work. We have further simulated SELEX against 100 heterogeneous receptors. We suggest that our computation method can be applied to other research fields where binding events between heterogeneous ligands and receptors are involved.

Introduction

Engineering through iterative selection of specific individuals from a population of variants is the basis of the combinatorial engineering approach. It has been applied by nature and, more recently, in various fields of science and technology. A simple example is the evolution of biological traits and functionalities of organisms through the mechanism of natural selection. Another case is the purification of hydrocarbon compounds from crude oil by fractional distillation columns in oil refineries. Although the phenomena had been understood for some time, the widespread and advanced implementation of combinatorial approaches did not occur until the invention in 1990 of SELEX (systematic evolution of ligands by exponential enrichment; Tuerk & Gold, 1990), also known as in vitro selection (Ellington & Szostak, 1990).

SELEX is an in vitro experimental protocol to screen for and engineer desired ligands from a library of single-stranded nucleic acids, either RNA or single-stranded DNA (ssDNA) (Fig. 1). Without going into the historical background, it is fair to say that SELEX technology combines the discovery of the dual roles of single-stranded nucleic acids with the idea of phage-display experiments. In the 1980s it was found that, in addition to serving as genetic material, single-stranded nucleic acids could also form unique tertiary structures to catalyze chemical reactions (Guerrier-Takada, Gardiner, Marsh, Pace, & Altman, 1983; Zaug & Cech, 1980). That is, single-stranded nucleic acids can carry both genotypes and phenotypes at the same time. The concept of phage-display technology (Smith, 1985) is the same as that of SELEX, except that the experiments are not done completely in vitro (in test tubes) and the genotypes and phenotypes are carried by phage genes and proteins, respectively.

To perform traditional SELEX experiments (Conrad, Giver, Tian, & Ellington, 1996), a library of nucleic acid ligands with more than 10¹³ different sequences is chemically synthesized. The nucleic acid ligands are incubated in test tubes with receptor molecules of a single species (such as a purified protein). The ligands that bind receptors are separated from free ligands by electrophoresis, affinity chromatography or other means. The free ligands are discarded, and the receptor-binders are amplified by DNA polymerase in test tubes to form a new pool of ligands for the next round of selection. To make ligands compete for receptors, the molar ratio of receptors and total ligands is usually 1–10 (or higher). The processes of incubation, selection, and amplification are repeated until the desired receptor-binding ligands are enriched. It has been estimated that, in the initial library of ligands, the probability that a randomly picked nucleic acid binds the designated receptor tightly and specifically could be lower than 10⁻⁹ (Ellington & Szostak, 1990; Rosenwald, Kafri, & Lancet, 2002). However, as the incompetent nucleic acids are removed and the receptor-binders are amplified, the chance of finding receptor-binders increases exponentially over the successive rounds of screening. Thus, the rare receptor-binding ligands in the first nucleic acid library evolve to become the prominent species in the later pools. Many studies have shown that after 5–12 rounds of selection and amplification the majority of ligands bind receptors specifically and tightly (affinity constant K equal to or higher than 10⁹ M⁻¹). The selected nucleic acid ligands, which are called aptamers, have many potential applications in biomedical sciences and biotechnology.

The significance of SELEX technology is that the simplicity and power of the extracellular Darwinian approach have inspired the development and application of combinatorial engineering approaches. In biological and biomedical sciences (for recent reviews, see Becker & Becker, 2006; Blank & Blind, 2005; Breaker, 2002; Burgstaller, Jenne, & Blind, 2002; Clark & Remcho, 2002; Famulok, Blind, & Mayer, 2001; James, 2001; Mandal & Breaker, 2004; Mayer & Jenne, 2004; Sen, 2002; Wilson & Szostak, 1999), SELEX experiments have been performed to look for aptamers that bind divalent metal cations, amino acids, antibiotics, peptides, organic dyes, bacterial spores, etc. Aptamers have the potential to replace antibodies in diagnosis using protein chips. In 2004, the US FDA approved the first aptamer drug to treat age-related macular degeneration (Doggrell, 2005). Other aptamer-based drugs are in clinical trials. SELEX has also been used to screen for novel nucleic acids that can catalyze chemical reactions (Breaker, 1997; Griffiths & Tawfik, 2000; Jaschke, 2001; Jaschke & Seelig, 2000; Joyce, 2004; Wilson & Szostak, 1999). In addition, similar techniques such as mRNA display and ribosome display were developed to screen for protein or peptide ligands (Amstutz, Forrer, Zahnd, & Pluckthun, 2001). Like aptamers, the peptide ligands have various applications.

In the chemical sciences, combinatorial chemistry (Szostak, 1997) is becoming a popular approach to develop and engineer desired chemicals without prior knowledge. A library of chemicals is prepared by introducing functional groups at various positions on the scaffold of bare molecules. These chemicals are subjected to functional tests, and those with interesting properties are selected for further modifications and tests. The ways of preparing the libraries of chemicals, performing the screening and identifying the selected chemicals are being actively studied. The main goal is to rapidly discover potential drug leads for the pharmaceutical industry.

The concepts behind SELEX have also spread into mathematics and computer science. DNA and RNA have been used as the hardware of molecular computers (Gibbons, Amos, & Hodgson, 1997; Parker, 2003; Tabor & Ellington, 2003) to solve mathematical problems, such as the Hamiltonian path problem (Adleman, 1994), a chess problem (Faulhammer, Cukras, Lipton, & Landweber, 2000), the SAT problem (Yang & Yang, 2005), etc. The idea of SELEX has also been used to generate computer software. For example, genetic algorithms (GA) and evolutionary computation were developed to solve optimization problems in which solutions are sought to optimize (maximize or minimize) given objective (fitness) functions. For GA, the algorithm, as described in the documentation of the Matlab software (MathWorks Inc., MA, USA), can be briefly restated as follows:

1. The GA first creates a random initial library (generation) of potential solutions for the optimization problem.
2. The GA then sequentially generates new libraries (generations) of potential solutions by performing the following steps:
   i. Scores each member of the current library by computing its fitness value.
   ii. Selects individuals (called parents) based on their fitness.
   iii. Produces children from the parents: children are produced either by making random changes to a single parent (mutation) or by combining two parents by certain rules (crossover).
   iv. Replaces the current population with the children to form the next generation.

The above steps repeat and terminate when a stopping criterion is met. Members of the final library are (approximate) solutions for the optimization problem.

The GA has been applied to a variety of problems, such as synthesizing electric circuits and analyzing protein databases (Banzhaf, Nordin, Keller, & Francone, 1997; Koza et al., 2005).
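The GA steps listed above can be sketched in a few lines of Python. This is only an illustrative sketch, not the Matlab implementation: the bit-string encoding, the truncation selection (keeping the fitter half as parents), the single-point crossover, the fixed generation count as stopping criterion, and all names and default parameters (`run_ga`, `n_genes`, `pop_size`, `mutation_rate`) are assumptions made for the example.

```python
import random


def run_ga(fitness, n_genes=8, pop_size=30, generations=40,
           mutation_rate=0.1, seed=0):
    """Maximize `fitness` over bit strings of length `n_genes`.

    Returns the fittest member of the final library.
    """
    rng = random.Random(seed)

    # Step 1: create a random initial library (generation).
    population = [[rng.randint(0, 1) for _ in range(n_genes)]
                  for _ in range(pop_size)]

    # Stopping criterion (assumed): a fixed number of generations.
    for _ in range(generations):
        # Step 2.i: score each member by its fitness value.
        ranked = sorted(population, key=fitness, reverse=True)

        # Step 2.ii: select parents (assumed rule: keep the fitter half).
        parents = ranked[: pop_size // 2]

        # Step 2.iii: produce children by mutation or by crossover.
        children = []
        while len(children) < pop_size:
            if rng.random() < 0.5:
                # Mutation: random bit flips applied to a single parent.
                parent = rng.choice(parents)
                child = [1 - g if rng.random() < mutation_rate else g
                         for g in parent]
            else:
                # Crossover: single-point recombination of two parents.
                a, b = rng.sample(parents, 2)
                cut = rng.randrange(1, n_genes)
                child = a[:cut] + b[cut:]
            children.append(child)

        # Step 2.iv: the children replace the current population.
        population = children

    return max(population, key=fitness)


# Toy objective: maximize the number of 1-bits in the string.
best = run_ga(fitness=sum)
print(best, sum(best))
```

The analogy with SELEX is direct: scoring and selection play the role of receptor binding and partitioning, and producing children plays the role of amplification.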

We are interested in using complex SELEX technology to screen for aptamers of multiple receptors simultaneously. The idea of complex SELEX was first postulated by Larry Gold, and a preliminary study showed that aptamers of several proteins of red blood cells can be obtained by complex SELEX (Morris, Jensen, Julin, Weil, & Gold, 1998). The same research group also used numerical studies (Vant-Hull, Payano-Baez, Davis, & Gold, 1998) to show that complex SELEX should be a feasible approach to simultaneously enrich aptamers of multiple receptors. However, there have been only limited reports (Daniels, Chen, Hicke, Swiderek, & Gold, 2003; Hicke et al., 2001) on complex SELEX since then, possibly because the effects of experimental variables are not well understood. To understand the factors that may affect the results of complex SELEX experiments, we develop a new computer model with which we simulate the experiment against up to several hundred receptors. The goal is to provide practical guidance for conducting complex SELEX experiments in laboratories, and to stimulate further advancement of combinatorial engineering technologies in other fields.

As the selection of aptamers is carried out through the binding of ligands (ω) to receptors (P), one of the main components of a SELEX simulation is a computer model of the formation of ligand–receptor complexes (ωP) in each round of screening. Since the experiment is done in a test tube (a closed system), the binding equilibrium (ω + P ⇄ ωP) is quantitatively characterized by the equilibrium equation k(ω)[ω][P] = [ωP], where k(ω) is the affinity constant of ω for P, and the square brackets denote molar concentrations. In building the computer model to simulate the single-receptor SELEX experiment, Irvine, Tuerk, and Gold (1991) subdivided the library into $N$ disjoint subpools $l_{i}^{tot}$, where i = 1, …, N, of ligands of similar affinities for the receptor. By approximating the affinities of ligands from $l_{i}^{tot}$ with a constant K_i, Irvine et al. (1991) approximated the binding equilibrium of ligands and the receptor by N equilibrium equations K_i[l_i][P] = [l_iP]. By setting N = 10, Irvine et al. (1991) showed that their simulations could reproduce ideal experiments (Fig. 1b), where all receptor-binding ligands, and only those, were amplified. They further studied by simulation the effects of contamination by false receptor-binders and imperfect recovery of ligand–receptor complexes in real experiments. In a real experiment, a certain amount of false receptor-binding ligands, e.g., nucleic acids that bind to the surface of the test tube or labware but not to receptor molecules, can be co-purified and amplified with true receptor-binders. On the other hand, not all of the nucleic acids in ligand–receptor complexes are recovered and amplified, depending on the chosen methods and the skill with which the experiments are handled. Irvine et al. (1991) showed that, as long as the amount of false receptor-binders does not exceed that of true binders, aptamers can eventually be enriched in single-receptor SELEX experiments, regardless of the yield of recovered ligand–receptor complexes.
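To make the subpool equilibrium concrete, the following sketch solves the N equations K_i[l_i][P] = [l_iP] for the single-receptor case and applies one ideal round of selection. It is not the code of Irvine et al. (1991): it assumes the usual mass balances ([l_i]_tot = [l_i] + [l_iP] and [P]_tot = [P] + Σ_i [l_iP]), uses bisection as a simple solver, and assumes that amplification restores the original total ligand concentration. The function names and all concentrations in the example are hypothetical.

```python
def bound_per_subpool(K, l_tot, p_free):
    """[l_i P] for each subpool, from K_i [l_i][P] = [l_i P] and
    the ligand mass balance [l_i]_tot = [l_i] + [l_i P]."""
    return [l * k * p_free / (1.0 + k * p_free) for k, l in zip(K, l_tot)]


def free_receptor(K, l_tot, p_tot, iters=200):
    """Solve the receptor mass balance [P]_tot = [P] + sum_i [l_i P]
    for the free receptor concentration [P] by bisection."""
    lo, hi = 0.0, p_tot
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid + sum(bound_per_subpool(K, l_tot, mid)) > p_tot:
            hi = mid  # too much receptor accounted for: lower [P]
        else:
            lo = mid
    return 0.5 * (lo + hi)


def ideal_round(K, l_tot, p_tot):
    """One ideal round: all bound ligands, and only those, are recovered
    and amplified back to the original total ligand concentration
    (the rescaling rule is an assumption of this sketch)."""
    p_free = free_receptor(K, l_tot, p_tot)
    bound = bound_per_subpool(K, l_tot, p_free)
    scale = sum(l_tot) / sum(bound)
    return [b * scale for b in bound]


# Hypothetical example: N = 10 subpools with K_i = 10^i M^-1, a library in
# which each tenfold gain in affinity is tenfold rarer, 1 uM total ligand
# and 0.1 uM total receptor.
K = [10.0 ** i for i in range(1, 11)]
weights = [10.0 ** -(i - 1) for i in range(1, 11)]
pool = [1e-6 * w / sum(weights) for w in weights]
p_tot = 1e-7

for round_no in range(1, 6):
    pool = ideal_round(K, pool, p_tot)
    print(round_no, pool[-1] / sum(pool))  # fraction in the K = 10^10 subpool
```

Because the recovered fraction K_i[P]/(1 + K_i[P]) increases with K_i, the share of the highest-affinity subpool grows in every ideal round.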

The simulation of complex SELEX has previously been attempted only by Vant-Hull et al. (1998). Generalizing the strategy of Irvine et al. (1991), Vant-Hull et al. (1998) approximated the binding equilibrium of ligands and the receptor mixture by dividing the library into N affinity subpools for each of the $M$ individual receptors. This method yields $N^{M}$ affinity subpools in the binding equilibrium equations (see Section 2). As the number of receptors increases, the total number of subpools, and with it the demand for computer memory, grows exponentially. Vant-Hull et al. (1998) could only simulate complex SELEX experiments against at most M = 4 receptors with N = 10. It is unclear whether their results (Vant-Hull et al., 1998) are applicable to real complex SELEX experiments, where mixtures of tens or hundreds of receptors are commonplace in biological and engineering sciences.
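The difference in scaling can be checked with a short calculation. The detailed count N^M is stated above; the condensed count of roughly N·M partial subpools is our reading of the abstract's statement that memory grows linearly with the number of receptors, and should be taken as an assumption. The function names are hypothetical.

```python
def detailed_subpools(n, m):
    """Subpools in the detailed model: one of n affinity classes
    for each of m receptors, i.e. n ** m combinations."""
    return n ** m


def condensed_partial_subpools(n, m):
    """Assumed count for the condensed model: n affinity subpools,
    each split into m receptor-specific partial subpools."""
    return n * m


for m in (1, 2, 4, 100):
    print(m, detailed_subpools(10, m), condensed_partial_subpools(10, m))
```

With N = 10, four receptors already require 10,000 detailed subpools, whereas 100 receptors would require 10^100, far beyond any computer memory.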

In the present study, we develop a novel statistical model suitable for simulating complex SELEX against up to hundreds of receptors. We examine the effects of receptor numbers, receptor concentrations, the recovery of receptor-binders, and contamination by false receptor-binders on the results of complex SELEX experiments. The paper is organized as follows. Section 2 presents the development of our new computer model of complex SELEX. Section 3 presents the computational methods. The results of simulations are in Section 4. This is followed by remarks in Section 5. Note that in the following sections, we use the term “aptamers” to mean the nucleic acids that bind receptors tightly (K ≥ 10⁹ M⁻¹). Symbols used in this paper are summarized in the following table.

Section snippets

Preliminary

We first describe the framework of modeling complex SELEX by giving the underlying hypothesis. Let P₁, …, P_M represent M different receptors. Assume that the binding stoichiometry between ligands and receptors is 1-to-1. For a ligand ω from the ligand library, the affinity k_j(ω) of ω for P_j is defined by the relationship k_j(ω)[ω][P_j] = [ωP_j], where [ω], [P_j] and [ωP_j] are the molar concentrations of unbound ligand ω, unbound receptor P_j and ligand–receptor complex ωP_j at the binding equilibrium.
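As a small worked example of this definition, the sketch below computes how a single ligand species distributes over M receptors at equilibrium. With 1-to-1 stoichiometry the ligand mass balance is [ω]_tot = [ω](1 + Σ_j k_j(ω)[P_j]), so the fraction bound to P_j is k_j(ω)[P_j] / (1 + Σ_m k_m(ω)[P_m]). The free receptor concentrations are taken as given inputs here; in a full simulation they would come from solving the coupled equilibrium equations. The function name and the numbers are hypothetical.

```python
def bound_fractions(k, p_free):
    """Fraction of a ligand bound to each receptor P_j at equilibrium.

    k[j] is the affinity k_j of the ligand for P_j (M^-1) and p_free[j]
    is the free concentration [P_j] (M). Assumes 1-to-1 binding, so a
    ligand molecule is either free or bound to exactly one receptor.
    """
    denom = 1.0 + sum(kj * pj for kj, pj in zip(k, p_free))
    return [kj * pj / denom for kj, pj in zip(k, p_free)]


# A ligand that binds P_1 tightly and P_2 weakly, at equal free
# receptor concentrations of 10 nM (illustrative values only).
fractions = bound_fractions(k=[1e9, 1e6], p_free=[1e-8, 1e-8])
print(fractions, 1.0 - sum(fractions))  # bound to P_1, P_2, and free
```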

Simulation of initial ligand library

The numerical simulation of complex SELEX begins with computing p_i, p_ij of the initial ligand library under the assumed probability distributions of $k_{j}$. In the present work, by using gridpoints 0 = ξ₀ < ξ₁ = 10^1.5 < ⋯ < ξ₉ = 10^9.5 < ξ₁₀ = ∞, we discretize the library into N = 10 subpools with constant characteristic affinities 10¹, …, 10¹⁰ M⁻¹ for the receptor mixture (K_i = 10ⁱ M⁻¹ with i = 1, …, 10). Ligands from the subpool $L_{10}^{tot} = \{ω : K(ω) > 10^{9.5}\}$ are aptamers. According to Vant-Hull et al. (1998) and Rosenwald et al.
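The discretization by gridpoints can be illustrated with a short sketch that maps a characteristic affinity K(ω) to its subpool index. It assumes half-open bins (ξ_{i−1}, ξ_i], which is consistent with the definition of $L_{10}^{tot}$ above but is otherwise our choice; the names `EDGES`, `subpool_index` and `is_aptamer` are hypothetical, and no affinity distribution is assumed.

```python
from bisect import bisect_left

N = 10
# Interior gridpoints xi_1 = 10^1.5, ..., xi_9 = 10^9.5 (in M^-1);
# xi_0 = 0 and xi_10 = infinity are implicit.
EDGES = [10.0 ** (i + 0.5) for i in range(1, N)]


def subpool_index(k):
    """Index i (1..N) of the subpool (xi_{i-1}, xi_i] containing k."""
    return bisect_left(EDGES, k) + 1


def is_aptamer(k):
    """True if k falls in the top subpool, i.e. k > 10^9.5 M^-1."""
    return subpool_index(k) == N


for k in (5.0, 50.0, 1e9, 1e10):
    i = subpool_index(k)
    print(f"K = {k:g} M^-1 -> subpool {i}, characteristic affinity 10^{i} M^-1")
```

Each bin is centred, on a logarithmic scale, on its characteristic affinity K_i = 10ⁱ M⁻¹, except for the two open-ended bins at the extremes.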

The qualitative soundness of the condensed subpooling method—simulating complex SELEX against two receptors

We begin by simulating a complex SELEX experiment against two receptors P₁ and P₂, where the frequency of P₁ aptamers in the initial ligand library is 1000 times that of P₂ aptamers. To evaluate the quality of our simulation, we focus on the progress of ligand evolution. We plot for j = 1, 2 the simulated concentrations of binders to P_j in $L_{i}^{tot}$, i.e., $[L_{i j}^{tot}]$, where i = 1, …, 10, at the 0th, 10th, and 20th rounds of screening in the upper panels of Fig. 3. For the purpose of comparison, we

Discussion

Studies of complex SELEX experiments are interesting because complex SELEX recapitulates the basics of the evolution of interacting individuals (e.g., molecules) from heterogeneous populations in nature and engineering sciences. In spite of the appeal, such studies are challenging, mainly because the effects of experimental variables are poorly understood and optimal experimental conditions are difficult to determine. To solve the above problems, we developed a condensed statistical model

Acknowledgements

We are thankful to Drs. Larry Gold and Dom Zichi of SomaLogic Inc. and the anonymous reviewers for valuable comments on our work. This research was supported partly by grants from the National Science Council, Taiwan (NSC 93-2115-M-005-007 to CKC, NSC 94-2115-M-005-004 to CKC, NSC 95-2118-M-005-004 to CKC, and NSC 94-2218-E038-001 to TCK), Taipei Medical University (TMU94-AE1-B05 to TCK), and the Industrial Technology Research Institute (A1110VDD20 to TCK).

References (41)

P. Amstutz et al. (2001). In vitro display technologies: Novel developments and applications. Current Opinion in Biotechnology.
M. Blank et al. (2005). Aptamers as tools for target validation. Current Opinion in Chemical Biology.
R.R. Breaker (2002). Engineered allosteric ribozymes as biosensor components. Current Opinion in Biotechnology.
R.C. Conrad et al. (1996). In vitro selection of nucleic acid aptamers that bind proteins. Methods in Enzymology.
M. Famulok et al. (2001). Intramers as promising new tools in functional proteomics. Chemistry and Biology.
A. Gibbons et al. (1997). DNA computing. Current Opinion in Biotechnology.
A.D. Griffiths et al. (2000). Man-made enzymes from design to in vitro compartmentalisation. Current Opinion in Biotechnology.
C. Guerrier-Takada et al. (1983). The RNA moiety of ribonuclease P is the catalytic subunit of the enzyme. Cell.
B.J. Hicke et al. (2001). Tenascin-C aptamers are generated using tumor cells and purified protein. Journal of Biological Chemistry.
D. Irvine et al. (1991). SELEXION. Systematic evolution of ligands by exponential enrichment with integrated optimization by non-linear analysis. Journal of Molecular Biology.
W. James (2001). Nucleic acid and polypeptide aptamers: A powerful approach to ligand discovery. Current Opinion in Pharmacology.
A. Jaschke (2001). Artificial ribozymes and deoxyribozymes. Current Opinion in Structural Biology.
A. Jaschke et al. (2000). Evolution of DNA and RNA as catalysts for chemical reactions. Current Opinion in Chemical Biology.
S. Rosenwald et al. (2002). Test of a statistical model for molecular recognition in biological repertoires. Journal of Theoretical Biology.
D. Sen (2002). Aptamer rivalry. Chemistry and Biology.
B. Vant-Hull et al. (1998). The mathematics of SELEX against complex targets. Journal of Molecular Biology.
C.N. Yang et al. (2005). A DNA solution of SAT problem by a modified sticker model. Biosystems.
A.J. Zaug et al. (1980). In vitro splicing of the ribosomal RNA precursor in nuclei of Tetrahymena. Cell.
L.M. Adleman (1994). Molecular computation of solutions to combinatorial problems. Science.
W. Banzhaf et al. (1997). Genetic programming: An introduction on the automatic evolution of computer programs and its applications.

¹: These two authors contributed equally to this work.
