Solving satisfiability problems using a novel microarray-based DNA computer

doi:10.1016/j.biosystems.2006.08.009

Biosystems

Volume 90, Issue 1, July–August 2007, Pages 242-252

https://doi.org/10.1016/j.biosystems.2006.08.009

Abstract

An algorithm based on a modified sticker model, combined with an advanced MEMS-based microarray technology, is demonstrated to solve the SAT problem, which has long served as a benchmark in DNA computing. Conventional DNA computing algorithms need an initial data pool that covers both correct and incorrect answers and then execute a series of separation procedures to destroy the unwanted ones. In contrast, we build solutions in parts, satisfying one clause per step, and eventually solve the entire Boolean formula step by step. Neither time-consuming sample preparation procedures nor delicate sample-application equipment were required for the computing process. Moreover, experimental results show that the bound DNA sequences can withstand the chemical solutions used during the computing process, so the proposed method should be useful in dealing with large-scale problems.

Introduction

DNA computing uses molecular biology laboratory procedures to solve mathematical problems. The concept of DNA computing was first developed by Adleman in 1994 (Adleman, 1994), who solved an instance of the directed Hamiltonian path problem, in which it is necessary to determine a path through a seven-vertex directed graph, with designated start and end nodes, that visits every vertex exactly once. Inspired by Adleman's success, Lipton proposed a method for solving satisfiability (SAT) problems (Lipton, 1995), whose goal is to determine assignments of a set of Boolean variables, each taking the value “true” or “false”, such that the output of the whole Boolean formula is true (Cormen et al., 2001). The SAT instance solved by Lipton is $F = (x \lor y) \land (\bar{x} \lor \bar{y})$, where x and y are the Boolean variables, (x ∨ y) and $(\bar{x} \lor \bar{y})$ are two clauses, ∨ is the logical operation OR, ∧ is the logical operation AND, and $\bar{x}$ is the negation of x, which indicates that $\bar{x} = 0$ if x = 1, and $\bar{x} = 1$ if x = 0.

Of the various computational problems that have been successfully solved using Adleman's approach (Braich et al., 2002; Chen and Ramachandran, 2001; Diaz et al., 2001; Liu et al., 2000; Liu et al., 2002; Ouyang et al., 1997; Pirrung et al., 2000; Rozenberg and Spaink, 2003; Sakamoto et al., 2000; Schmidt et al., 2004; Su and Smith, 2004; Wang et al., 2000; Yoshida and Suyama, 1999), the SAT problem has emerged as a useful benchmark for exploring the feasibility of DNA-based algorithms and experiments. For example, Sakamoto et al. used autonomous molecular computing based on hairpin formation to solve an instance with six variables (Sakamoto et al., 2000); Liu et al. demonstrated the feasibility of DNA computing with surface-bound sequences, rather than a test-tube approach, by solving an instance with four variables (Liu et al., 2000); and Braich et al. employed a sticker model (Paun and Rozenberg, 1998) to solve an instance with 20 variables (Braich et al., 2002).

In general, DNA computing algorithms commence by constructing a data pool of all possible candidate solutions, expressed in the form of either single- or double-stranded DNA (ssDNA or dsDNA). The incorrect solutions are progressively eliminated from the pool by performing biological separation processes. The surviving sequences are then considered to represent the correct solutions of the problem. Adopting this approach, the n-variable SAT problem can be solved as follows: (1) create a data pool containing the 2ⁿ sequences corresponding to all of the possible variable assignments (a total of 2ⁿ possible solutions exist because each variable can be assigned only one of two values, “true” or “false”); (2) eliminate the incorrect variable assignments by applying logic restrictions; (3) collect and read out the surviving sequences. In the traditional computation model, this strategy corresponds to a complete binary decision tree of n + 1 levels in which every internal node has two branches leading to two children, one representing the true value and the other the false value of one variable. Each path from the root to a leaf consists of n edges and corresponds to one possible assignment of the n variables, as indicated in Fig. 1a for an instance with three variables.
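As a purely illustrative in-silico analogue of steps (1) to (3), the following Python sketch builds the full pool for Lipton's two-variable formula and filters it one clause at a time. The encoding of a literal as a `(variable, value)` pair and the function names `initial_pool`, `satisfies` and `filter_pool` are assumptions made for this sketch; they are not taken from the paper, and list filtering only stands in for the laboratory separation steps.

```python
from itertools import product

# Lipton's instance F = (x OR y) AND (NOT x OR NOT y).
# Hypothetical encoding: a literal is (variable, wanted_value).
FORMULA = [
    [("x", True), ("y", True)],
    [("x", False), ("y", False)],
]
VARIABLES = ["x", "y"]


def initial_pool(variables):
    """Step 1: enumerate all 2^n complete assignments."""
    return [dict(zip(variables, values))
            for values in product([True, False], repeat=len(variables))]


def satisfies(assignment, clause):
    """A clause holds if at least one of its literals holds."""
    return any(assignment[var] == wanted for var, wanted in clause)


def filter_pool(pool, formula):
    """Step 2: one separation per clause; also record the pool size after each."""
    sizes = [len(pool)]
    for clause in formula:
        pool = [a for a in pool if satisfies(a, clause)]
        sizes.append(len(pool))
    return pool, sizes


# Step 3: read out the survivors.
survivors, sizes = filter_pool(initial_pool(VARIABLES), FORMULA)
print(sizes)      # [4, 3, 2]
print(survivors)  # [{'x': True, 'y': False}, {'x': False, 'y': True}]
```

The pool size before filtering is always 2ⁿ, independent of the formula being solved.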

However, to find all true assignments or to eliminate all false assignments, we may not need to examine all 2ⁿ possible assignments. In our approach, we examine the input Boolean formula clause by clause, checking whether each clause can be falsified. For instance, consider $F = (x \lor \bar{y}) \land (\bar{x} \lor z) \land (\bar{y} \lor \bar{z})$. The decision tree is now transformed into an AND/OR tree. The AND/OR tree consists of m + 1 levels if m clauses are considered. The branches leaving level i correspond to the literals of clause i, where the level of the root is counted as 1. In other words, each node on level i has d_i branches, where d_i is the number of literals in clause i. In addition, the branches attached to a node perform the OR operation, and the path from the root to a leaf performs the AND operation. Fig. 1b shows the complete AND/OR tree for this instance. However, the tree can be trimmed by transforming conflicting branches and terminating falsified branches, as shown in Fig. 1c. When a new clause i is checked, the following two procedures are applied while expanding the tree branches: (1) transformation: if the value newly assigned to a variable x in clause i is complementary to the previously assigned one, the value of x stays unchanged, since x has already been assigned and cannot take two values; (2) termination: extract the values assigned to the variables on the path from the root to the current leaf, and check whether these assigned values satisfy at least one literal of clause i. If none of the literals of clause i is satisfied, the branch is terminated because it has become a falsified branch. Accordingly, in Fig. 1c, the leftmost node in level 3 is terminated because it falsifies the second clause $(\bar{x} \lor z)$, and the second node from the left in level 4 is terminated because it falsifies the third clause $(\bar{y} \lor \bar{z})$. The surviving leaf nodes in level 4 represent correct solutions to the formula; from left to right they are (x = true, y = false, z = true), (x = false, y = false, z = don’t care), (x = false, y = false, z = false), (x = don’t care, y = false, z = true), and (x = don’t care, y = false, z = true), respectively. In all, the satisfying assignments are $(x, \bar{y}, z)$, $(\bar{x}, \bar{y}, \bar{z})$, and $(\bar{x}, \bar{y}, z)$.

In the present work, we provide an implementation of the above approach to solving SAT problems, in which the requirement to construct an initial data pool is removed. Two instances of the SAT problem are considered. The first instance is the relatively straightforward 2-SAT problem (k-SAT denotes instances with k literals in each clause) $F = (x \lor \bar{y}) \land (\bar{x} \lor z) \land (\bar{y} \lor \bar{z})$, which is used to demonstrate the associated experimental procedures. The second instance, $F = (x \lor y \lor z) \land (\bar{x} \lor \bar{w} \lor z) \land (\bar{y} \lor \bar{z}) \land (w \lor \bar{x} \lor y) \land (\bar{y} \lor z)$, has four variables and five clauses. This problem is presented to verify the feasibility of using the proposed method to solve a more complicated instance. An advanced MEMS-based microarray technology is chosen as the platform because few DNA computing methods have been explored on surfaces, despite their high-throughput nature. The algorithm is based on our previously proposed modified sticker model (Yang and Yang, 2005), in the spirit of building solutions in steps that satisfy one clause at a time and eventually yield the solutions satisfying the entire Boolean formula. That is, computation begins with a blank data pool, which is capable of carrying added information but initially carries none. In solving a 2-SAT instance, the blank pool is first divided into two parts, and each part is assigned one literal of the first clause. Combining these parts completes the OR operation of the first clause. The combined, updated pool is again divided into two parts, and each part is assigned one literal of the second clause. Combining and dividing are repeated to carry the literal assignments of the remaining clauses. A probing procedure is required after every combined pool is formed to exclude members that falsify the clause under consideration. In the end, the surviving members represent the solution assignments. In the present study, the blank pool is realized as an array of immobilized DNA strands, whereas the literal information is carried by fluorescent-dye-labeled DNA strands complementary to the immobilized ones.
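The divide, assign, combine and probe cycle described above can be mimicked in software. The following Python sketch applies it to the four-variable instance; the function names (`divide`, `assign`, `combine`, `probe`, `solve`, `read_out`) and the modelling of a divided pool as copies of its members are assumptions for illustration only, and say nothing about how the array physically realizes each step.

```python
from itertools import product

# Four-variable, five-clause instance from the text.
# Hypothetical encoding: a literal is (variable, wanted_value).
FORMULA = [
    [("x", True), ("y", True), ("z", True)],
    [("x", False), ("w", False), ("z", True)],
    [("y", False), ("z", False)],
    [("w", True), ("x", False), ("y", True)],
    [("y", False), ("z", True)],
]
VARIABLES = ["w", "x", "y", "z"]


def divide(pool, parts):
    """Split the pool into identical parts (modelled here as copies)."""
    return [[dict(member) for member in pool] for _ in range(parts)]


def assign(part, literal):
    """Write a literal onto every member whose variable is still blank."""
    var, wanted = literal
    for member in part:
        member.setdefault(var, wanted)
    return part


def combine(parts):
    """Merge the parts back into one pool (the OR of the clause)."""
    return [member for part in parts for member in part]


def probe(pool, clause):
    """Keep only members that satisfy at least one literal of the clause."""
    return [m for m in pool if any(m.get(v) == w for v, w in clause)]


def solve(formula):
    pool = [{}]  # blank pool: one member carrying no information
    for clause in formula:
        parts = divide(pool, len(clause))
        parts = [assign(part, lit) for part, lit in zip(parts, clause)]
        pool = probe(combine(parts), clause)
    return pool


def read_out(pool, variables):
    """Expand blank (don't-care) variables into complete assignments."""
    solutions = set()
    for member in pool:
        free = [v for v in variables if v not in member]
        for values in product([True, False], repeat=len(free)):
            full = {**member, **dict(zip(free, values))}
            solutions.add(tuple(full[v] for v in variables))
    return solutions


solutions = read_out(solve(FORMULA), VARIABLES)
print(sorted(solutions))  # three (w, x, y, z) tuples, all with y False and z True
```

In this sketch the pool never holds all 2⁴ = 16 assignments at once; members are built up clause by clause, and the read-out agrees with an exhaustive check of the formula.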

Section snippets

Computation concept

To compute the 2-SAT problem $F = (x \lor \bar{y}) \land (\bar{x} \lor z) \land (\bar{y} \lor \bar{z})$ , a 6 × 4 matrix array (six rows, four columns) is fabricated on a glass plate. In this array, each set of vertically adjacent x-, y- and z-sites corresponds to a Boolean variable assignment, and hence the array can accommodate a maximum of eight different assignments. The relationship between a given Boolean formula and the necessary array size is characterized later in this study. In the computations, the three variables, i.e. x, y and z, are

Discussion

Although the current study uses a DNA array as a platform, the essential logic of the algorithm is based on the modified sticker model, which was originally designed for solution phase experiments (Yang and Yang, 2005). The original sticker model uses a memory strand to carry the information conveyed by stickers. In essence, the sticker model comprises a long single memory strand and a number of stickers. Each memory strand consists of l non-overlapping substrands, each of which has a length of

Summary

This study has presented a novel computing procedure for solving SAT problems utilizing a MEMS-based microarray technique. Although a fully automatic system to carry out the computing process has not been developed, the results of this study nevertheless confirm the feasibility of solving SAT problems on a solid surface using a modified sticker model. The proposed method has the advantage that as the size of the problem is scaled up, it is necessary only to linearly increase the variety of

Acknowledgement

The authors gratefully acknowledge the financial support provided to this study by the National Science Council of Taiwan under grant no. NSC 93-2113-M-214-001.

References (21)

C.R. Graham et al., Gene probe assays on a fibre-optic evanescent wave biosensor, Biosens. Bioelectron. (1992)
G. Paun et al., Sticker systems, Theor. Comput. Sci. (1998)
G. Rozenberg et al., DNA computing by blocking, Theor. Comput. Sci. (2003)
C.N. Yang et al., A DNA solution of SAT problem by a modified sticker model, Biosystems (2005)
L. Adleman, Molecular computation of solutions to combinatorial problems, Science (1994)
R.S. Braich et al., Solution of a 20-variable 3-SAT problem on a DNA computer, Science (2002)
S. Britland et al., Micropatterning proteins and synthetic peptides on solid supports: a novel application for microelectronics fabrication technology, Biotechnol. Progr. (1992)
Chen, K., Ramachandran, V., 2001. A space-efficient randomized DNA algorithm for k-SAT....
T.H. Cormen et al., Introduction to Algorithms (2001)
S. Diaz et al., A DNA-Based Random Walk Method for Solving k-SAT (2001)

There are more references available in the full text version of this article.


¹: Present address: No. 70, Lien Hai Road, Kaohsiung 804, Taiwan. Tel.: +886 7 5252000x4240; fax: +886 7 525 4299.

²: Present address: No. 70, Lien Hai Road, Kaohsiung 804, Taiwan. Tel.: +886 7 5252000x4333.
