Solving satisfiability problems using a novel microarray-based DNA computer
Introduction
DNA computing uses molecular biology laboratory procedures to solve mathematical problems. The concept was first demonstrated by Adleman (Adleman, 1994), who solved an instance of the directed Hamiltonian path problem, a close relative of the traveling salesperson problem (TSP), in which it is necessary to find a path through a seven-vertex directed graph with designated start and end nodes that visits every vertex exactly once. Inspired by Adleman's success, Lipton proposed a method for solving satisfiability (SAT) problems (Lipton, 1995), whose goal is to assign each of a set of Boolean variables a value of either “true” or “false” such that the whole Boolean formula evaluates to true (Cormen et al., 2001). The SAT instance solved by Lipton is (x ∨ y) ∧ (¬x ∨ ¬y), where x and y are the Boolean variables, (x ∨ y) and (¬x ∨ ¬y) are the two clauses, ∨ is the logical operation OR, ∧ is the logical operation AND, and ¬x denotes the negation of x, which indicates that ¬x = 0 if x = 1, and ¬x = 1 if x = 0.
Of the various computational problems that have been successfully solved using Adleman's approach (Braich et al., 2002; Chen and Ramachandran, 2001; Diaz et al., 2001; Liu et al., 2000; Liu et al., 2002; Ouyang et al., 1997; Pirrung et al., 2000; Rozenberg and Spaink, 2003; Sakamoto et al., 2000; Schmidt et al., 2004; Su and Smith, 2004; Wang et al., 2000; Yoshida and Suyama, 1999), the SAT problem has emerged as a useful benchmark for exploring the feasibility of DNA-based algorithms and experiments. For example, Sakamoto et al. used autonomous molecular computing based on hairpin formation to solve an instance with four variables (Sakamoto et al., 2000); Liu et al. demonstrated the feasibility of DNA computing with surface-bound sequences, rather than a test-tube approach, by solving an instance with six variables (Liu et al., 2000); and Braich et al. employed a sticker model (Paun and Rozenberg, 1998) to solve an instance with 20 variables (Braich et al., 2002).
In general, DNA computing algorithms commence by constructing a data pool of all possible candidate solutions, expressed in the form of either single- or double-stranded DNA (ssDNA or dsDNA). The incorrect solutions are progressively eliminated from the pool by performing biological separation processes. The surviving sequences are then considered to represent the correct solutions of the problem. Adopting this approach, the n-variable SAT problem can be solved as follows: (1) create a data pool containing the 2^n sequences corresponding to all of the possible variable assignments (a total of 2^n candidate solutions exist because each variable can take only one of the two values “true” or “false”); (2) eliminate the incorrect variable assignments by applying logic restrictions; (3) collect and read out the surviving sequences. In the traditional computation model, this strategy corresponds to a complete binary decision tree of n + 1 levels in which every internal node has two branches leading to two children, representing the true and false values of one variable. Each path from the root to a leaf consists of n edges and corresponds to one possible assignment of the n variables, as indicated in Fig. 1a for an instance with three variables.
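The three steps above can be sketched in software as a generate-and-filter search. This is only a minimal illustration of the conventional strategy, not the laboratory protocol: the function name `generate_and_filter`, the `(variable, value)` literal encoding and the two-variable example formula are assumptions introduced here.

```python
from itertools import product

# A literal is a pair (variable_name, required_value); a clause is a list of
# literals joined by OR; a formula is a list of clauses joined by AND.
# Illustrative formula (an assumption for this sketch): (x OR y) AND (NOT x OR NOT y)
EXAMPLE_CLAUSES = [
    [("x", True), ("y", True)],
    [("x", False), ("y", False)],
]


def generate_and_filter(variables, clauses):
    """Return every full assignment that satisfies all clauses."""
    # Step 1: build the complete pool of 2^n candidate assignments.
    pool = [
        dict(zip(variables, values))
        for values in product([True, False], repeat=len(variables))
    ]
    # Step 2: for each clause, discard the candidates that falsify it.
    for clause in clauses:
        pool = [
            candidate
            for candidate in pool
            if any(candidate[var] == value for var, value in clause)
        ]
    # Step 3: the survivors are the satisfying assignments.
    return pool


solutions = generate_and_filter(["x", "y"], EXAMPLE_CLAUSES)
```

The initial pool doubles with every added variable, which is the cost that a clause-by-clause construction tries to avoid.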
However, to find all true assignments or to eliminate all false assignments, we may not need to examine all 2^n possible assignments. In our approach, we examine the input Boolean formula for falsity clause by clause. For instance, consider a formula with three clauses over the variables x, y and z. The decision tree is now transformed into an AND/OR tree, whose complete form for this instance is shown in Fig. 1b. The AND/OR tree consists of m + 1 levels if m clauses are considered. The branches leaving level i correspond to the literals of clause i, where the root is counted as level 1. In other words, each node on level i has d_i branches, where d_i is the number of literals in clause i. The branches attached to a node perform the OR operation, and the path from the root to a leaf performs the AND operation. The tree can be trimmed by transforming conflicting branches and terminating falsified branches, as shown in Fig. 1c. When a new clause i is checked, the following two procedures are applied while expanding the tree branches: (1) transformation: if the value newly assigned to a variable x by clause i is complementary to the value previously assigned on the same path, the value of x stays unchanged, since x has already been assigned and cannot take two values; (2) termination: extract the values assigned to the variables on the path from the root to the current leaf, and check whether these values satisfy at least one literal of clause i. If none of the literals of clause i is satisfied, the branch is terminated because it has become a falsified branch. Accordingly, in Fig. 1c, the first node from the left in level 3 is terminated because it falsifies the second clause, and the second node from the left in level 4 is terminated because it falsifies the third clause.
The leaf nodes in level 4 represent correct solutions to the formula; from left to right they are (x = true, y = false, z = true), (x = false, y = false, z = don’t care), (x = false, y = false, z = false), (x = don’t care, y = false, z = true), and (x = don’t care, y = false, z = true), respectively. Expanding the don’t-care entries, the satisfying assignments are (x = true, y = false, z = true), (x = false, y = false, z = true), and (x = false, y = false, z = false).
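The clause-by-clause expansion with its transformation and termination rules can be sketched as follows. This is a software analogy under stated assumptions, not the authors' implementation: the function names, the dictionary representation of a branch (unassigned variables are "don't care") and the example formula (x ∨ y) ∧ (¬x ∨ z) ∧ (¬y ∨ ¬z) are all hypothetical and are not the instance drawn in Fig. 1.

```python
from itertools import product

# Hypothetical formula for illustration: (x OR y) AND (NOT x OR z) AND (NOT y OR NOT z)
TREE_EXAMPLE = [
    [("x", True), ("y", True)],
    [("x", False), ("z", True)],
    [("y", False), ("z", False)],
]


def expand_clause_by_clause(clauses):
    """Grow partial assignments one clause at a time, pruning falsified ones."""
    branches = [{}]  # start from a single blank branch with nothing assigned
    for clause in clauses:
        next_branches = []
        for branch in branches:
            for var, value in clause:  # one child per literal (OR branches)
                if var in branch:
                    # Transformation: the variable already has a value on this
                    # path, so the earlier value is kept unchanged.
                    child = dict(branch)
                else:
                    child = {**branch, var: value}
                # Termination: keep the child only if its assigned values
                # satisfy at least one literal of the current clause.
                if any(child.get(v) == val for v, val in clause):
                    next_branches.append(child)
        branches = next_branches
    return branches


def expand_dont_cares(branches, variables):
    """Turn partial assignments into the set of full assignments they cover."""
    full = set()
    for branch in branches:
        free = [v for v in variables if v not in branch]
        for values in product([True, False], repeat=len(free)):
            assignment = {**branch, **dict(zip(free, values))}
            full.add(tuple(assignment[v] for v in variables))
    return full


leaves = expand_clause_by_clause(TREE_EXAMPLE)
solutions = expand_dont_cares(leaves, ["x", "y", "z"])
```

In this sketch the number of live branches after clause i is at most the product of the clause sizes seen so far, and termination usually keeps it far smaller, so no 2^n pool is ever materialized.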
In the present work, we provide an implementation of the above approach to solving SAT problems in which the requirement to construct an initial data pool is removed. Two SAT instances are considered. The first is a relatively straightforward 2-SAT instance (k-SAT denotes formulas with k literals in each clause), used to demonstrate the associated experimental procedures. The second instance has four variables and five clauses and is presented to verify the feasibility of using the proposed method to solve a more complicated instance. An advanced MEMS-based microarray technology is chosen as the platform because, despite the high throughput of surface-based formats, few DNA computing methods have been explored on surfaces. The algorithm is based on our previously proposed modified sticker model (Yang and Yang, 2005), in the spirit of building solutions in steps that satisfy one clause at a time and eventually yield the solutions satisfying the entire Boolean formula. That is, computation begins with a blank data pool, which is capable of carrying added information but initially carries none. In solving a 2-SAT instance, the blank pool is first divided into two parts, and each part is assigned one literal of the first clause. Combining these parts completes the OR operation of the first clause. The combined, updated pool is again divided into two parts, and each part is assigned one literal of the second clause. Combining and dividing are repeated to carry the literal assignments of the remaining clauses. A probing procedure is required after every combined pool forms, to exclude members that falsify the clause under consideration. In the end, the surviving members represent the satisfying assignments. In the present study, the role of the blank pool is played by an array of immobilized DNA strands, whereas the literal information is carried by fluorescent dye-labeled DNA strands complementary to the immobilized ones.
Computation concept
To compute the 2-SAT instance, a 6 × 4 matrix array (six rows, four columns) is fabricated on a glass plate. In this array, each set of vertically adjacent x-, y- and z-sites corresponds to a Boolean variable assignment, and hence the array can accommodate a maximum of eight different assignments. The relationship between a given Boolean formula and the necessary array size is characterized later in this study. In the computations, the three variables, i.e. x, y and z, are
Discussion
Although the current study uses a DNA array as a platform, the essential logic of the algorithm is based on the modified sticker model, which was originally designed for solution phase experiments (Yang and Yang, 2005). The original sticker model uses a memory strand to carry the information conveyed by stickers. In essence, the sticker model comprises a long single memory strand and a number of stickers. Each memory strand consists of l non-overlapping substrands, each of which has a length of
Summary
This study has presented a novel computing procedure for solving SAT problems utilizing a MEMS-based microarray technique. Although a fully automatic system to carry out the computing process has not been developed, the results of this study nevertheless confirm the feasibility of solving SAT problems on a solid surface using a modified sticker model. The proposed method has the advantage that as the size of the problem is scaled up, it is necessary only to linearly increase the variety of
Acknowledgement
The authors gratefully acknowledge the financial support provided to this study by the National Science Council of Taiwan under grant no. NSC 93-2113-M-214-001.
References (21)
- et al. Gene probe assays on a fibre-optic evanescent wave biosensor. Biosens. Bioelectron. (1992)
- et al. Sticker systems. Theor. Comput. Sci. (1998)
- et al. DNA computing by blocking. Theor. Comput. Sci. (2003)
- et al. A DNA solution of SAT problem by a modified sticker model. Biosystems (2005)
- Molecular computation of solutions to combinatorial problems. Science (1994)
- et al. Solution of a 20-variable 3-SAT problem on a DNA computer. Science (2002)
- et al. Micropatterning proteins and synthetic peptides on solid supports: a novel application for microelectronics fabrication technology. Biotechnol. Progr. (1992)
- Chen, K., Ramachandran, V., 2001. A space-efficient randomized DNA algorithm for k-SAT....
- et al. Introduction to Algorithms (2001)
- et al. A DNA-Based Random Walk Method for Solving k-SAT (2001)
- 1
Present address: No. 70, Lien Hai Road, Kaohsiung 804, Taiwan. Tel.: +886 7 5252000x4240; fax: +886 7 525 4299.
- 2
Present address: No. 70, Lien Hai Road, Kaohsiung 804, Taiwan. Tel.: +886 7 5252000x4333.