ABSTRACT
Given a set S of equal-length strings over an alphabet σ, the closest string problem seeks a string over σ whose maximum Hamming distance to any of the given strings is as small as possible. A data-based coding of strings for evolutionary search represents candidate closest strings as sequences of indexes of the given strings. Each position of the string that such a chromosome represents holds the symbol found at that same position in the given string that the chromosome indexes there.
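The data-based coding described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`decode`, `hamming`, `max_hamming`) and the example strings are assumptions chosen for illustration, and only the decoding rule and the objective come from the abstract.

```python
from typing import Sequence


def decode(chromosome: Sequence[int], strings: Sequence[str]) -> str:
    """Build the candidate string a chromosome represents.

    Position j of the candidate takes the symbol at position j of the
    given string selected by chromosome[j]. (Hypothetical helper name.)
    """
    return "".join(strings[idx][j] for j, idx in enumerate(chromosome))


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def max_hamming(candidate: str, strings: Sequence[str]) -> int:
    """Objective to minimize: the largest distance to any given string."""
    return max(hamming(candidate, s) for s in strings)


# Illustrative instance (made-up data, not from the paper's trials).
S = ["ACGT", "AGGT", "ACTA"]
chromosome = [0, 1, 2, 0]           # one index into S per string position
candidate = decode(chromosome, S)   # "AGTT"
print(candidate, max_hamming(candidate, S))  # AGTT 2
```

One consequence of this coding is that every candidate symbol at a position already occurs at that position in some given string, so the search never considers symbols that cannot reduce any distance there.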
A genetic algorithm using this coding was compared with two GAs that encoded candidate strings directly as strings over σ. In trials on twenty-five instances of the closest string problem with alphabets ranging in size from 2 to 30, the algorithm that used the data-based representation of candidate strings consistently returned the best results, and its advantage increased with the sizes of the test instances' alphabets.