Skip to main content

Advertisement

Log in

ACOHAP: an efficient ant colony optimization for the haplotype inference by pure parsimony problem

  • Published:
Swarm Intelligence Aims and scope Submit manuscript

Abstract

Haplotype information plays an important role in many genetic analyses. However, the identification of haplotypes based on sequencing methods is both expensive and time consuming. Current sequencing methods are only efficient to determine conflated data of haplotypes, that is, genotypes. This raises the need to develop computational methods to infer haplotypes from genotypes.

Haplotype inference by pure parsimony is an NP-hard problem and still remains a challenging task in bioinformatics. In this paper, we propose an efficient ant colony optimization (ACO) heuristic method, named ACOHAP, to solve the problem. The main idea is based on the construction of a binary tree structure through which ants can travel and resolve conflated data of all haplotypes from site to site. Experiments with both small and large data sets show that ACOHAP outperforms other state-of-the-art heuristic methods. ACOHAP is as good as the currently best exact method, RPoly, on small data sets. However, it is much better than RPoly on large data sets. These results demonstrate the efficiency of the ACOHAP algorithm to solve the haplotype inference by pure parsimony problem for both small and large data sets.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Algorithm 1
Algorithm 2

Similar content being viewed by others

Notes

  1. http://sat.inesc-id.pt/~assg/rpoly/.

  2. http://www.iasi.cnr.it/~liuzzi/BIOCOMP/SNP/.

  3. http://doc.aporc.org/wiki/PTG.

  4. http://www.stats.ox.ac.uk/~marchini/phaseoff.html.

  5. Even though Table 3 of Benedettini et al. (2008) presents results of ACO-HI+ on the four SU data sets, only results concerning SU2 can be used for our comparisons. This is because of two reasons. First, results of ACO-HI+ for the SU1 data set are incorrectly reported in Benedettini et al. (2008); in fact, the sum of optimal solution values for the SU1 data set is 11961, which is much higher than 2453 as reported in Table 3 of Benedettini et al. (2008). Second, Table 3 of Benedettini et al. (2008) does not report the sum of the solution values of ACO-HI+ for all problem instances of the SU-100kb and SU3 data sets; this makes a comparison with our results impossible.

References

  • Benedettini, S., Roli, A., & Di Gaspero, L. (2008). Two-level ACO for haplotype inference under pure parsimony. In M. Dorigo, M. Birattari, C. Blum, M. Clerc, T. Stützle, & A. Winfield (Eds.), Lecture notes in computer science: Vol. 5217. Ant colony optimization and swarm intelligence (pp. 179–190). The 6th international workshop, ANTS 2008. Berlin/Germany: Springer.

    Chapter  Google Scholar 

  • Brown, D. G., & Harrower, I. M. (2006). Integer programming approaches to haplotype inference by pure parsimony. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 3(2), 141–154.

    Article  Google Scholar 

  • Clark, S. (1990). Inference of haplotypes from PCR-amplified samples of diploid populations. Molecular Biology and Evolution, 7, 111–122.

    Google Scholar 

  • Daly, M. J., Rioux, J. D., Schaffner, S. F., Hudson, T. J., & Lander, E. S. (2001). High-resolution haplotype structure in the Human genome. Nature Genetics, 29(2), 229–232.

    Article  Google Scholar 

  • Di Gaspero, L., & Roli, A. (2008). Stochastic local search for large-scale instances of the haplotype inference problem by pure parsimony. Journal of Algorithms, 63(1–3), 55–69.

    Article  MATH  Google Scholar 

  • Do, D. D., Dinh, Q. H., & Hoang, X. H. (2008). On the pheromone update rules of ant colony optimization approaches for the job shop scheduling problem. In T. D. Bui, T. V. Ho, & Q. T. Ha (Eds.), Lecture notes in computer science: Vol. 5357. The 11th pacific rim international conference on multi-agents: intelligent agents and multi-agent systems (pp. 153–160). Heidelberg: Springer.

    Chapter  Google Scholar 

  • Dorigo, M., & Stützle, T. (2004). Ant colony optimization. Cambridge: MIT Press.

    Book  MATH  Google Scholar 

  • Graça, A., Marques-silva, J., Lynce, I., & Oliveira, A. L. (2007). Efficient haplotype inference with pseudo-Boolean optimization. In H. Anai, K. Horimoto, & T. Kutsia (Eds.), Lecture notes in computer science: Vol. 4545. Algebraic biology 2007 (pp. 125–139). Heidelberg: Springer.

    Chapter  Google Scholar 

  • Graça, A., Marques-silva, J., Lynce, I., & Oliveira, A. L. (2008). Efficient haplotype inference with combined CP and OR techniques. In L. Perron & M. Trick (Eds.), Lecture notes in computer science: Vol. 5015. The 5th international conference on integration of AI and OR techniques in constraint programming for combinatorial optimization problems, CPAIOR 2008 (pp. 308–312). Heidelberg: Springer.

    Chapter  Google Scholar 

  • Graça, A., Lynce, I., Marques-Silva, J., & Oliveira, A. L. (2010). Haplotype inference by pure parsimony: a survey. Journal of Computational Biology, 17(8), 969–992.

    Article  MathSciNet  Google Scholar 

  • Gusfield, D. (2001). Inference of haplotypes from samples of diploid populations: complexity and algorithms. Journal of Computational Biology, 8(3), 305–323.

    Article  MathSciNet  Google Scholar 

  • Gusfield, D. (2003). Haplotype inference by pure parsimony. In R. Baeza-Yates, E. Chávez, & M. Crochemore (Eds.), Lecture notes in computer science: Vol. 2676. Combinatorial pattern matching (pp. 144–155). 14th Annual symposium on combinatorial pattern matching, CPM 2003. Heidelberg: Springer.

    Chapter  Google Scholar 

  • Gusfield, D., & Orzack, S. H. (2005). Haplotype inference. In S. Aluru (Ed.), Handbook of computational molecular biology (pp. 1–28). Boca Raton: CRC Press.

    Google Scholar 

  • Hutter, F., Hoos, H., & Stützle, T. (2007). Automatic algorithm configuration based on local search. In H. Robert & A. Howe (Eds.), The twenty-second conference on artificial intelligence (AAAI ’07) (pp. 1152–1157). Menlo Park: AAAI Press.

    Google Scholar 

  • Istrail, S. (2004). Computational methods for SNPs and haplotype inference. Berlin/Heidelberg: Springer.

    Book  MATH  Google Scholar 

  • Li, Z., Zhou, W., Zhang, X., & Chen, L. (2005). A parsimonious tree-grow method for haplotype inference. Bioinformatics, 21(17), 3475–3481.

    Article  Google Scholar 

  • Lynce, I., & Marques-Silva, J. (2008). Haplotype inference with Boolean satisfiability. International Journal Artificial Intelligence Tools, 17, 355–387.

    Article  Google Scholar 

  • Marchini, J., Cutler, D., Patterson, N., Stephens, M., Eskin, E., Halperin, E., Lin, S., Qin, Z., Munro, H., Abecasis, G., & Donnelly, P. (2006). A comparison of phasing algorithms for trios and unrelated individuals. The American Journal of Human Genetics, 78(3), 437–450.

    Article  Google Scholar 

  • Rosa, S., & Guimarães, S. (2010). Insights on haplotype inference on large genotype datasets. In E. Ferreira, S. Miyano, & P. Stadler (Eds.), Lecture notes in computer science: Vol. 6268. Advances in bioinformatics and computational biology (pp. 47–58). 5th Brazilian conference on bioinformatics. Heidelberg: Springer.

    Chapter  Google Scholar 

  • Stützle, T., & Hoos, H. (2000). MAX-MIN ant system. Future Generation Computer Systems, 16(8), 889–914.

    Article  Google Scholar 

  • The International Hapmap Consortium (2007). A second generation human haplotype map of over 3.1 million SNPs. Nature, 449, 851–861.

    Article  Google Scholar 

  • Tininini, L., Bertolazzi, P., Godi, A., & Lancia, G. (2010). CollHaps: a heuristic approach to haplotype inference by parsimony. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 7(3), 511–523.

    Article  Google Scholar 

Download references

Acknowledgements

We thank Gavin Band, Quang Le, and Khoi Le for comments and careful proof reading. We appreciate support from Graça and Tininini for providing us their programs. We also thank anonymous referees and editors for helpful comments and corrections. This work is partly supported by Vietnam National Science and Technology Fund (Nafosted: 102.01-2011.21).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Sy Vinh Le.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Do, D.D., Le, S.V. & Hoang, X.H. ACOHAP: an efficient ant colony optimization for the haplotype inference by pure parsimony problem. Swarm Intell 7, 63–77 (2013). https://doi.org/10.1007/s11721-013-0077-8

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11721-013-0077-8

Keywords

Navigation