Abstract
Researchers in the life sciences (e.g., healthcare and agriculture) commonly use heuristics to process and interpret the vast amount of available DNA sequence data. The application of discrete optimization techniques, such as mixed-integer programming (MIP), remains largely unexplored and has the potential to transform the field. This paper reports on the successful use of MIP to optimize experimental design in a practical genetics application. More generally, our results illustrate the potential benefits of using MIP for subset selection problems in genetics.
Acknowledgements
We thank Jason LaCombe for helping edit this manuscript. We thank Prof. Robert Strawderman for advice on model misspecification, which resulted in the proof of Proposition 3.2.
Additional information
Communicated by Alberto D’Onofrio.
Cite this article
McClosky, B., Tanksley, S.D. Optimizing Experimental Design in Genetics. J Optim Theory Appl 157, 520–532 (2013). https://doi.org/10.1007/s10957-012-0172-9
DOI: https://doi.org/10.1007/s10957-012-0172-9