Abstract
The partial digest problem asks for the positions of a set of points on the real line, given only their unlabeled pairwise distances. This problem is critical for DNA sequencing, as well as for phase retrieval in X-ray crystallography. When some of the distances are missing, the problem generalizes to the “minimum distance superset problem”, which seeks a set of points of minimum cardinality such that the multiset of their pairwise distances is a superset of the input. We introduce a quadratic integer programming formulation for the minimum distance superset problem with a pseudo-polynomial number of variables, as well as a polynomial-size integer programming formulation. We investigate three solution approaches based on an available integer programming solver: (1) solving a linearization of the pseudo-polynomial-size formulation, (2) solving the complete polynomial-size formulation, or (3) performing a binary search over the number of points and solving a simpler feasibility or optimization problem at each step. In our computational experiments, the polynomial formulation with binary search gives the most promising results, solving to optimality most instances with up to 25 distance values and 8 solution points.
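To make the problem statement and the binary-search idea concrete, here is a minimal Python sketch. It is not the integer programming approach of the paper: the feasibility step is a plain brute-force enumeration, and the function names (`covers`, `feasible`, `min_distance_superset`) are hypothetical. The sketch assumes positive integer distances and searches only integer positions in the range 0 to the sum of the distances, which is an assumption of this illustration. The binary search relies on the fact that adding a point to a covering set keeps it covering, so feasibility is monotone in the number of points.

```python
from collections import Counter
from itertools import combinations
from math import comb


def covers(points, distances):
    """True if the pairwise-distance multiset of `points` contains `distances`."""
    have = Counter(abs(a - b) for a, b in combinations(points, 2))
    need = Counter(distances)
    return all(have[d] >= c for d, c in need.items())


def feasible(k, distances, max_pos):
    """Brute force (assumption of this sketch, not the paper's method):
    look for k integer points in [0, max_pos] covering `distances`.
    The first point is fixed at 0, since solutions can be translated."""
    for rest in combinations(range(1, max_pos + 1), k - 1):
        candidate = (0,) + rest
        if covers(candidate, distances):
            return candidate
    return None


def min_distance_superset(distances):
    """Binary search over the number of points k (non-empty input assumed)."""
    m = len(distances)
    max_pos = sum(distances)
    # Simple lower bound: k points yield only k*(k-1)/2 pairwise distances.
    lo = next(k for k in range(2, m + 2) if comb(k, 2) >= m)
    # Simple upper bound: chain the distances end to end (m + 1 points).
    hi = m + 1
    best = [0]
    for d in distances:
        best.append(best[-1] + d)
    best = tuple(best)
    while lo < hi:
        mid = (lo + hi) // 2
        solution = feasible(mid, distances, max_pos)
        if solution is not None:
            best, hi = solution, mid   # mid points suffice, try fewer
        else:
            lo = mid + 1               # mid points are not enough
    return best


if __name__ == "__main__":
    print(min_distance_superset([1, 2, 3]))
    print(min_distance_superset([2, 2, 2]))
```

The enumeration grows exponentially with the number of points, so this is only usable on tiny inputs; it is meant to show the structure of the search, with the feasibility check standing in for the solver call described in the abstract.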



Acknowledgements
This research is partially supported by CNPq (grant numbers 310855/2013-6, 308498/2015-1, and 425962/2016-4), CAPES (grant number 1192880), and FAPERJ in Brazil. This support is gratefully acknowledged.
Appendix: Detailed results
Tables 3, 4, 5 and 6 report the trivial lower and upper bounds and, for each method, the lower and upper bounds obtained, the remaining gap, and the running time in seconds. Entries that stopped before the time limit of 3600 s without reaching the optimal solution raised an out-of-memory exception; these instances are marked with “—”.
Cite this article
Fontoura, L., Martinelli, R., Poggi, M. et al. The minimum distance superset problem: formulations and algorithms. J Glob Optim 72, 27–53 (2018). https://doi.org/10.1007/s10898-017-0579-9
DOI: https://doi.org/10.1007/s10898-017-0579-9