Stochastic Proximity Embedding: A Simple, Fast and Scalable Algorithm for Solving the Distance Geometry Problem

Distance Geometry

Abstract

Stochastic proximity embedding (SPE) is a simple, fast, and scalable algorithm for generating low-dimensional Euclidean coordinates for a set of data points so that they satisfy a prescribed set of geometric constraints. Like other related methods, SPE starts with a random initial configuration and iteratively refines it by updating the positions of the data points so as to minimize the violation of the input constraints. However, instead of minimizing all violations at once using a standard gradient minimization technique, SPE stochastically optimizes one constraint at a time, in a manner reminiscent of back-propagation in artificial neural networks. Here, we review the underlying theory that gives rise to the SPE formulation and show how it can be successfully applied to a wide range of problems in data analysis, with particular emphasis on computational chemistry and biology.
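
The procedure outlined in the abstract (random start, then repeated correction of one randomly chosen constraint at a time) can be illustrated with a minimal sketch. The abstract does not state the update rule, so the code below makes several assumptions: each constraint is an exact target distance between a pair of points, both points move along the line joining them by half of the scaled discrepancy, and the learning rate decays linearly over cycles. The names `spe_embed`, `stress`, and all parameters are hypothetical and chosen for illustration only.

```python
import math
import random


def spe_embed(targets, n_points, dim=2, n_cycles=100, steps_per_cycle=2000,
              lr_start=1.0, lr_end=0.01, eps=1e-10, seed=0):
    """Illustrative SPE-style embedding (a sketch, not the authors' code).

    targets: dict mapping (i, j) index pairs to a desired distance.
    Returns a list of n_points coordinate lists, each of length dim.
    """
    rng = random.Random(seed)
    # Random initial configuration.
    coords = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    pairs = list(targets.items())
    for cycle in range(n_cycles):
        # Assumed schedule: learning rate decays linearly over the cycles.
        lr = lr_start - (lr_start - lr_end) * cycle / max(n_cycles - 1, 1)
        for _ in range(steps_per_cycle):
            # Pick one constraint at random and reduce its violation.
            (i, j), r = rng.choice(pairs)
            xi, xj = coords[i], coords[j]
            d = math.dist(xi, xj)
            # Assumed update: push the pair apart (or pull it together)
            # along the connecting line, splitting the correction equally.
            scale = lr * 0.5 * (r - d) / (d + eps)
            for k in range(dim):
                delta = scale * (xi[k] - xj[k])
                xi[k] += delta
                xj[k] -= delta
    return coords


def stress(coords, targets):
    """Root-mean-square violation of the target distances."""
    total = sum((math.dist(coords[i], coords[j]) - r) ** 2
                for (i, j), r in targets.items())
    return math.sqrt(total / len(targets))


# Usage: embed four points from the six pairwise distances of a unit square.
square = {(0, 1): 1.0, (1, 2): 1.0, (2, 3): 1.0, (0, 3): 1.0,
          (0, 2): math.sqrt(2), (1, 3): math.sqrt(2)}
points = spe_embed(square, n_points=4)
print(round(stress(points, square), 3))
```

Each step touches only two points, so the cost per update does not depend on the total number of constraints; this is the property the abstract contrasts with minimizing all violations at once.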

Author information

Corresponding author

Correspondence to Dimitris K. Agrafiotis.

Copyright information

© 2013 Springer Science+Business Media New York

About this chapter

Cite this chapter

Agrafiotis, D.K., Bandyopadhyay, D., Yang, E. (2013). Stochastic Proximity Embedding: A Simple, Fast and Scalable Algorithm for Solving the Distance Geometry Problem. In: Mucherino, A., Lavor, C., Liberti, L., Maculan, N. (eds) Distance Geometry. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-5128-0_14
