Generalization Aware Compression of Molecular Trajectories

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13389)

Abstract

Molecular Dynamics (MD) simulation is often used to study the properties of chemical interactions in domains such as drug development, where conducting real experimental studies is costly and/or unsafe. Trajectories generated by MD simulations provide detailed location data for every atom in the experiment. Analysis of this data leads to an atomic- and molecular-level understanding of the interactions among the constituents of the system of interest; however, the data is extremely large and poses formidable storage and processing challenges for analyzing and querying the associated atom-level motion trajectories. We take a first step towards applying domain-specific generalization techniques to trajectory compression algorithms, with the aim of reducing the storage requirements and speeding up the processing of within-distance queries over MD simulation data. We demonstrate that this generalization-aware compression, when applied to the dataset used in this case study, yields significant efficiency improvements without sacrificing the effectiveness of within-distance queries for threshold-based detection of molecular events of interest, such as the formation of hydrogen bonds (H-Bonds).
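
As a rough illustration of what a threshold-based within-distance query over two atom trajectories might look like, consider the minimal Python sketch below. It is not the method of the paper: the function name `within_distance_frames`, the toy coordinates, and the 3.5 Å cutoff are all assumptions chosen for illustration, and the sketch operates on uncompressed trajectories sampled at identical time steps.

```python
import numpy as np

def within_distance_frames(traj_a, traj_b, threshold):
    """Return the frame indices at which two atoms are within `threshold`.

    traj_a, traj_b: arrays of shape (n_frames, 3) holding the positions of
    two atoms at the same time steps (hypothetical layout, for illustration).
    """
    traj_a = np.asarray(traj_a, dtype=float)
    traj_b = np.asarray(traj_b, dtype=float)
    # Euclidean distance between the two atoms in every frame.
    dists = np.linalg.norm(traj_a - traj_b, axis=1)
    # Frames in which the distance does not exceed the threshold.
    return np.flatnonzero(dists <= threshold)

# Toy trajectories (assumed units: angstroms); the first atom stays at the origin.
donor = np.zeros((4, 3))
acceptor = np.array([[5.0, 0.0, 0.0],
                     [3.0, 0.0, 0.0],
                     [2.5, 0.0, 0.0],
                     [4.0, 0.0, 0.0]])

# 3.5 is an assumed cutoff used only for this example.
hits = within_distance_frames(donor, acceptor, threshold=3.5)
print(hits.tolist())  # [1, 2]
```

A compressed trajectory stores fewer sample points, so a query of this kind would have to evaluate distances on the retained points (or on positions interpolated between them), which is where the trade-off between efficiency and query effectiveness arises.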

Research was partly supported by the Eppley Foundation for Research.

Author information

Corresponding authors

Correspondence to Md Hasan Anowar, Abdullah Shamail or Goce Trajcevski.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Anowar, M.H. et al. (2022). Generalization Aware Compression of Molecular Trajectories. In: Chiusano, S., Cerquitelli, T., Wrembel, R. (eds) Advances in Databases and Information Systems. ADBIS 2022. Lecture Notes in Computer Science, vol 13389. Springer, Cham. https://doi.org/10.1007/978-3-031-15740-0_20

  • DOI: https://doi.org/10.1007/978-3-031-15740-0_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-15739-4

  • Online ISBN: 978-3-031-15740-0

  • eBook Packages: Computer Science, Computer Science (R0)
