DOI: 10.1145/3457388.3459983
invited-talk

Scalable topological data analysis for life science applications

Published: 11 May 2021

ABSTRACT

Enabling discoveries and foundational understanding in the modern-day life sciences has largely become centered on our ability to effectively analyze large swathes of complex data from a diverse range of sources, capturing information encapsulated across the different layers of the nature-built system. While this data-centric approach has been the primary driver of computational life sciences and discovery pipelines for several decades, the field has decisively diverged in the last few years on how and why these data are collected. More specifically, in contrast to the genomic and other -omic projects of yesteryear, modern-day data collection by and large happens in an analysis-agnostic fashion: complex data are collected without any specific hypotheses to drive the collection, simply because affordable high-throughput technologies are readily available. This has led to a fundamental shift in how we process these data and what we can glean from them.

In this work, we present a novel algorithmic and software framework called Hyppo-X, which uses algebraic topology to discover hidden structure within complex biological data sets [1, 3]. Topology is the field of mathematics that deals with structure at large. Computational topology and its applications constitute an emerging area of research with ample scope for development and data-driven discovery. We present results of our extensive collaborative studies in developing and applying our methods to analyze two types of data: plant phenomics data obtained from agricultural fields [2], and patient trajectories obtained from a network of hospitals in support of antimicrobial stewardship [4]. Topological data analysis holds tremendous promise for modeling and analyzing high-dimensional data sets in numerous scientific domains, and is likely to become part of future machine learning pipelines. These early studies demonstrate its potential while also highlighting a number of challenges and opportunities for future research.
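
For readers unfamiliar with the mapper complex that references [1] and [2] build on, the following is a minimal sketch of the general Mapper construction in plain Python. It is not the Hyppo-X implementation. The function names (`interval_cover`, `single_linkage`, `mapper`), the single-linkage clustering step, and all parameter defaults are assumptions chosen only for illustration. The idea is to cover the range of a filter function with overlapping intervals, cluster the points falling in each interval, and connect clusters that share points.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def interval_cover(lo, hi, n_intervals, overlap):
    """Cover [lo, hi] with equal-length intervals; neighbors share `overlap` of their length."""
    length = (hi - lo) / (n_intervals - (n_intervals - 1) * overlap)
    step = length * (1 - overlap)
    return [(lo + i * step, lo + i * step + length) for i in range(n_intervals)]


def single_linkage(indices, points, radius):
    """Connected components of the graph joining points at distance <= radius."""
    clusters = []
    unvisited = set(indices)
    while unvisited:
        seed = unvisited.pop()
        component, frontier = {seed}, [seed]
        while frontier:
            p = frontier.pop()
            near = {q for q in unvisited if dist(points[p], points[q]) <= radius}
            unvisited -= near
            component |= near
            frontier.extend(near)
        clusters.append(frozenset(component))
    return clusters


def mapper(points, filter_fn, n_intervals=4, overlap=0.5, radius=1.0):
    """Return (nodes, edges): nodes are clusters of point indices, edges join overlapping clusters."""
    values = [filter_fn(p) for p in points]
    tol = 1e-9  # guards against floating-point error at interval endpoints
    nodes = []
    for a, b in interval_cover(min(values), max(values), n_intervals, overlap):
        members = [i for i, v in enumerate(values) if a - tol <= v <= b + tol]
        nodes.extend(single_linkage(members, points, radius))
    # Two clusters are adjacent when they have at least one point in common.
    edges = [(i, j) for i, j in combinations(range(len(nodes)), 2) if nodes[i] & nodes[j]]
    return nodes, edges


# Four collinear points, filtered by their x coordinate.
line = [(0, 0), (1, 0), (2, 0), (3, 0)]
nodes, edges = mapper(line, filter_fn=lambda p: p[0], n_intervals=2)
```

In this toy input the two overlapping intervals each yield one cluster, and the shared points produce a single edge between them. Real data sets call for a more careful choice of filter function, cover, and clustering method than this sketch makes.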

The software is available for download at https://mhmethun.com/HYPPO-X/.

References

  1. Ananth Kalyanaraman, Methun Kamruzzaman, and Bala Krishnamoorthy. Interesting paths in the mapper complex. Journal of Computational Geometry, 10(1):500–531, 2019.
  2. Methun Kamruzzaman, Ananth Kalyanaraman, and Bala Krishnamoorthy. Detecting divergent subpopulations in phenomics data using interesting flares. In Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics, pages 155–164, 2018.
  3. Methun Kamruzzaman, Ananth Kalyanaraman, Bala Krishnamoorthy, Stefan Hey, and Pat Schnable. Hyppo-X: A scalable exploratory framework for analyzing complex phenomics data. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2019.
  4. Kaniz Fatema Madhobi, Methun Kamruzzaman, Ananth Kalyanaraman, Eric Lofgren, Rebekah Moehring, and Bala Krishnamoorthy. A visual analytics framework for analysis of patient trajectories. In Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics, pages 15–24, 2019.

Published in

CF '21: Proceedings of the 18th ACM International Conference on Computing Frontiers
May 2021
254 pages
ISBN: 9781450384049
DOI: 10.1145/3457388

Copyright © 2021 Owner/Author

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery
New York, NY, United States


Acceptance Rates

Overall acceptance rate: 240 of 680 submissions, 35%
