An Asset Management Approach to Continuous Integration of Heterogeneous Biomedical Data

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 8574)

Abstract

Increasingly, advances in biomedical research are the result of combining and analyzing heterogeneous data types from different sources, spanning genomic, proteomic, imaging, and clinical data. Yet despite the proliferation of data-driven methods, tools to support the integration and management of large collections of data for data-driven discovery are scarce, leaving scientists with ad hoc and inefficient processes. The scientific process could benefit significantly from lightweight methods for data integration that allow for exploratory, incrementally refined integration of heterogeneous data. In this paper, we address this problem by introducing a new asset-management-based approach designed to support continuous integration of biomedical data. We describe the system and our experiences using it in the context of several scientific applications.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Schuler, R.E., Kesselman, C., Czajkowski, K. (2014). An Asset Management Approach to Continuous Integration of Heterogeneous Biomedical Data. In: Galhardas, H., Rahm, E. (eds) Data Integration in the Life Sciences. DILS 2014. Lecture Notes in Computer Science, vol 8574. Springer, Cham. https://doi.org/10.1007/978-3-319-08590-6_1

  • DOI: https://doi.org/10.1007/978-3-319-08590-6_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-08589-0

  • Online ISBN: 978-3-319-08590-6

  • eBook Packages: Computer Science, Computer Science (R0)
