A Lightweight Method of Metadata and Data Management with DataNet

  • Chapter
eScience on Distributed Computing Infrastructure

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8500)

Abstract

Scientific computation produces many large data sets, which are often structured in a non-interoperable manner. Data and metadata are stored in databases or files, either on computing infrastructures or on local computers. The discoverability and verifiability of published results represented by such data are poorly established. It is also difficult to manage access to the data with the permission mechanisms of the available file systems or databases. Moreover, access to the data from external systems is limited by security restrictions imposed by storage facilities. In this paper we present a novel method for managing scientific data that addresses these issues in three ways: it provides a web-based data model management interface, which supports the design of metadata structures and their relation to data stored in files; it exposes REST-based repositories for data recording; and it offers easy access level configuration to limit data visibility during the publication process. The method is implemented by the DataNet tools, which exploit one of the available PaaS platforms. We present a typical use case scenario and evaluate the DataNet deployment in the PL-Grid Infrastructure.
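
The three ingredients named in the abstract (a metadata structure linked to files, a REST-style repository for recording data, and an access level that limits visibility) can be illustrated with a minimal Python sketch. It is not DataNet's actual API: the class `EntityModel`, the field type names, the `private`/`public` levels, and the URL layout are all assumptions made for illustration, and no network request is sent.

```python
import json
from dataclasses import dataclass

# Hypothetical field types a metadata model might allow; not taken from DataNet.
FIELD_TYPES = {"string": str, "number": (int, float), "file": str}


@dataclass
class EntityModel:
    """A metadata structure: named, typed fields plus a visibility level."""
    name: str
    fields: dict                 # field name -> type name from FIELD_TYPES
    visibility: str = "private"  # assumed levels: "private" or "public"

    def validate(self, record: dict) -> list:
        """Return a list of problems; an empty list means the record fits."""
        problems = []
        for fname, ftype in self.fields.items():
            if fname not in record:
                problems.append(f"missing field: {fname}")
                continue
            value = record[fname]
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, FIELD_TYPES[ftype]):
                problems.append(f"wrong type for {fname}: expected {ftype}")
        for fname in record:
            if fname not in self.fields:
                problems.append(f"unknown field: {fname}")
        return problems


def build_request(base_url: str, model: EntityModel, record: dict) -> dict:
    """Describe the POST that would record one entry (nothing is sent)."""
    problems = model.validate(record)
    if problems:
        raise ValueError("; ".join(problems))
    return {
        "method": "POST",
        "url": f"{base_url.rstrip('/')}/{model.name}",  # assumed URL layout
        "body": json.dumps(record, sort_keys=True),
    }


def can_read(model: EntityModel, is_owner: bool) -> bool:
    """Public models are readable by anyone, private ones by the owner only."""
    return model.visibility == "public" or is_owner


# A model whose "output" field points at a file stored elsewhere.
simulation = EntityModel(
    name="simulation",
    fields={"title": "string", "temperature": "number", "output": "file"},
)
record = {"title": "run 1", "temperature": 300.5, "output": "results/run1.dat"}
request = build_request("https://repository.example.org/", simulation, record)
```

In this sketch, publishing amounts to changing `simulation.visibility` to `"public"`, after which `can_read` returns `True` for non-owners as well. A real repository would enforce that check on the server side.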




Copyright information

© 2014 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Harężlak, D., Kasztelnik, M., Pawlik, M., Wilk, B., Bubak, M. (2014). A Lightweight Method of Metadata and Data Management with DataNet. In: Bubak, M., Kitowski, J., Wiatr, K. (eds) eScience on Distributed Computing Infrastructure. Lecture Notes in Computer Science, vol 8500. Springer, Cham. https://doi.org/10.1007/978-3-319-10894-0_12


  • DOI: https://doi.org/10.1007/978-3-319-10894-0_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-10893-3

  • Online ISBN: 978-3-319-10894-0

  • eBook Packages: Computer Science, Computer Science (R0)
