Abstract
Scientific computation is a source of many large data sets, which are often structured in a non-interoperable manner. Data and metadata are stored on computing infrastructures or local computers in databases or in files. The discoverability and verifiability of published results represented by such data are poorly established. It is also difficult to manage access to data by applying permission granting mechanisms in the available file systems or databases. Moreover, accessibility of data from external systems is limited by security restrictions imposed by storage facilities. In this paper we present a novel method for managing scientific data that addresses these issues in three ways: it provides a web-based data model management interface, which supports the design of metadata structures and their relation to data stored in files; it exposes REST-based repositories for data recording; and it offers easy access-level configuration to limit data visibility during the publication process. The method, implemented by the DataNet tools, exploits one of the available PaaS platforms. We present a typical use case scenario and provide an evaluation of the DataNet deployment in the PL-Grid Infrastructure.
© 2014 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Harężlak, D., Kasztelnik, M., Pawlik, M., Wilk, B., Bubak, M. (2014). A Lightweight Method of Metadata and Data Management with DataNet. In: Bubak, M., Kitowski, J., Wiatr, K. (eds) eScience on Distributed Computing Infrastructure. Lecture Notes in Computer Science, vol 8500. Springer, Cham. https://doi.org/10.1007/978-3-319-10894-0_12
DOI: https://doi.org/10.1007/978-3-319-10894-0_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10893-3
Online ISBN: 978-3-319-10894-0
eBook Packages: Computer Science (R0)