Towards Research Infrastructures that Curate Scientific Information: A Use Case in Life Sciences

  • Conference paper
Data Integration in the Life Sciences (DILS 2018)

Abstract

Scientific information communicated in scholarly literature remains largely inaccessible to machines. The global scientific knowledge base is little more than a collection of (digital) documents. There are two main reasons: the document is the principal form of scholarly communication and, since underlying data, software and other materials mostly remain unpublished, the scholarly article is essentially the only form used to communicate scientific information. Based on a use case in life sciences, we argue that virtual research environments and semantic technologies are transforming the capability of research infrastructures to systematically acquire and curate machine-readable scientific information communicated in scholarly literature.

Notes

  1. Open Research Knowledge Graph: http://orkg.org (Accessed: October 16, 2018).

  2. https://rdflib.readthedocs.io/ (Accessed: October 16, 2018).

Acknowledgements

We thank the TIB Leibniz Information Centre for Science and Technology for supporting this project and our colleagues and the participants of the project’s workshop series for their contributions.

Author information

Corresponding author

Correspondence to Markus Stocker.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Stocker, M., Prinz, M., Rostami, F., Kempf, T. (2019). Towards Research Infrastructures that Curate Scientific Information: A Use Case in Life Sciences. In: Auer, S., Vidal, ME. (eds) Data Integration in the Life Sciences. DILS 2018. Lecture Notes in Computer Science, vol 11371. Springer, Cham. https://doi.org/10.1007/978-3-030-06016-9_6

  • DOI: https://doi.org/10.1007/978-3-030-06016-9_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-06015-2

  • Online ISBN: 978-3-030-06016-9

  • eBook Packages: Computer Science, Computer Science (R0)
