Abstract
Published scientific results should be reproducible; otherwise, the community places less value on the findings they report. Several initiatives, such as myExperiment, RunMyCode, and DIRECT, contribute to the availability of data, experiments, and algorithms. Some of these experiments and algorithms are even referenced or mentioned in later publications. In general, research articles that present experimental results only summarize the algorithms and data used. In the better cases, the articles refer to a web link where the code can be found. We give here an account of our experience with extracting the data needed to make reproducing IR experiments possible. We also consider how to automate this information extraction and store the data as IR nanopublications, which can later be queried and aggregated by automated processes as the need arises.
Keywords
- Natural Language Processing
- Information Extraction
- Information Retrieval System
- SPARQL Query
- Name Entity
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
This research was partly funded by the Austrian Science Fund (FWF) project number P25905-N23 (ADmIRE).
References
Aizawa, A., Kohlhase, M., Ounis, I.: NTCIR-10 math pilot task overview. In: Proceedings of the 10th NTCIR Conference, Tokyo, Japan (2013)
Armstrong, T.G., Moffat, A., Webber, W., Zobel, J.: EvaluatIR: An Online Tool for Evaluating and Comparing IR Systems. In: Proceedings of the 32nd International ACM SIGIR Conference, SIGIR 2009, p. 833. ACM, New York (2009)
Bontcheva, K., Tablan, V., Maynard, D., Cunningham, H.: Evolving GATE to Meet New Challenges in Language Engineering. Natural Language Engineering 10(3/4), 349–373 (2004)
De Roure, D.: Towards computational research objects. In: Proceedings of the 1st International Workshop on Digital Preservation of Research Methods and Artefacts, DPRMA 2013, pp. 16–19. ACM (2013)
De Roure, D., Goble, C., Stevens, R.: The design and realisation of the virtual research environment for social sharing of workflows. Future Generation Computer Systems 25(5), 561–567 (2009)
Dussin, M., Ferro, N.: DIRECT: Applying the DIKW hierarchy to large-scale evaluation campaigns. In: Larsen, R.L., Paepcke, A., Borbinha, J.L., Naaman, M. (eds.) Proceedings of JCDL, p. 424 (2008)
Lipani, A., Piroi, F., Andersson, L., Hanbury, A.: An Information Retrieval Ontology for Information Retrieval Nanopublications. In: Kanoulas, E., Lupu, M., Clough, P., Sanderson, M., Hall, M., Hanbury, A., Toms, E. (eds.) CLEF 2014. LNCS, vol. 8685, pp. 44–49. Springer, Heidelberg (2014)
Maynard, D., Li, Y., Peters, W.: NLP techniques for term extraction and ontology population. In: Proceeding of the 2008 Conference on Ontology Learning and Population: Bridging the Gap Between Text and Knowledge, pp. 107–127 (2008)
Nekrutenko, A., Taylor, J.: Next-generation sequencing data interpretation: Enhancing reproducibility and accessibility. Nat. Rev. Genet. 13(9), 667–672 (2012)
Pedersen, T.: Empiricism is not a matter of faith. Computational Linguistics 34(3), 465–470 (2008)
van Rijn, J.N., et al.: OpenML: A collaborative science platform. In: Blockeel, H., Kersting, K., Nijssen, S., Železný, F. (eds.) ECML PKDD 2013, Part III. LNCS (LNAI), vol. 8190, pp. 645–649. Springer, Heidelberg (2013)
Stodden, V.: The reproducible research movement in statistics. Statistical Journal of the IAOS: Journal of the International Association for Official Statistics 30(2), 91–93 (2014)
Stodden, V., Hurlin, C., Perignon, C.: RunMyCode.org: A novel dissemination and collaboration platform for executing published computational results. Technical Report ID 2147710, Social Science Research Network (2012)
Vitek, J., Kalibera, T.: Repeatability, reproducibility and rigor in systems research. In: 2011 Proceedings of the International Conference on Embedded Software (EMSOFT), pp. 33–38 (2011)
Witte, R., Khamis, N., Rilling, J.: Flexible ontology population from text: The OwlExporter. In: Calzolari, N., Choukri, K., Maegaard, B., Mariani, J., Odijk, J., Piperidis, S., Rosner, M., Tapias, D. (eds.) Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC 2010). European Language Resources Association, ELRA (2010)
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Lipani, A., Piroi, F., Andersson, L., Hanbury, A. (2014). Extracting Nanopublications from IR Papers. In: Lamas, D., Buitelaar, P. (eds) Multidisciplinary Information Retrieval. IRFC 2014. Lecture Notes in Computer Science, vol 8849. Springer, Cham. https://doi.org/10.1007/978-3-319-12979-2_5
DOI: https://doi.org/10.1007/978-3-319-12979-2_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-12978-5
Online ISBN: 978-3-319-12979-2
eBook Packages: Computer Science, Computer Science (R0)