DOI: 10.1145/2993318.2993320
research-article

MEX Interfaces: Automating Machine Learning Metadata Generation

Published: 12 September 2016

Abstract

Despite recent efforts to make Machine Learning (ML) experiments highly interoperable, which also benefits Reproducible Research, problems persist because different ML platforms exist: each has its own conceptualization or schema for representing data and metadata. Achieving the desired interoperability, a better level of provenance, and a more automated environment for obtaining the generated results therefore requires extra coding effort. Hence, when using ML libraries, it is common to redesign specific data models (schemata) and develop wrappers to manage the produced outputs. In this article, we discuss this gap, focusing on the question: "What is the cleanest and lowest-impact solution, i.e., the one requiring minimal effort, to achieve both higher interoperability and higher provenance metadata levels in the Integrated Development Environment (IDE) context, and how can the inherent data querying task be facilitated?" We introduce a novel, low-impact methodology specifically designed for code built in that context. It combines Semantic Web concepts and reflection to narrow the gap in exporting ML metadata in a structured manner, allowing embedded code annotations that are converted at run time into one of the state-of-the-art ML schemas for the Semantic Web: the MEX Vocabulary.
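
The annotation-plus-reflection idea described in the abstract can be illustrated with a minimal sketch. This is not the MEX Interfaces API: the decorator names (`algorithm`, `hyperparameter`, `measure`), the `export_triples` helper, the `Run` class, and the `http://example.org/mex-sketch#` namespace are all hypothetical, and Python decorators stand in for the embedded code annotations. The sketch only shows how tagged members of an experiment class might be discovered by introspection at run time and emitted as subject-predicate-object triples.

```python
# Hypothetical namespace for illustration; NOT the real MEX Vocabulary IRIs.
NS = "http://example.org/mex-sketch#"


def algorithm(name):
    """Class decorator (hypothetical): tag a class with an algorithm label."""
    def wrap(cls):
        cls._mex_algorithm = name
        return cls
    return wrap


def hyperparameter(func):
    """Method decorator (hypothetical): mark a getter as a hyperparameter."""
    func._mex_role = "hyperparameter"
    return func


def measure(func):
    """Method decorator (hypothetical): mark a getter as a performance measure."""
    func._mex_role = "measure"
    return func


@algorithm("DecisionTree")
class Run:
    """An ordinary experiment class; only the decorators are added."""

    def __init__(self, depth, accuracy):
        self._depth = depth
        self._accuracy = accuracy

    @hyperparameter
    def max_depth(self):
        return self._depth

    @measure
    def accuracy(self):
        return self._accuracy


def export_triples(obj, subject):
    """Inspect the object's class at run time and collect tagged members."""
    cls = type(obj)
    triples = [(subject, NS + "algorithm", cls._mex_algorithm)]
    for name, member in sorted(vars(cls).items()):
        role = getattr(member, "_mex_role", None)
        if role is not None:
            # Call the tagged getter to obtain the value to export.
            triples.append((subject, f"{NS}{role}/{name}", getattr(obj, name)()))
    return triples


if __name__ == "__main__":
    for triple in export_triples(Run(depth=5, accuracy=0.91), "ex:run1"):
        print(triple)
```

The experiment code itself stays unchanged apart from the decorators, which is the low-impact property the abstract emphasizes; a real exporter would map the collected values onto actual vocabulary terms and serialize them as RDF.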

Cited By

  • LOG4MEX. Proceedings of the International Conference on Web Intelligence, pages 139-145, 2017. DOI: 10.1145/3106426.3106530. Online publication date: 23-Aug-2017
  • An interoperable service for the provenance of machine learning experiments. Proceedings of the International Conference on Web Intelligence, pages 132-138, 2017. DOI: 10.1145/3106426.3106496. Online publication date: 23-Aug-2017

Published In

SEMANTiCS 2016: Proceedings of the 12th International Conference on Semantic Systems
September 2016
207 pages
ISBN:9781450347525
DOI:10.1145/2993318
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

In-Cooperation

  • Ghent University
  • AIT: Austrian Institute of Technology
  • Stanford University
  • Wolters Kluwer, Germany
  • Semantic Web Company

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Annotation
  2. Interoperability
  3. MEX
  4. Machine Learning Outputs
  5. Metadata
  6. Provenance
  7. Reflection
  8. Reproducible Research

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

SEMANTiCS 2016

Acceptance Rates

SEMANTiCS 2016 Paper Acceptance Rate 18 of 85 submissions, 21%;
Overall Acceptance Rate 40 of 182 submissions, 22%
