Key Discovery for Numerical Data: Application to Oenological Practices

  • Conference paper
Graph-Based Representation and Reasoning (ICCS 2016)

Abstract

The key discovery problem has recently been investigated for symbolic RDF data and tested on large datasets such as DBpedia and YAGO. Such methods automatically extract combinations of properties that uniquely identify every resource in a dataset (i.e., ontological rules). However, none of the existing approaches can handle real-world numerical data. In this paper we propose a novel approach to key discovery that handles numerical RDF datasets. We test the significance of our approach in the context of an oenological application, using a wine dataset that represents the different chemical-based flavourings. Discovering keys in this context contributes to the investigation of complementary flavours that make it possible to distinguish the various sorts of wine from one another.
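To make the notion of a key concrete, the following minimal sketch checks which combinations of properties uniquely identify every row of a tiny table, first on raw numerical values and then on binned values. It is an illustration only and not the method of the paper: the wine measurements, the property names, the brute-force enumeration, and the fixed-width binning in `discretize` are all assumptions chosen for the example.

```python
from itertools import combinations


def is_key(rows, props):
    """Return True if no two rows agree on every property in props."""
    seen = set()
    for row in rows:
        signature = tuple(row[p] for p in props)
        if signature in seen:
            return False
        seen.add(signature)
    return True


def minimal_keys(rows, props):
    """Brute-force enumeration of minimal keys, smallest combinations first."""
    keys = []
    for size in range(1, len(props) + 1):
        for combo in combinations(props, size):
            # A superset of a known key is a key, but not a minimal one.
            if any(set(k) <= set(combo) for k in keys):
                continue
            if is_key(rows, combo):
                keys.append(combo)
    return keys


def discretize(rows, width):
    """Replace each value by the index of its fixed-width bin (assumed scheme)."""
    return [{p: int(v // width) for p, v in row.items()} for row in rows]


# Hypothetical measurements, invented for this example.
wines = [
    {"acidity": 3.51, "sugar": 1.9, "alcohol": 9.4},
    {"acidity": 3.20, "sugar": 2.6, "alcohol": 9.8},
    {"acidity": 3.26, "sugar": 2.3, "alcohol": 9.8},
    {"acidity": 3.16, "sugar": 1.9, "alcohol": 10.1},
]
props = ["acidity", "sugar", "alcohol"]

raw_keys = minimal_keys(wines, props)
binned_keys = minimal_keys(discretize(wines, 0.5), props)

print(raw_keys)     # [('acidity',), ('sugar', 'alcohol')]
print(binned_keys)  # [('acidity', 'sugar'), ('sugar', 'alcohol')]
```

On the raw values, `acidity` alone appears to be a key simply because the four measurements differ in their decimals. Once values that fall in the same bin are treated as equal, that single-property key disappears. This suggests why treating numerical values as plain symbols can yield keys that carry little meaning.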

Notes

  1. http://www.qualimediterranee.fr/projets-et-produits/consulter/les-projets/theme-1-agriculture-competitive-et-durable/das2-tic-chaine-alimentaire/theme-1-developper-une-agriculture-competitive-et-durable/das-2-contribution-des-tic-a-la-chaine-alimentaire-en-amont/pilotype.

Acknowledgments

The third author acknowledges the support of ANR grants ASPIQ (ANR-12-BS02-0003), QUALINCA (ANR-12-0012) and DURDUR (ANR-13-ALID-0002). The work of the third author was carried out as part of a research delegation at INRA MISTEA Montpellier and INRA IATE CEPIA Axe 5 Montpellier.

Corresponding author

Correspondence to Madalina Croitoru.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Symeonidou, D. et al. (2016). Key Discovery for Numerical Data: Application to Oenological Practices. In: Haemmerlé, O., Stapleton, G., Faron Zucker, C. (eds) Graph-Based Representation and Reasoning. ICCS 2016. Lecture Notes in Computer Science, vol 9717. Springer, Cham. https://doi.org/10.1007/978-3-319-40985-6_17

  • DOI: https://doi.org/10.1007/978-3-319-40985-6_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-40984-9

  • Online ISBN: 978-3-319-40985-6

  • eBook Packages: Computer Science (R0)
