Machine Learning and Context-Based Approaches to Get Quality Improved Food Data

  • Conference paper
Proceedings of Sixth International Congress on Information and Communication Technology

Part of the book series: Lecture Notes in Networks and Systems ((LNNS,volume 236))

Abstract

Providing high-quality food data is a challenge for developers of health apps. No standardized data source covers all food products available in Europe. Commercial data sources are expensive and do not permit long-term storage, whereas open, community-maintained data sources often contain inconsistent, duplicate, and incomplete data. In this thesis, methods are presented for loading data from multiple sources into a central food data warehouse via an extract, transform, and load (ETL) process and for improving the quality of that data. Data profiling is used to detect inconsistencies and duplicates. With the help of machine learning methods and ontologies, data is completed and checked for plausibility against similar datasets. Via a dedicated API, a usage context can be sent to the central food data warehouse together with the search term to be queried. The API responds with food data results that have been checked against the context, and it indicates whether the quality of the result data is sufficient for that context. All developed methods are tested using linearly sampled test data.
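The two ideas in the abstract that lend themselves to a small illustration are duplicate detection during data profiling and the context-based quality check performed by the API. The following Python sketch is illustrative only and is not taken from the paper: the record fields (`name`, `allergens`, `energy_kcal`), the context names, the 0.8 threshold, and the use of cosine similarity on word counts are all assumptions made for this example.

```python
from collections import Counter
from math import sqrt


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between the word-count vectors of two product names."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = sqrt(sum(c * c for c in va.values()) * sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0


def find_duplicates(records, threshold=0.8):
    """Return index pairs of records whose names are near-identical.

    The threshold is an assumed value, not one reported by the authors.
    """
    return [
        (i, j)
        for i in range(len(records))
        for j in range(i + 1, len(records))
        if cosine_similarity(records[i]["name"], records[j]["name"]) >= threshold
    ]


# Hypothetical usage contexts and the fields each one is assumed to need.
REQUIRED_FIELDS = {
    "allergy_diary": {"name", "allergens"},
    "calorie_tracking": {"name", "energy_kcal"},
}


def quality_sufficient(record, context):
    """Check whether a record has every field the given context requires."""
    return all(record.get(field) is not None for field in REQUIRED_FIELDS[context])


records = [
    {"name": "Whole Milk 3.5%", "allergens": ["milk"], "energy_kcal": 64},
    {"name": "whole milk 3.5%", "allergens": ["milk"], "energy_kcal": None},
    {"name": "Rye Bread", "allergens": ["gluten"], "energy_kcal": None},
]

duplicates = find_duplicates(records)
bread_ok_for_allergy = quality_sufficient(records[2], "allergy_diary")
bread_ok_for_calories = quality_sufficient(records[2], "calorie_tracking")
```

In this sketch the same record can be sufficient in one context and insufficient in another, which is the kind of per-context quality signal the abstract describes the API returning alongside the results.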

Author information

Corresponding author

Correspondence to Alexander Muenzberg.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Muenzberg, A., Sauer, J., Hein, A., Roesch, N. (2022). Machine Learning and Context-Based Approaches to Get Quality Improved Food Data. In: Yang, XS., Sherratt, S., Dey, N., Joshi, A. (eds) Proceedings of Sixth International Congress on Information and Communication Technology. Lecture Notes in Networks and Systems, vol 236. Springer, Singapore. https://doi.org/10.1007/978-981-16-2380-6_37
