Big Data Integration for Industry 4.0

Digital Transformation

Abstract

The fourth industrial revolution promises a new quality of automation with smart manufacturing devices sharing enormous amounts of data. A crucial step in fulfilling this promise is developing advanced data integration methods that are able to consolidate and combine heterogeneous data from multiple sources. We outline the use of knowledge graphs for data integration and provide an overview of proposed approaches to create and update such knowledge graphs, in particular for schema and ontology matching, data lifting and especially for entity resolution. Furthermore, we present data integration use cases for Industry 4.0 and discuss open problems.

This work was supported by the German Federal Ministry of Education and Research (BMBF, 01/S18026A-F) through its funding of the Center for Scalable Data Analytics and Artificial Intelligence (ScaDS.AI) Dresden/Leipzig.

Author information

Corresponding author

Correspondence to Erhard Rahm.

Copyright information

© 2023 The Author(s), under exclusive license to Springer-Verlag GmbH, DE, part of Springer Nature

About this chapter

Cite this chapter

Obraczka, D., Saeedi, A., Christen, V., Rahm, E. (2023). Big Data Integration for Industry 4.0. In: Vogel-Heuser, B., Wimmer, M. (eds) Digital Transformation. Springer Vieweg, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-65004-2_10

  • DOI: https://doi.org/10.1007/978-3-662-65004-2_10

  • Publisher Name: Springer Vieweg, Berlin, Heidelberg

  • Print ISBN: 978-3-662-65003-5

  • Online ISBN: 978-3-662-65004-2

  • eBook Packages: Computer Science, Computer Science (R0)
