Abstract

The research activities of the Database Group (DBGroup, www.dbgroup.unimore.it) and the Information System Group (ISGroup, www.isgroup.unimore.it) have been mainly devoted to the Data Integration research area. The DBGroup designed and developed the MOMIS data integration system, giving rise to DataRiver (www.datariver.it), a successful innovative enterprise that distributes MOMIS as open source. MOMIS provides integrated access to structured and semistructured data sources and allows a user to pose a single query and receive a single unified answer. Description Logics, automatic annotation of schemata, and clustering techniques constitute its theoretical framework. In the context of data integration, the ISGroup addressed problems related to the management and querying of heterogeneous data sources in large-scale and dynamic scenarios. The reference architectures are Peer Data Management Systems and their evolution toward dataspaces. In these contexts, the ISGroup proposed and evaluated effective and efficient mechanisms for network creation with limited information loss, as well as solutions for mapping management, query reformulation and processing, and query routing. The main issues of data integration have been addressed within European and national projects: automatic annotation, mapping discovery, global query processing, provenance, multidimensional information integration, and keyword search. With the emerging requirements of integrating linked open data, textual data, and multimedia data in a big data scenario, the research has been devoted to the Big Data Integration research area. In particular, the most relevant results are a scalable entity resolution method, a scalable join operator, and LODEX, a tool for automatically extracting metadata from Linked Open Data (LOD) resources and for visual query formulation on LOD resources. Moreover, in collaboration with DataRiver, data integration was successfully applied to smart e-health.

Notes

  1. NORMS will be included in the next release of the MOMIS Open Source version, available at http://www.datariver.it/data-integration/momis/.

  2. http://www.tpc.org/tpch.

  3. For a complete description see http://dbgroup.ing.unimore.it/MomisDashboard.

  4. http://stravanni.github.io/blast/.

Author information

Corresponding author

Correspondence to Sonia Bergamaschi.

Copyright information

© 2018 Springer International Publishing AG

About this chapter

Cite this chapter

Bergamaschi, S. et al. (2018). From Data Integration to Big Data Integration. In: Flesca, S., Greco, S., Masciari, E., Saccà, D. (eds) A Comprehensive Guide Through the Italian Database Research Over the Last 25 Years. Studies in Big Data, vol 31. Springer, Cham. https://doi.org/10.1007/978-3-319-61893-7_3

  • DOI: https://doi.org/10.1007/978-3-319-61893-7_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-61892-0

  • Online ISBN: 978-3-319-61893-7

  • eBook Packages: Engineering, Engineering (R0)
