
Leveraging Semantic Approximations in Heterogeneous XML Data Sharing Networks: The SUNRISE Approach

Soft Computing in XML Data Management

Part of the book series: Studies in Fuzziness and Soft Computing (STUDFUZZ, volume 255)

Abstract

In recent years, the huge amount of data available from Internet information sources has drawn much attention to the sharing of distributed information through peer-to-peer (P2P) networks and, in line with the Semantic Web vision, through Peer Data Management Systems (PDMSs). At the same time, XML is undoubtedly the most popular data representation and exchange format on the Web, and more and more Internet applications conform to this de facto standard for data sharing. In this chapter we present SUNRISE (System for Unified Network Routing, Indexing and Semantic Exploration) for XML data sharing.

SUNRISE is a complete PDMS infrastructure aiming at semantic interoperability in heterogeneous networks. Decentralized data sharing is supported by a set of autonomous peers that model their local data through schemas and are locally connected through semantic mappings. SUNRISE leverages the semantic approximations originating from the heterogeneity of the schemas for an effective and efficient organization and exploration of the network. To this end, SUNRISE implements soft computing techniques that cluster peers into Semantic Overlay Networks according to their contents and promote the routing of queries towards the semantically best directions in the network.
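The idea of routing a query towards the semantically best directions can be illustrated with a small sketch. It is not the chapter's actual method, since the abstract does not describe SUNRISE's index structures or scoring functions. It assumes that each peer stores, for every neighbor, a hypothetical score in [0, 1] per schema concept, and that per-concept scores are combined with the minimum, a standard fuzzy conjunction. The names `rank_neighbors`, `scores`, and the peer identifiers are invented for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical structure: neighbor name -> {concept -> score in [0, 1]}.
# A score says how well a concept is approximated in that direction.
NeighborScores = Dict[str, Dict[str, float]]


def rank_neighbors(
    query_concepts: List[str], scores: NeighborScores
) -> List[Tuple[str, float]]:
    """Rank neighbors by the minimum score over the query's concepts."""
    ranked = []
    for neighbor, concept_scores in scores.items():
        # A concept with no entry is treated as not approximable (0.0).
        # An empty query matches every direction equally (default 1.0).
        overall = min(
            (concept_scores.get(c, 0.0) for c in query_concepts),
            default=1.0,
        )
        ranked.append((neighbor, overall))
    # Best direction first; ties are broken by name for determinism.
    ranked.sort(key=lambda pair: (-pair[1], pair[0]))
    return ranked


# Illustrative data only; these values are assumptions.
scores = {
    "peerB": {"author": 0.9, "title": 0.8},
    "peerC": {"author": 0.6, "title": 1.0},
    "peerD": {"author": 0.7},
}
ranking = rank_neighbors(["author", "title"], scores)
```

With this data, `peerB` ranks first because its weakest concept still scores 0.8, while `peerD` ranks last because it has no entry for `title`. Taking the minimum makes a direction only as good as its worst-approximated concept; other combination functions would give different rankings.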

This work is partially supported by the Italian Council co-funded Project NeP4B.




Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Mandreoli, F., Martoglia, R., Penzo, W., Sassatelli, S., Villani, G. (2010). Leveraging Semantic Approximations in Heterogeneous XML Data Sharing Networks: The SUNRISE Approach. In: Ma, Z., Yan, L. (eds) Soft Computing in XML Data Management. Studies in Fuzziness and Soft Computing, vol 255. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14010-5_12

  • DOI: https://doi.org/10.1007/978-3-642-14010-5_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-14009-9

  • Online ISBN: 978-3-642-14010-5

  • eBook Packages: Engineering, Engineering (R0)
