Abstract
In recent years, the huge amount of data available from Internet information sources has drawn much attention to the sharing of distributed information through peer-to-peer (P2P) networks and, in line with the Semantic Web vision, through Peer Data Management Systems (PDMSs). At the same time, XML has become the de facto standard for data representation and exchange on the Web, and more and more Internet applications conform to it for data sharing. In this chapter we present SUNRISE (System for Unified Network Routing, Indexing and Semantic Exploration), a system for XML data sharing.
SUNRISE is a complete PDMS infrastructure aimed at semantic interoperability in heterogeneous networks. Decentralized data sharing is supported by a set of autonomous peers that model their local data through schemas and are locally connected through semantic mappings. SUNRISE leverages the semantic approximations arising from schema heterogeneity to organize and explore the network effectively and efficiently. To this end, SUNRISE implements soft computing techniques that cluster peers into Semantic Overlay Networks according to their contents and route queries towards the semantically best directions in the network.
This work is partially supported by the Italian Council co-funded Project NeP4B.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Mandreoli, F., Martoglia, R., Penzo, W., Sassatelli, S., Villani, G. (2010). Leveraging Semantic Approximations in Heterogeneous XML Data Sharing Networks: The SUNRISE Approach. In: Ma, Z., Yan, L. (eds) Soft Computing in XML Data Management. Studies in Fuzziness and Soft Computing, vol 255. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14010-5_12
DOI: https://doi.org/10.1007/978-3-642-14010-5_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14009-9
Online ISBN: 978-3-642-14010-5
eBook Packages: Engineering, Engineering (R0)