An Approach to Classify Semi-Structured Objects

  • Conference paper
ECOOP’ 99 — Object-Oriented Programming (ECOOP 1999)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 1628))

Abstract

Several advanced applications, such as those dealing with the Web, need to handle data whose structure is not known a priori. This requirement severely limits the applicability of traditional database techniques, which assume that the structure of the data (e.g. the database schema) is known before data are entered into the database. Moreover, in traditional database systems, whenever a data item (e.g. a tuple or an object) is entered, the application specifies the collection (e.g. a relation or a class) to which the data item belongs. Collections are the basis for handling queries and indexing, and therefore a proper classification of data items into collections is crucial. In this paper, we address this issue in the context of an extended object-oriented data model. We propose an approach to classify objects that are created without specifying the class they belong to into the most appropriate class of the schema, that is, the class closest to the object state. In particular, we introduce the notion of weak membership of an object in a class, and define two measures, the conformity and heterogeneity degrees, which our classification algorithm exploits to identify the most appropriate class in which an object can be classified, among those of which it is a weak member.
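The abstract does not give the formal definitions of weak membership, conformity, or heterogeneity, so the following is only an illustrative sketch of the general idea, not the paper's algorithm. It assumes a toy model in which a class is a mapping from attribute names to expected Python types and an object is a plain dictionary. All function names, the two scoring formulas, and the tie-breaking rule are hypothetical stand-ins chosen for illustration.

```python
from typing import Any, Dict, Optional

# Hypothetical toy schema: class name -> {attribute name -> expected type}.
Schema = Dict[str, Dict[str, type]]


def conformity(obj: Dict[str, Any], attrs: Dict[str, type]) -> float:
    """Assumed measure: fraction of the class's attributes that the
    object supplies with a value of the expected type."""
    if not attrs:
        return 0.0
    matched = sum(
        1 for name, typ in attrs.items()
        if name in obj and isinstance(obj[name], typ)
    )
    return matched / len(attrs)


def heterogeneity(obj: Dict[str, Any], attrs: Dict[str, type]) -> float:
    """Assumed measure: fraction of the object's attributes that the
    class does not declare."""
    if not obj:
        return 0.0
    extra = sum(1 for name in obj if name not in attrs)
    return extra / len(obj)


def is_weak_member(obj: Dict[str, Any], attrs: Dict[str, type]) -> bool:
    """Assumed criterion: the object matches at least one class attribute."""
    return conformity(obj, attrs) > 0.0


def classify(obj: Dict[str, Any], schema: Schema) -> Optional[str]:
    """Among the classes of which obj is a weak member, pick the one with
    the highest conformity, breaking ties by the lowest heterogeneity."""
    candidates = [name for name, attrs in schema.items()
                  if is_weak_member(obj, attrs)]
    if not candidates:
        return None
    return max(
        candidates,
        key=lambda name: (conformity(obj, schema[name]),
                          -heterogeneity(obj, schema[name])),
    )


# Example data (made up for this sketch).
SCHEMA: Schema = {
    "Person": {"name": str, "age": int},
    "Book": {"title": str, "author": str, "year": int},
}
page = {"name": "Ada", "age": 36, "homepage": "http://example.org"}
print(classify(page, SCHEMA))  # Person
```

In this toy setting the object matches both attributes of `Person` and none of `Book`, so it is a weak member of `Person` only. The extra `homepage` attribute raises its heterogeneity with respect to `Person` without preventing classification.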

Work partially supported by the Italian MURST under the Interdata Project.

Copyright information

© 1999 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bertino, E., Guerrini, G., Merlo, I., Mesiti, M. (1999). An Approach to Classify Semi-Structured Objects. In: Guerraoui, R. (eds) ECOOP’ 99 — Object-Oriented Programming. ECOOP 1999. Lecture Notes in Computer Science, vol 1628. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48743-3_19

  • DOI: https://doi.org/10.1007/3-540-48743-3_19

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-66156-6

  • Online ISBN: 978-3-540-48743-2

  • eBook Packages: Springer Book Archive
